Big Data Technology Stack
SUMMARY
- Client: Raiffeisenbank
- Tech stack: OpenAI API, Profinit DATA_FRAME Automation tool, Microsoft Active Directory, Cloudera, Apache Spark, Hadoop, MS Azure
Our long-term client Raiffeisenbank CZ needed the comprehensive delivery of a Hadoop platform to run analytical business use cases across large volumes of transactional data. We implemented an end-to-end Hadoop solution that enables the bank to process billions of records daily.
Project Background
Our long-term client Raiffeisenbank CZ was looking for the comprehensive delivery of a big data Hadoop platform that would let the bank run analytical business use cases across large volumes of transactional data. Computations at this volume exceed the capacity of a standard data warehouse (DWH). For this reason, a brand-new parallel processing big data platform had to be built from scratch. Tools for solving business cases with data science, and their implementation in the client’s environment, were also needed as part of the solution.
Profinit came up with analytical use cases and began designing and implementing a complete, end-to-end analytical solution. Together with the bank’s in-house data specialists, the Profinit team selected suitable hardware, sizing, and the right Hadoop distribution. The Profinit DATA_FRAME Automation tool was used to speed up the design of the architecture and its implementation.
Challenge
The major challenge was to create a blueprint solution, as this was the client’s very first implementation of a big data platform. All architectural and security compliance requirements had to be met, and reliable data anonymisation was essential. The client also requested single sign-on authentication and the related integration with Active Directory. The whole platform had to operate without internet access, which required offline storage for the OS, Hadoop, and data science tool packages.
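To make the data anonymisation requirement more concrete, here is a minimal Python sketch of one common approach: replacing sensitive identifiers with a keyed hash (strictly speaking pseudonymisation, since the same input always maps to the same token). The source does not describe the method actually used at the bank. The field names, the sample record, and the `SECRET_KEY` value are hypothetical and serve only as an illustration.

```python
import hashlib
import hmac

# Hypothetical secret for illustration only; a real deployment would load
# the key from a secured key store rather than hard-coding it.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed SHA-256 digest (64 hex characters) of a value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymise_record(record: dict, sensitive_fields: set) -> dict:
    """Replace the listed sensitive fields with pseudonyms; keep the rest."""
    return {
        field: pseudonymise(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }


# Hypothetical transaction record; only the account identifier is masked.
transaction = {"account_id": "CZ-000123", "amount": 1250.0, "currency": "CZK"}
masked = anonymise_record(transaction, {"account_id"})
```

Because the digest is deterministic for a given key, records belonging to the same account can still be joined and aggregated after masking, which is usually what analytical use cases need.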
Business Needs
The solution needed to meet the following specifications:
- Select and build a highly efficient big data processing platform
- Set up tools for solving business cases with data science
- Meet strict requirements on system security and data anonymisation
- Implement single sign-on and integration with IBM Cognos and MS Active Directory
Solution & Results
From the very beginning, we set out to deliver an end-to-end solution. After defining the business data analytics use cases, we focused on choosing and designing the most suitable platform to achieve the business goals. In the initial analytical phase, we collected detailed requirements and specifications. Based on this analysis, we chose the most suitable Hadoop distribution together with the optimal sizing and hardware configuration.
Optimal spec and full security compliance
We optimised CPU, memory, and storage sizing to balance performance and efficiency. The architecture fulfilled all requirements, including security compliance, single sign-on access, and integrations. After installation, we implemented two data models for the defined business use cases. Thanks to our DATA_FRAME Automation tool, approximately 95% of the code was generated automatically, and the platform enables the bank to process billions of records daily.
HEAR FROM THE CLIENT
Profinit convinced us with the end-to-end delivery approach to building a big data platform to solve specifically defined business cases.