Project:

SIXT 

How to build an infrastructure for big data analytics

Challenge

SIXT’s BI division has seen a steady increase in the amount of data used for dashboard reports and analytics. From time to time, the data volume exceeded the limits of the hardware and software used to capture and transform it. Addressing these constraints was time-consuming, complicated, and expensive, and because the data kept growing, the work never ended. When the limits were reached, the infrastructure could not process data from different sets fast enough for near-real-time analysis. Without up-to-date analysis, it was impossible to continually improve the business model or analyse customers and offers.

Solution

With the help of the RESULT team, SIXT decided to migrate to the AWS cloud and deploy an infrastructure that delivers significantly more than the on-premises infrastructure could.

We implemented data capture from both relational databases and event-based sources and integrated the two, which was not previously possible. End users now have near-real-time analytics and can retrieve data from a central data warehouse in various ways (APIs, SQL queries, and others).

The new infrastructure also enabled self-service analytics. Business departments gained complete freedom to design their own dashboards, analyse data in near real time, and combine it with data from local platforms, which was also not possible before.

Technologies used

| Apache Airflow | Apache Kafka | Apache NiFi | API Gateway | Athena | AWS services | Bitbucket Pipelines | Docker | DynamoDB | EC2 | ECR | ECS | Glue | Java | Kinesis | Lambda | Python | QuickSight | S3 | Terraform | VPC |

Result

The new infrastructure and approach have enabled faster, higher-quality adaptation to new requirements and the processing of vast amounts of data in a short time. Where the CRM department previously needed 17 hours to calculate its data, the BI department can now complete the same calculation within 30 minutes. Calculations can run multiple times a day, allowing faster and more efficient adjustment of the business model.

With the cloud infrastructure, users have broader access to data that is fully customised to their needs.