Case study

Introducing Apache Airflow

Transforming Telecom Data Pipelines for Enhanced Efficiency and Scalability

Challenge

A telecom with mature Big Data and Warehousing divisions wanted to improve the orchestration of its data pipelines. The goal was to migrate from in-house projects to a widely adopted and supported toolkit. The projects to be migrated are large and already in production, so the migration would have to happen online, while the existing pipelines keep running.

Solutions

TantusData was brought in to demonstrate and deliver a PoC of a migration to Apache Airflow. This included not only a working demo of the tool's functionality but also working with the sysops and data teams to teach them how to use Airflow on their own. By working together with the client and involving them in every step of the process, TantusData showed how to carry out the migration and helped the teams understand the what and the why of the issues that may come up along the way.

Technology & Tools

Apache Airflow
CDP
Oracle
Cloudera
Apache Hadoop
TIG stack (Telegraf, InfluxDB, Grafana)

Client

An international telecommunications company based in Austria.

Opportunity

Using a well-known, supported tool instead of one developed in-house frees developers to work on other areas instead of reinventing the wheel.
Ability to onboard more people onto a standard tool: the UI and documentation allow people other than data engineers to understand the pipeline logic.
Hiring and introducing new people to a project using a well-known toolkit is easier and faster.

Delivery

The migration began with working with the infrastructure team to deploy Apache Airflow, following the best practices already in place. TantusData delivered scripts to recreate the deployment and made sure that sysops were familiar with the tool and understood what is deployed, how it is configured, and what security concerns must be addressed.
With the tool deployed, TantusData began working with the data team to migrate a part of the existing pipeline. The aim was to rewrite one part of the current code and use it as an example for migrating the other parts. The data team also had a chance to try the tool on their own.
The PoC concluded with a showcase of monitoring options for the tool to ease daily maintenance. TantusData also presented recommended next steps, with tips for the development team and for administration and maintenance.

Effect

The client was able to evaluate Airflow, how it fits their platform, and what they would need to do to integrate it.
The migration plan prompted a discussion about reviewing and refactoring the data pipeline. Being able to see a visualisation of all tasks gives everyone a broad view and encourages discussion of engineering matters such as the manageability of graphs of over 100 nodes.
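To make the idea of a task graph concrete, here is a minimal sketch in plain Python, using only the standard library rather than Airflow itself. The task names and dependencies are hypothetical and are not taken from the client's pipeline; the sketch only illustrates how a dependency graph can be ordered into stages, which is the kind of view that helps when reasoning about large graphs.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract_billing": set(),
    "extract_usage": set(),
    "clean_billing": {"extract_billing"},
    "clean_usage": {"extract_usage"},
    "join_customer_view": {"clean_billing", "clean_usage"},
    "load_warehouse": {"join_customer_view"},
}


def execution_stages(graph):
    """Group tasks into stages; tasks within one stage do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = execution_stages(pipeline)
for number, tasks in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(tasks)}")
```

Counting the stages and the tasks per stage gives a rough measure of how deep and how wide a pipeline is, which can feed a discussion about splitting a very large graph into smaller ones.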
