
Orchestrating DBT with Airflow

Build an easy-to-manage data pipeline

Chunting Wu
6 min readFeb 20, 2023

Airflow is one of the great orchestrators available today. It is a highly scalable and highly available orchestration framework, and, more importantly, it is developed in Python. Therefore, developers can build workflows in Python on top of the various operators Airflow provides. In addition, the Airflow community is quite active, and new operators are constantly being added to the ecosystem.
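To make the operator-based development model concrete, here is a minimal sketch of what an Airflow DAG for dbt might look like, assuming dbt is invoked via `BashOperator` and that the `dag_id`, project directory, and schedule are all hypothetical placeholders (this uses the Airflow 2.x API):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A minimal pipeline definition: run dbt models, then run dbt tests.
# The project directory and dag_id below are illustrative assumptions.
with DAG(
    dag_id="dbt_daily",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt",
    )

    # Express the dependency in plain Python: tests run after models build.
    dbt_run >> dbt_test
```

Because the whole pipeline is ordinary Python, dependencies are declared with the `>>` operator and the file can live in version control alongside the rest of the codebase.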

These advantages make Airflow the preferred choice for data engineers working on ETL or ELT. There are many other workflow orchestration frameworks, such as Argo Workflows, but for engineers who rely heavily on Python, Airflow is much easier to harness and maintain.

But Airflow is not without drawbacks. One of the biggest problems is that it is a monolith: as workflows are continuously added, this monolith will sooner or later become a big ball of mud. Another problem is that Airflow is a distributed framework, so it is not easy to verify a complete workflow in local development. There are many tools that improve on Airflow’s shortcomings, such as Dagster, but the monolith problem remains unresolved.

Although Airflow has been widely used across a variety of data engineering domains, the roles within the data domain have been further differentiated as the domain has become…

Written by Chunting Wu

Architect at SHOPLINE. Experienced in system design, backend development, and data engineering.
