What is Apache Airflow?
Apache Airflow is a widely used open-source platform for orchestrating complex data and ML workflows as DAGs (Directed Acyclic Graphs). It schedules pipelines, resolves dependencies between tasks, monitors execution, and retries failed tasks automatically. It is used by Netflix, Airbnb, Twitter, and many Indian tech companies.
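To make the DAG idea concrete, here is a minimal sketch in plain Python that uses only the standard library. It is not how Airflow is implemented; it only illustrates running tasks in dependency order and retrying a task that fails. The task names, the `run_with_retries` and `run_pipeline` helpers, and the simulated failure are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train": {"transform"},
    "report": {"transform"},
}

calls = {"transform": 0}


def flaky_transform():
    """Fail on the first call to simulate a transient error, then succeed."""
    calls["transform"] += 1
    if calls["transform"] == 1:
        raise RuntimeError("simulated transient failure")


# Hypothetical task bodies; real tasks would do actual work.
tasks = {
    "extract": lambda: None,
    "transform": flaky_transform,
    "train": lambda: None,
    "report": lambda: None,
}


def run_with_retries(task, attempts_allowed=3):
    """Call task() until it succeeds; return the number of attempts used."""
    for attempt in range(1, attempts_allowed + 1):
        try:
            task()
            return attempt
        except Exception:
            if attempt == attempts_allowed:
                raise  # out of attempts, surface the failure


def run_pipeline(dependencies, tasks):
    """Run every task after its dependencies; return the order and attempts."""
    order = list(TopologicalSorter(dependencies).static_order())
    attempts = {name: run_with_retries(tasks[name]) for name in order}
    return order, attempts


order, attempts = run_pipeline(dependencies, tasks)
print(order)
print(attempts)
```

Because the graph has no cycles, a valid execution order always exists. Airflow adds scheduling, distributed execution, and a monitoring UI on top of this basic model.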
Quick start
1. pip install apache-airflow
2. airflow standalone # Dev setup
3. Open http://localhost:8080
4. Write DAGs as Python files in ~/airflow/dags/
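A DAG file for the last step might look like the sketch below. It assumes Airflow 2.4 or later, where `DAG` accepts a `schedule` argument and `PythonOperator` is importable from `airflow.operators.python`; import paths can differ in other versions. The DAG id, task ids, and the `extract_data` and `retrain_model` functions are hypothetical placeholders. The import guard is there only so the task functions can be run without Airflow installed.

```python
from datetime import datetime, timedelta

try:
    # Assumed import paths for Airflow 2.4+; other versions may differ.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    DAG = None  # Airflow not available; the functions below are still plain Python


def extract_data():
    """Hypothetical extract step: return the number of rows fetched."""
    rows = [{"id": 1}, {"id": 2}]  # placeholder data
    return len(rows)


def retrain_model():
    """Hypothetical retraining step; a real task would call your ML code."""
    return "model retrained"


if DAG is not None:
    with DAG(
        dag_id="example_retraining",      # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                # run once per day
        catchup=False,                    # do not backfill past dates
        default_args={
            "retries": 2,                 # retry each failed task twice
            "retry_delay": timedelta(minutes=5),
        },
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_data)
        retrain = PythonOperator(task_id="retrain", python_callable=retrain_model)

        # retrain runs only after extract succeeds
        extract >> retrain
```

Saved under ~/airflow/dags/ (for example as example_retraining.py), the DAG should show up in the UI at http://localhost:8080 once the scheduler has parsed the file.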
Use cases
→ML pipeline automation
→Data engineering
→ETL workflows
→Model retraining schedules
Compatible models
Framework agnostic: Airflow orchestrates any Python code, so tasks can call any ML library.
Why this matters for India
Many senior data engineering and MLOps roles in India list Airflow as a requirement. It is a core skill for the ₹20L+ salary band.