Building the full ML pipeline
Kubeflow, Airflow, and the end-to-end MLOps system that runs itself
Ravi was doing experiment tracking. Meera was doing model serving. Kavya was doing monitoring. Each person had their own scripts, their own tools, their own way of doing things.
When the company decided to launch five new ML models in Q3, the chaos became visible. Nobody could reproduce each other's work. Deploying a new model took two weeks of coordination.
"We need a proper MLOps platform," the head of engineering said.
What is an ML pipeline?
A pipeline connects all the steps of the ML lifecycle into an automated, reproducible flow:
```
Data ingestion → Data validation → Feature engineering →
Model training → Model evaluation → Model registration →
Model deployment → Monitoring → [Retrain if drift] → loop
```
Each step is a node. Data flows between them. The whole thing can be triggered automatically — on a schedule, on new data arrival, or on drift detection.
Apache Airflow — the scheduler
Airflow is a workflow orchestrator. You write pipelines as Python code called DAGs (Directed Acyclic Graphs). Airflow schedules and monitors them.
```python
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta
def ingest_data():
# Pull new data from database
passdef train_model():
# Run training with MLflow tracking
passdef evaluate_and_promote():
# Compare new model vs production
# Promote if better
passwith DAG(
'student_dropout_weekly_retrain',
schedule_interval='@weekly',
start_date=datetime(2025, 1, 1),) as dag:
ingest = PythonOperator(task_id='ingest_data', python_callable=ingest_data)
train = PythonOperator(task_id='train_model', python_callable=train_model)
evaluate = PythonOperator(task_id='evaluate', python_callable=evaluate_and_promote)# Define the order
ingest >> train >> evaluate```
Every Sunday at midnight, this DAG runs automatically: ingests new data, retrains, evaluates, promotes if better.
Kubeflow — ML pipelines on Kubernetes
For larger scale, Kubeflow runs ML pipelines as Kubernetes jobs. Each step runs in its own container, with its own resources, tracked automatically.
```python
import kfp
from kfp import dsl
@dsl.component(base_image='python:3.11')
def train_step(data_path: str) -> str:
# Training logic here
return "model saved at gs://bucket/model.pkl"@dsl.component(base_image='python:3.11')
def deploy_step(model_path: str):
# Deployment logic here
pass@dsl.pipeline(name='loan-approval-pipeline')
def loan_pipeline(data_path: str):
train_task = train_step(data_path=data_path)
deploy_step(model_path=train_task.output)Compile and submit
kfp.compiler.Compiler().compile(loan_pipeline, 'pipeline.yaml')
```
The full stack Ravi and Meera built
```
Weekly trigger (Airflow)
↓Data ingestion from PostgreSQL
↓Data validation (Great Expectations — checks data quality)
↓Feature engineering (same code used in training + serving)
↓Model training (logged to MLflow)
↓Evaluation: new model vs current production model
↓If new model wins → push to model registry (MLflow)
↓Trigger deployment pipeline (GitHub Actions)
↓Build new Docker image with model baked in
↓Deploy to Kubernetes (blue-green deployment)
↓Run smoke tests against new deployment
↓If smoke tests pass → switch traffic to new model
↓Monitor with Evidently → alert if drift
↓Loop back to weekly trigger
```
This whole system runs itself. The team reviews MLflow dashboards once a week. Deployments happen automatically. Models stay fresh.
Five new models in Q3 were deployed in two days each instead of two weeks.
ML pipelines automate the full lifecycle: ingest → train → evaluate → deploy → monitor
Airflow schedules and orchestrates pipeline steps with dependency management
Kubeflow runs ML pipelines as Kubernetes jobs for larger scale
The goal is automation: models should retrain and redeploy with minimal human intervention