AI Ops Ch 4 / 9 Advanced

📉

The model that went wrong silently

Model monitoring, data drift, and why your model degrades without you knowing

In this chapter

Kavya

MLOps engineer at a Bengaluru fintech

The story

Kavya had deployed a loan approval model six months ago. It was 88% accurate at launch. Everyone was happy. The model ran quietly in production, approving and rejecting loan applications 24/7.

Three months later, the business team noticed something odd. Approval rates had dropped from 65% to 41% with no change in applicant quality. Revenue was down 23%.

Kavya checked the model. It was still running. No errors. No crashes. The model was doing exactly what it was trained to do.

The problem was that the world had changed. The model hadn't.

What is model drift?

When you train a model, it learns patterns from historical data. But data changes over time:

- Customer behaviour shifts

- Economic conditions change

- New types of fraud appear

- Your product evolves, attracting different users

When the real-world data no longer looks like training data, the model's predictions become unreliable. This is drift. It's silent, gradual, and dangerous.

Two types of drift

Data drift (input drift): The distribution of incoming features changes. For example, the average applicant age shifts from 28 to 35, or income distributions change after a recession. The model now sees applicant profiles it was never trained on.
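
One common way to put a number on this kind of input shift for a single numeric feature is the Population Stability Index (PSI), which compares binned proportions between a reference sample and a current sample. The sketch below is a minimal standard-library illustration, not what Evidently computes internally. The sample ages, the bin edges, and the helper names are all assumptions made up for the example.

```python
import math
from bisect import bisect_right


def bin_shares(values, edges, eps=1e-6):
    """Share of values falling in each bin defined by sorted inner edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    # Floor each share at eps so empty bins do not cause log(0) or division by zero.
    return [max(c / total, eps) for c in counts]


def population_stability_index(reference, current, edges):
    """PSI = sum over bins of (cur - ref) * ln(cur / ref)."""
    ref = bin_shares(reference, edges)
    cur = bin_shares(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical applicant ages: training sample vs. recent production sample.
training_ages = [24, 26, 27, 28, 28, 29, 30, 31, 33, 36]
production_ages = [29, 31, 33, 34, 35, 35, 36, 38, 40, 44]
age_edges = [25, 30, 35, 40]  # assumed bin boundaries

psi = population_stability_index(training_ages, production_ages, age_edges)
print(f"PSI for age: {psi:.2f}")
```

Rules of thumb often treat a PSI above roughly 0.25 as a significant shift, but that cut-off is a convention rather than a guarantee, and it is sensitive to how the bins are chosen.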

Concept drift: The relationship between inputs and outputs changes. Six months ago, employment type X was a strong predictor of repayment. Then a policy changed, and it no longer predicts repayment, even though the inputs themselves may look the same.
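
Because concept drift can leave input distributions untouched, it tends to show up only once true outcomes arrive and can be compared with what the model predicted. The following is a minimal sketch, assuming repayment labels eventually become available for past decisions. The record layout, the sample data, and the 0.05 tolerance are illustrative assumptions; the 0.88 baseline is the launch accuracy from the story.

```python
from collections import defaultdict


def accuracy_by_period(records):
    """Map each period to the share of predictions that matched the outcome.

    records: iterable of (period, predicted_label, actual_label) tuples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, predicted, actual in records:
        totals[period] += 1
        hits[period] += int(predicted == actual)
    return {period: hits[period] / totals[period] for period in totals}


def degraded_periods(accuracy, baseline, tolerance=0.05):
    """Periods whose accuracy fell more than `tolerance` below the baseline."""
    return sorted(p for p, acc in accuracy.items() if acc < baseline - tolerance)


# Hypothetical labelled outcomes (1 = repaid, 0 = defaulted).
records = [
    ("2024-01", 1, 1), ("2024-01", 0, 0), ("2024-01", 1, 1), ("2024-01", 1, 1),
    ("2024-02", 1, 1), ("2024-02", 0, 1), ("2024-02", 0, 1), ("2024-02", 1, 1),
]
monthly = accuracy_by_period(records)
print(degraded_periods(monthly, baseline=0.88))
```

In lending, labels can lag by months, so a check like this complements input-drift monitoring rather than replacing it.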

How to detect it

```python
# Uses the Report API from Evidently 0.4.x; import paths may differ in newer releases.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
import pandas as pd

# Load reference data (training time) and current data (production)
reference_data = pd.read_csv("training_data.csv")
current_data = pd.read_csv("last_month_production.csv")

# Generate drift report
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_data, current_data=current_data)
report.save_html("drift_report.html")
```

Evidently generates a visual HTML report showing which features have drifted and by how much. Kavya ran this and saw that `employment_months` had drifted significantly — the distribution had shifted because of layoffs in the IT sector.

Setting up continuous monitoring

```python
import mlflow
from evidently.report import Report
from evidently.metrics import DatasetDriftMetric


def monitor_production_data(reference_df, production_df):
    """Run a dataset-level drift check, log it to MLflow, and alert if needed."""
    report = Report(metrics=[DatasetDriftMetric()])
    report.run(reference_data=reference_df, current_data=production_df)
    result = report.as_dict()["metrics"][0]["result"]

    drift_detected = result["dataset_drift"]
    # In Evidently 0.4.x, "drift_share" holds the configured threshold;
    # the observed share of drifted columns is "share_of_drifted_columns".
    drift_share = result["share_of_drifted_columns"]

    # Log to MLflow
    with mlflow.start_run(run_name="monitoring"):
        mlflow.log_metric("drift_share", drift_share)
        mlflow.log_metric("drift_detected", int(drift_detected))

    if drift_detected and drift_share > 0.3:
        # send_alert is a placeholder for your own notification helper
        send_alert("Significant data drift detected: model retraining needed")

    return drift_detected
```
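
The `send_alert` helper is not defined in the chapter. A minimal stand-in, assuming alerts only need to reach the application log, could look like the sketch below. The logger name and message prefix are hypothetical, and a real deployment would more likely post to a chat or paging service.

```python
import logging

logger = logging.getLogger("model_monitoring")  # hypothetical logger name


def send_alert(message: str) -> None:
    """Record a drift alert; swap the body for a chat or paging integration."""
    logger.warning("DRIFT ALERT: %s", message)
```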

The retraining pipeline

The solution to drift is not a one-time fix — it's a process:

1. Monitor: Check drift weekly (or daily for high-stakes models)

2. Alert: Automated alerts when drift crosses a threshold

3. Retrain: Trigger retraining with recent data

4. Validate: Run the new model through evaluation

5. Deploy: If the new model is better, promote it. If not, investigate.

This loop is the core of MLOps. The model is not a one-time deliverable. It's a living system.
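
One cycle of this loop can be sketched as a plain function, with each step passed in as a callable. This is an illustration of the control flow only: every name here is hypothetical, and the lambdas in the usage example stand in for real drift checks, training jobs, and deployment code.

```python
from typing import Callable


def run_monitoring_cycle(
    detect_drift: Callable[[], bool],
    send_alert: Callable[[str], None],
    retrain: Callable[[], object],
    evaluate: Callable[[object], float],
    current_score: float,
    promote: Callable[[object], None],
) -> str:
    """Run monitor -> alert -> retrain -> validate -> deploy once.

    Returns a short status string describing what happened.
    """
    if not detect_drift():                      # 1. Monitor
        return "no drift"
    send_alert("Drift detected, retraining")    # 2. Alert
    candidate = retrain()                       # 3. Retrain
    candidate_score = evaluate(candidate)       # 4. Validate
    if candidate_score > current_score:         # 5. Deploy only if better
        promote(candidate)
        return "promoted"
    send_alert("Retrained model is not better, investigate")
    return "needs investigation"


# Usage with stand-in steps (all hypothetical).
promoted = []
status = run_monitoring_cycle(
    detect_drift=lambda: True,
    send_alert=print,
    retrain=lambda: "model-v2",
    evaluate=lambda model: 0.90,
    current_score=0.85,
    promote=promoted.append,
)
print(status)
```

Passing the steps in as arguments keeps the loop testable without a real model or scheduler, and the same function can be called from whatever orchestrator runs the schedule.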

What Kavya built

An Airflow DAG that runs every Sunday at 2am: collect last week's production data → run drift analysis → if drift is detected, trigger retraining → compare the new model against the old one → promote automatically if the new model is at least 2% better.
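
The promotion rule in the last step can be read as either absolute accuracy points or relative improvement, and the chapter does not say which. The sketch below assumes absolute points of accuracy measured on the same holdout set for both models. The function name and the default threshold are illustrative.

```python
def should_promote(current_accuracy: float, candidate_accuracy: float,
                   min_gain: float = 0.02) -> bool:
    """True if the candidate beats the current model by at least `min_gain`.

    `min_gain` is in absolute accuracy (0.02 = two percentage points).
    """
    # Round to avoid rejecting a candidate over floating-point noise.
    return round(candidate_accuracy - current_accuracy, 6) >= min_gain


print(should_promote(0.85, 0.88))  # three-point gain
print(should_promote(0.85, 0.86))  # one-point gain
```

Requiring a margin rather than any improvement at all reduces the chance of swapping models over differences that are just evaluation noise.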

The loan approval model has been retrained 8 times in the year since. Each time, accuracy is checked before promotion. The business team stopped seeing unexplained revenue drops.

Kavya now calls it her "self-healing model".

Key takeaways

- Models degrade silently as the real world changes; this is called drift

- Data drift = input distributions change; concept drift = the relationship between inputs and outputs changes

- Evidently and whylogs are popular tools for drift detection

- Build a retraining loop: monitor → alert → retrain → validate → deploy

Commands from this chapter

$ pip install evidently

Install drift detection library

report.save_html("drift_report.html")

Save the visual drift report (a Python call made after `report.run`, not a shell command)