Deploying Meera's chatbot
Model serving with FastAPI and Docker — from notebook to real API
Meera's dropout prediction model had been running in a Jupyter notebook for three months. Every Monday morning, she'd manually run the notebook, download a CSV, and email it to the teachers' team. 47 clicks, 20 minutes, every week.
"Can you make this automatic?" Kiran asked. "Like, teachers should see predictions in their dashboard in real time."
This required turning a notebook into an API. This is where data science ends and ML engineering begins.
The notebook-to-production gap
A Jupyter notebook is great for exploration. It's terrible for production:
- You can't call a notebook from another service
- It runs cells in whatever order you last ran them
- It doesn't handle concurrent requests
- It crashes with no recovery
You need to wrap your model in a web API — a service that accepts requests and returns predictions.
FastAPI — the modern way to serve ML models
```python
# app.py
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Load the trained model at startup
model = joblib.load("dropout_model.pkl")

app = FastAPI(title="Student Dropout Prediction API")

# Define the input shape
class StudentData(BaseModel):
    grades: float
    attendance: float
    engagement_score: float
    assignments_completed: int

# Define the prediction endpoint
@app.post("/predict")
def predict_dropout(student: StudentData):
    features = np.array([[
        student.grades,
        student.attendance,
        student.engagement_score,
        student.assignments_completed,
    ]])

    probability = model.predict_proba(features)[0][1]
    prediction = "at_risk" if probability > 0.6 else "on_track"

    return {
        "student_status": prediction,
        "dropout_probability": round(float(probability), 3),
        "confidence": "high" if probability > 0.8 or probability < 0.2 else "medium",
    }

@app.get("/health")
def health():
    return {"status": "healthy"}
```
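The threshold logic inside `predict_dropout` is easier to unit-test when it lives in a plain function with no model or web framework attached. The sketch below pulls it out; the helper name `classify` and the constant names are illustrative assumptions, not part of the original app, but the cut-offs (0.6, 0.8, 0.2) are the ones used above.

```python
# Illustrative refactor: `classify` and the constant names are assumptions.
AT_RISK_THRESHOLD = 0.6   # same cut-off used in /predict
HIGH_CONFIDENCE_UPPER = 0.8
HIGH_CONFIDENCE_LOWER = 0.2


def classify(probability: float) -> dict:
    """Map a dropout probability to the response body returned by /predict."""
    status = "at_risk" if probability > AT_RISK_THRESHOLD else "on_track"
    is_extreme = (
        probability > HIGH_CONFIDENCE_UPPER
        or probability < HIGH_CONFIDENCE_LOWER
    )
    return {
        "student_status": status,
        "dropout_probability": round(float(probability), 3),
        "confidence": "high" if is_extreme else "medium",
    }
```

With these thresholds, a probability of 0.74 is labelled `at_risk` with `medium` confidence, because it is above 0.6 but not above 0.8. A probability of exactly 0.6 stays `on_track`, since the comparison is strict.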
```bash
# Run locally
uvicorn app:app --reload

# Test it
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"grades": 62, "attendance": 0.6, "engagement_score": 3.2, "assignments_completed": 8}'
```
Response: `{"student_status": "at_risk", "dropout_probability": 0.74, "confidence": "medium"}`
The teachers' dashboard calls this API for every student. Real-time, automatic, no Monday morning ritual.
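On the dashboard side, the call is an ordinary JSON POST. Here is a minimal client sketch using only the standard library. The function names `build_predict_request` and `fetch_prediction`, the base URL, and the timeout are assumptions for illustration; a real dashboard would point at wherever the API is deployed.

```python
import json
import urllib.request

# Assumed address: matches the local uvicorn command, change it for a real deployment.
BASE_URL = "http://localhost:8000"


def build_predict_request(student: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /predict request without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(student).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_prediction(student: dict, base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Send one student's features to the API and return the parsed JSON response."""
    request = build_predict_request(student, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_prediction({"grades": 62, "attendance": 0.6, "engagement_score": 3.2, "assignments_completed": 8})` returns the same JSON that the curl command prints. Keeping request construction separate from sending makes the client testable without a live server.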
Packaging it with Docker
```dockerfile
FROM python:3.11-slim
WORKDIR /app
# requirements.txt is assumed to list: fastapi, uvicorn, scikit-learn, joblib, numpy
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY dropout_model.pkl .
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
```bash
docker build -t dropout-predictor .
docker run -p 8000:8000 dropout-predictor
```
Now the model runs in a container. It can be deployed anywhere — a cloud server, Kubernetes, a serverless platform. The 47-click Monday ritual became a 200ms API call that happens automatically.
What about heavier models like LLMs?
For smaller models (scikit-learn, XGBoost), FastAPI + Docker is usually enough. For larger models (deep networks built with PyTorch or TensorFlow, LLMs), you'd use a dedicated serving framework: TorchServe, TF Serving, or Triton Inference Server. But the concept is the same: wrap the model in an API, put it in a container.
- Notebooks are for exploration; production needs a proper API
- FastAPI is a quick way to turn a model into an HTTP endpoint
- Always add a /health endpoint; load balancers and monitoring rely on it
- Dockerize the whole thing so it runs identically in dev and production