AI Ops Ch 1 / 9 Beginner

🤖

Meera trains her first model

What is machine learning really — and what happens after training?

⏱ 10 min 4 commands 4 takeaways

In this chapter

Meera

Data science intern at a Chennai edtech

The story

Meera joined an edtech startup in Chennai after her MSc in statistics. On her first day, her manager Kiran handed her a CSV file with 10,000 student records — grades, attendance, engagement scores — and said: "Build a model that predicts which students will drop out."

Meera opened the laptop. She'd studied ML theory for two years. She'd never actually shipped anything.

This is where most ML education ends and real work begins.

What is machine learning, really?

A normal program follows explicit rules you write. "If score < 40, flag as at-risk." You define the logic.

Machine learning flips this. You give the program examples — students who dropped out, students who didn't — and it figures out the patterns itself. The rules emerge from the data.

The output is called a model — a mathematical function that takes inputs (grades, attendance) and produces a prediction (drop out: yes/no, probability: 73%).
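To make the contrast concrete, here is a minimal sketch that puts a hand-written rule next to a model that learns its own cut-off. The eight scores and labels are invented for illustration, and a one-level decision tree stands in for "the model"; neither comes from Meera's actual dataset.

```python
from sklearn.tree import DecisionTreeClassifier


def rule_based_at_risk(score):
    """Explicit rule written by a person: the threshold is hard-coded."""
    return score < 40


# Toy examples (made up): one feature, the score, and whether the student dropped out.
scores = [[10], [20], [30], [35], [45], [50], [60], [80]]
dropped_out = [1, 1, 1, 1, 0, 0, 0, 0]

# A depth-1 tree can only learn a single "score <= threshold" rule,
# which makes the learned pattern easy to inspect.
learned = DecisionTreeClassifier(max_depth=1, random_state=0)
learned.fit(scores, dropped_out)

# The threshold below was never typed in by anyone; it came from the examples.
learned_threshold = learned.tree_.threshold[0]
print(f"Learned cut-off: score <= {learned_threshold:.1f}")

for score in (30, 50):
    by_rule = rule_based_at_risk(score)
    by_model = bool(learned.predict([[score]])[0])
    print(f"score={score}: rule says {by_rule}, model says {by_model}")
```

Both approaches agree on this toy data, but only the second one would shift its cut-off if the examples changed.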

The three phases Meera learned

Phase 1: Training. Show the model thousands of examples. It adjusts its internal parameters to get better at predicting. This is computationally expensive and happens once, or periodically when the model is retrained.

Phase 2: Evaluation. Test the model on examples it hasn't seen. Does it actually predict correctly? This tells you whether it learned real patterns or just memorised the training data.

Phase 3: Inference. Use the trained model to make predictions on new data. A student logs in → model predicts dropout probability → teacher gets an alert. This happens millions of times, so it must be fast and cheap.
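The "memorised the training data" risk from the evaluation phase can be seen in a small experiment. The sketch below uses synthetic data with deliberately noisy labels (an assumption for illustration, not Meera's student records) and compares an unrestricted decision tree with a shallow one on data they trained on versus data they have not seen.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the label depends on the first feature only,
# and roughly 20% of labels are flipped to simulate noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(1000, 4))
true_signal = X[:, 0] < 0.4
noise = rng.uniform(0, 1, size=1000) < 0.2
y = np.where(noise, ~true_signal, true_signal).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# An unrestricted tree can memorise every training row, noise included.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# A shallow tree is forced to keep only the strongest pattern.
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

for name, tree in (("deep", deep_tree), ("shallow", shallow_tree)):
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    print(f"{name:8s} train={train_acc:.2f} test={test_acc:.2f} gap={train_acc - test_acc:.2f}")
```

A large gap between training and test accuracy is the usual warning sign that the model memorised examples rather than learning a pattern that carries over to new students.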

The gap between Phase 1 and Phase 3 is where most ML projects die.

Meera's first model

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load data
df = pd.read_csv('students.csv')

# Features (inputs) and target (what we're predicting)
X = df[['grades', 'attendance', 'engagement_score', 'assignments_completed']]
y = df['dropped_out']  # 0 = stayed, 1 = dropped out

# Split: 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train the model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

# Evaluate
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
```

Output: 84% accuracy. Meera was excited. Kiran was cautious.

"Accuracy is not enough," Kiran said. "What's the false negative rate? If we miss a student who drops out, that's worse than a false alarm. Show me precision and recall."

Meera learned that metrics matter as much as the model.
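Kiran's point is easier to see with numbers. The sketch below computes the metrics by hand for ten hypothetical students (the labels and predictions are invented for illustration); in practice `sklearn.metrics.precision_score` and `recall_score` compute the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, where 1 = dropped out."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # dropout correctly flagged
        elif actual == 0 and predicted == 1:
            fp += 1  # false alarm
        elif actual == 1 and predicted == 0:
            fn += 1  # missed dropout: the costly mistake here
        else:
            tn += 1
    return tp, fp, fn, tn


def dropout_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,  # of flagged students, how many really dropped out
        "recall": recall,  # of real dropouts, how many were flagged
        "false_negative_rate": fn / (tp + fn) if (tp + fn) else 0.0,
        "f1": f1,
    }


# Hypothetical class: 2 of 10 students dropped out.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = dropout_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up class the model is right for 8 of 10 students, yet it misses one of the two students who actually dropped out. That is the gap between accuracy and recall that Kiran was asking about.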

Training vs Inference — the forgotten distinction

Training: typically runs on powerful hardware (often a GPU), can take hours or days, and happens infrequently.

Inference: runs on a CPU or small GPU, often has to return in milliseconds, and happens constantly.

A model that trains in 4 hours on a ₹10,000/month GPU server may need to run inference in 50 ms on a ₹500/month server. These are completely different engineering problems.

This is what AI Ops is about — the engineering that bridges them.
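One small piece of that bridge can be sketched in code: train once, save the model to disk, then load it elsewhere and time a single prediction. This is a minimal sketch under stated assumptions: the data is synthetic, the file name `dropout_model.joblib` is hypothetical, and `joblib` is just one common way to persist a scikit-learn model, not necessarily what a production team would choose.

```python
import os
import tempfile
import time

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the student data (4 numeric features).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 4))
y = (X[:, 1] < 50).astype(int)  # toy rule: low value in column 1 means "dropped out"

# --- Training side: slow, runs rarely ---
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)
model_path = os.path.join(tempfile.mkdtemp(), "dropout_model.joblib")  # hypothetical name
joblib.dump(model, model_path)

# --- Inference side: loads the saved file, answers one request at a time ---
served_model = joblib.load(model_path)
new_student = np.array([[62.0, 35.0, 40.0, 3.0]])

start = time.perf_counter()
dropout_probability = served_model.predict_proba(new_student)[0, 1]
latency_ms = (time.perf_counter() - start) * 1000

print(f"Dropout probability: {dropout_probability:.2f}")
print(f"Single-prediction latency: {latency_ms:.1f} ms")
```

The latency printed here depends entirely on the machine running it, so treat it as something to measure against a budget such as the 50 ms mentioned above rather than as a benchmark.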

Key takeaways

ML learns patterns from examples instead of following explicit rules

Three phases: Training (learn), Evaluation (test), Inference (use)

Accuracy alone is not enough — understand precision, recall, F1 for your use case

Training and inference have completely different infrastructure requirements

Commands from this chapter

$ pip install scikit-learn pandas

Install core ML libraries

>>> model.fit(X_train, y_train)

Train a model on data

>>> model.predict(X_test)

Make predictions on new data

>>> model.score(X_test, y_test)

Get the accuracy score on the test set