🏋️

Fine-tuning vs RAG vs Prompting

Choosing the right approach — when to use what and why

Intermediate 5 min read

1. Three ways to customise AI
2. Decision framework
3. Fine-tuning in practice

Three ways to customise AI

There are three main approaches to making an AI model do what you specifically need:

Prompting: Tell the model what to do in the prompt. Zero cost, immediate, but limited and not persistent.

RAG: Give the model access to specific documents at query time. Dynamic, updatable, great for knowledge.

Fine-tuning: Train the model further on your specific data. Changes the model itself. Expensive but powerful for style and behaviour.

Decision framework

Use prompting when: The task is clear, the model already knows how to do it, and you can express the requirements in a prompt. Start here — always.

Use RAG when: The model needs to know specific, private, or frequently changing information. Company documents, internal knowledge bases, real-time data.

Use fine-tuning when: You need consistent tone/style the model doesn't naturally have, specialised skills (medical, legal domain expertise), or you want to compress long system prompts into the model.

The typical path: Prompt engineer first → add RAG if knowledge is lacking → fine-tune only if prompt + RAG aren't enough.

Fine-tuning in practice

Fine-tuning requires labelled training examples (input-output pairs). A minimum of 50-100 examples to see improvement; 1,000+ for significant changes.

Cost: OpenAI fine-tuning of GPT-4o mini costs roughly $0.003/1K tokens for training. A 1,000-example dataset might cost $5-20 to train.

Good use cases: Customer service bot with your company's tone, domain-specific classification, formatting outputs in a very specific structure.

Bad use cases: Adding new knowledge the model doesn't have (use RAG), or fixing factual errors (the model will still hallucinate).

Keep learning

🧠

Fine-tuning vs RAG vs Prompting

Contents

Three ways to customise AI

Decision framework

Fine-tuning in practice

Keep learning

Large Language Models (LLMs)

RAG — Retrieval Augmented Generation

AI Agents