Vector Databases

The storage layer that makes RAG and semantic search possible

Intermediate 5 min read

1. Why regular databases cannot do semantic search
2. How vector databases work
3. Choosing a vector database

Why regular databases cannot do semantic search

A regular SQL database answers queries like "find all rows where column = value". It matches values exactly.

But a request like "find me documents about retirement planning" cannot be answered by exact matching: the most relevant documents might never use the word "retirement". You need semantic matching, which finds documents with similar meaning.
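To make the limitation concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three documents and the `keyword_search` helper are invented for this illustration. A keyword search for "retirement" returns the party-planning document, which is not what the user wants, and misses the pension document, which is.

```python
import sqlite3

# Hypothetical three-document corpus, invented for this illustration.
DOCS = [
    ("How to build a pension pot before you stop working",),
    ("Retirement party ideas for a departing colleague",),
    ("Best hiking trails in the Alps",),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany("INSERT INTO docs (body) VALUES (?)", DOCS)


def keyword_search(term):
    """Return every document whose text contains `term` as a substring."""
    rows = conn.execute(
        "SELECT body FROM docs WHERE body LIKE ?", (f"%{term}%",)
    ).fetchall()
    return [body for (body,) in rows]


hits = keyword_search("retirement")
print(hits)
```

The pension document is arguably the best answer to a question about retirement planning, yet it shares no keyword with the query, so string matching cannot surface it.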

This requires comparing meaning numerically, which in turn requires embeddings: vectors that represent the meaning of a piece of content. Searching through millions of vectors efficiently calls for a specialised database designed for exactly that operation.

How vector databases work

Every piece of content (document, image, product description) is converted to a vector (list of numbers) using an embedding model. These vectors are stored in the database alongside the original content.

At query time:

1. Your question is also converted to a vector.
2. The database finds the stored vectors closest to your query vector (by cosine similarity or L2 distance).
3. It returns the matching content.
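The query-time steps can be sketched as a brute-force search in plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are hand-written stand-ins for what an embedding model would produce, and `search` is a hypothetical helper, not the API of any particular vector database.

```python
import math

# Stored content and its toy "embeddings" (hand-written, for illustration only).
documents = [
    "Saving for a pension",
    "Planning a retirement party",
    "Hiking trails in the Alps",
]
vectors = [
    [0.9, 0.1, 0.0],
    [0.5, 0.8, 0.1],
    [0.0, 0.1, 0.9],
]


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def cosine_similarity(a, b):
    """Higher means more similar; 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm(a) * norm(b))


def l2_distance(a, b):
    """Lower means more similar; 0.0 means the vectors are identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, k=2, metric="cosine"):
    """Compare the query against every stored vector and return the top-k documents."""
    ids = range(len(vectors))
    if metric == "cosine":
        ranked = sorted(ids, key=lambda i: cosine_similarity(query, vectors[i]), reverse=True)
    else:
        ranked = sorted(ids, key=lambda i: l2_distance(query, vectors[i]))
    return [documents[i] for i in ranked[:k]]


# Step 1 would normally call an embedding model; here the query vector is made up.
query_vector = [1.0, 0.2, 0.0]
print(search(query_vector, k=2, metric="cosine"))
print(search(query_vector, k=2, metric="l2"))
```

Scanning every stored vector like this is exact but costs time proportional to the size of the collection, which is why it does not scale to large datasets.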

The key challenge is doing this efficiently across millions or billions of vectors. Algorithms like HNSW (Hierarchical Navigable Small World) and IVF (Inverted File Index) create index structures that make approximate nearest-neighbour search fast enough for production.
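To give a feel for how such an index saves work, here is a heavily simplified, IVF-style sketch in plain Python. It is a teaching toy under stated assumptions, not how any production library implements IVF: the 2-D points are invented, the clustering is a few rounds of naive k-means seeded with the first vectors, and `build_ivf` and `ivf_search` are hypothetical names.

```python
import math


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: l2(point, centroids[c]))


def mean(points):
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]


def assign(vectors, centroids):
    """Build the inverted lists: centroid index -> ids of the vectors in its cell."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest(v, centroids)].append(i)
    return lists


def build_ivf(vectors, n_lists, iterations=5):
    """Cluster the vectors with naive k-means, then return (centroids, inverted lists)."""
    centroids = [list(v) for v in vectors[:n_lists]]  # simplistic, deterministic seeding
    for _ in range(iterations):
        lists = assign(vectors, centroids)
        centroids = [
            mean([vectors[i] for i in ids]) if ids else centroids[c]
            for c, ids in enumerate(lists)
        ]
    return centroids, assign(vectors, centroids)


def ivf_search(query, vectors, centroids, lists, k=1, n_probe=1):
    """Scan only the n_probe cells whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:n_probe]
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: l2(query, vectors[i]))
    return candidates[:k]


# Invented 2-D data forming three loose clusters.
vectors = [
    [0.0, 0.0], [10.0, 10.0], [0.0, 10.0],
    [0.5, 0.2], [0.1, 0.6],
    [9.5, 10.2], [10.3, 9.8],
    [0.4, 9.7], [-0.2, 10.4],
]
centroids, lists = build_ivf(vectors, n_lists=3)
result = ivf_search([9.9, 10.1], vectors, centroids, lists, k=2, n_probe=1)
print(lists)
print(result)
```

With `n_probe=1` the search compares the query against three of the nine stored vectors instead of all nine. The trade-off is that a true neighbour sitting in an unprobed cell is missed, which is what makes the search approximate; raising `n_probe` trades speed for recall.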

Choosing a vector database

ChromaDB: Best for prototyping and small projects. Can run in-process, with no separate server to manage. Free and open-source. Practical up to collections in the millions of vectors.

Pinecone: Managed cloud service. Easy to start, scales well, good Python SDK. Paid beyond the free tier. Good for production apps.

Qdrant: Open-source, self-hostable, written in Rust for performance. Best balance of scale and control. Free to self-host.

pgvector: PostgreSQL extension, available on Supabase and other hosted Postgres services. If you already use Postgres, this avoids a separate service. Good for moderate scale.

Weaviate / Milvus: Enterprise-grade, billion-scale. Significant operational complexity.

Rule of thumb: Start with ChromaDB for prototyping. Graduate to Qdrant or pgvector for production. Choose Pinecone if you want managed infrastructure.

Keep learning

Large Language Models (LLMs)

RAG — Retrieval Augmented Generation

AI Agents