Vector Databases

The storage layer that makes RAG and semantic search possible

Intermediate 5 min read

Why regular databases cannot do semantic search

A regular SQL database answers questions like "find all rows where column = value." It matches on literal values.

But "find me documents about retirement planning" cannot be answered by exact matching — the word "retirement" might not appear in the most relevant documents. You need semantic matching: find documents with similar meaning.
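To make the limitation concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column name, and three sample documents are invented for illustration. A substring search for "retirement" finds the document that contains the word, not the one that is actually about planning for retirement.

```python
import sqlite3

# Hypothetical three-row corpus; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [
        ("How to build a pension pot and 401(k) savings before you stop working",),
        ("Retirement party ideas for a departing colleague",),
        ("Quarterly sales report for the northern region",),
    ],
)

# Literal substring match: it finds the word, not the meaning.
# (SQLite's LIKE is case-insensitive for ASCII letters by default.)
rows = conn.execute(
    "SELECT body FROM docs WHERE body LIKE ?", ("%retirement%",)
).fetchall()
keyword_hits = [body for (body,) in rows]

for body in keyword_hits:
    print(body)
```

The pension document is arguably the most relevant result, yet it never uses the search term, so the query skips it.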

This requires comparing meaning numerically — which requires embeddings (vectors). And searching through millions of vectors efficiently requires a specialised database designed for exactly that operation.

How vector databases work

Every piece of content (document, image, product description) is converted to a vector (list of numbers) using an embedding model. These vectors are stored in the database alongside the original content.

At query time:

1. Your question is also converted to a vector
2. The database finds the stored vectors closest to your query vector (cosine similarity or L2 distance)
3. It returns the matching content
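The store-then-query flow above can be sketched as a tiny in-memory store. This is an illustration under stated assumptions, not how any particular database is implemented: the `TinyVectorStore` class is hypothetical, the three-dimensional vectors are hand-written stand-ins for real embedding model output, and the search is a brute-force scan with cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical toy store: keeps each vector alongside its original content."""

    def __init__(self):
        self._items = []  # list of (vector, content) pairs

    def add(self, vector, content):
        self._items.append((vector, content))

    def query(self, query_vector, k=2):
        # Brute force: score every stored vector, then keep the k most similar.
        scored = [(cosine_similarity(query_vector, v), c) for v, c in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
# Hand-written vectors standing in for embeddings (an assumption for this demo).
# Axes, loosely: [personal finance, celebration, corporate sales]
store.add([0.9, 0.1, 0.0], "Building a pension pot before you stop working")
store.add([0.1, 0.9, 0.1], "Party ideas for a departing colleague")
store.add([0.1, 0.0, 0.9], "Quarterly sales report for the northern region")

# Pretend this is the embedding of the question "retirement planning".
query_vector = [0.8, 0.2, 0.1]
results = store.query(query_vector, k=2)

for score, content in results:
    print(f"{score:.3f}  {content}")
```

The pension document ranks first even though it shares no keyword with the question, because its vector points in nearly the same direction as the query vector. A scan like this touches every stored vector on every query, which is what stops working at large scale.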

The key challenge is doing this efficiently across millions or billions of vectors. Algorithms like HNSW (Hierarchical Navigable Small World) and IVF (Inverted File Index) create index structures that make approximate nearest-neighbour search fast enough for production.
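As a rough illustration of the IVF idea, the sketch below groups vectors into lists around centroids and, at query time, scans only the few lists whose centroids are closest to the query. It is a toy under loud assumptions: the data is random, the function names are invented, and the centroids are simply sampled from the data, whereas production IVF indexes typically train them with k-means. HNSW, which is graph-based, is not shown.

```python
import math
import random


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, n_lists, seed=0):
    # Simplification: sample centroids from the data instead of running k-means.
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_lists)
    lists = [[] for _ in range(n_lists)]
    for idx, vec in enumerate(vectors):
        closest = min(range(n_lists), key=lambda c: l2(vec, centroids[c]))
        lists[closest].append(idx)  # the "inverted file": centroid -> vector ids
    return centroids, lists


def search_ivf(query, vectors, centroids, lists, k=3, n_probe=2):
    # Probe only the n_probe lists whose centroids are nearest to the query.
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [idx for c in probe[:n_probe] for idx in lists[c]]
    candidates.sort(key=lambda idx: l2(query, vectors[idx]))
    return candidates[:k], len(candidates)


def search_exact(query, vectors, k=3):
    # Baseline: compare the query against every stored vector.
    order = sorted(range(len(vectors)), key=lambda idx: l2(query, vectors[idx]))
    return order[:k]


rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]

centroids, lists = build_ivf(vectors, n_lists=20)
approx_ids, n_scanned = search_ivf(query, vectors, centroids, lists, k=3, n_probe=4)
exact_ids = search_exact(query, vectors, k=3)

print("scanned", n_scanned, "of", len(vectors), "vectors")
print("approximate top 3:", approx_ids)
print("exact top 3:      ", exact_ids)
```

The approximate result can miss a true neighbour that sits in an unprobed list. Raising `n_probe` trades speed for accuracy, and probing every list reduces to the exact search.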

Choosing a vector database

ChromaDB: Best for prototyping and small projects. Can run in-process with no separate server. Free and open-source. Practical for collections in the millions of vectors rather than billions.

Pinecone: Managed cloud service. Easy to start, scales well, good Python SDK. Paid beyond the free tier. Good for production apps.

Qdrant: Open-source, self-hostable, written in Rust for performance. Best balance of scale and control. Free to self-host.

pgvector: Open-source PostgreSQL extension, also available on hosted Postgres platforms such as Supabase. If you already use Postgres, this avoids a separate service. Good for moderate scale.

Weaviate / Milvus: Enterprise-grade, billion-scale. Significant operational complexity.

Rule of thumb: Start with ChromaDB for prototyping. Graduate to Qdrant or pgvector for production. Choose Pinecone if you want managed infrastructure.