Learning Objectives
- Understand what MongoDB Atlas Vector Search is and how it integrates vector search into MongoDB's document model
- Identify the core features: HNSW indexing, the $vectorSearch aggregation stage, and metadata filtering
- Evaluate when Atlas Vector Search is the right choice vs. purpose-built vector databases
What Is MongoDB Atlas Vector Search?
MongoDB Atlas Vector Search is a native vector similarity search capability built into MongoDB Atlas, MongoDB's fully managed cloud database service. Announced in preview in 2023 and generally available since December 2023, it lets MongoDB users store vector embeddings alongside their regular documents and run semantic similarity search through the familiar MongoDB query interface.
The core value proposition is the same as Supabase Vector: if your team already uses MongoDB as your primary database, Atlas Vector Search means you can add RAG and semantic search to your application without adopting a separate vector database. Embeddings live in the same Atlas cluster as your application data, queried via a new $vectorSearch aggregation stage.
✅ Tip
Try Atlas Vector Search: Create a free MongoDB Atlas M0 cluster at mongodb.com/atlas — the free tier includes Atlas Vector Search. Create a vector search index from the Atlas UI and use $vectorSearch in your aggregation pipelines. Available in all Atlas regions.
How It Works
Storing Embeddings in MongoDB Documents
Embeddings are stored as arrays in MongoDB documents:
// Insert a document with its embedding.
// Assumes `collection` is a connected Collection from the official Node.js driver.
import { ObjectId } from "mongodb"

await collection.insertOne({
  _id: new ObjectId(),
  title: "Introduction to Vector Search",
  content: "Vector search enables semantic similarity retrieval...",
  category: "ai",
  author: "Jane Smith",
  published: new Date("2026-01-15"),
  embedding: [0.0231, -0.0412, 0.1834 /* , ... 1536 values in total */]
})
Standard MongoDB document structure — no schema changes required beyond adding the embedding array field.
Creating a Vector Search Index
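Vector search indexes are defined separately from traditional MongoDB indexes. You create them through the Atlas UI, the Atlas Administration API or CLI, or the search index helpers in mongosh and recent drivers, using a definition like this: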
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "category"
    },
    {
      "type": "filter",
      "path": "published"
    }
  ]
}
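The index supports cosine, euclidean, and dot product similarity; set the similarity field to cosine, euclidean, or dotProduct. Fields declared with type filter can then be used to pre-filter documents before the vector search runs.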
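As a rough sketch, the same definition can be assembled in Python and, assuming PyMongo 4.7 or later and an Atlas cluster, submitted through the driver. The helper names build_vector_index_definition and create_vector_index are hypothetical; SearchIndexModel and create_search_index are PyMongo calls, imported lazily here so the builder can be tried without a live cluster.

```python
from typing import Any


def build_vector_index_definition(
    path: str,
    num_dimensions: int,
    similarity: str = "cosine",
    filter_paths: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Build an index definition shaped like the JSON above (hypothetical helper)."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be positive")
    fields: list[dict[str, Any]] = [
        {
            "type": "vector",
            "path": path,
            "numDimensions": num_dimensions,
            "similarity": similarity,
        }
    ]
    # One filter entry per metadata field that queries will pre-filter on.
    fields.extend({"type": "filter", "path": p} for p in filter_paths)
    return {"fields": fields}


def create_vector_index(collection, definition, name="vector_index"):
    """Submit the definition via PyMongo (assumption: PyMongo >= 4.7, Atlas cluster)."""
    from pymongo.operations import SearchIndexModel  # lazy import: only needed here

    model = SearchIndexModel(definition=definition, name=name, type="vectorSearch")
    return collection.create_search_index(model)


definition = build_vector_index_definition(
    "embedding", 1536, "cosine", ("category", "published")
)
print(definition)
```

Keeping the builder separate from the driver call makes it easy to check that numDimensions matches the embedding model before anything is sent to Atlas.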
The $vectorSearch Aggregation Stage
Queries use a new aggregation pipeline stage:
const results = await collection.aggregate([
  {
    $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: queryEmbedding, // your query converted to a vector
      numCandidates: 100,          // HNSW candidates to consider
      limit: 5,                    // top results to return
      filter: {
        category: "ai",
        published: { $gte: new Date("2026-01-01") }
      }
    }
  },
  {
    $project: {
      title: 1,
      content: 1,
      score: { $meta: "vectorSearchScore" }
    }
  }
]).toArray()
Each result includes a score field (0–1, higher = more similar), populated in the $project stage from $meta: "vectorSearchScore".
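To make the 0–1 range concrete, the sketch below computes cosine similarity in plain Python and maps it to a score. The normalization used here, (1 + cosine) / 2, is an assumption based on how MongoDB documents scoring for cosine indexes; verify it against the current Atlas documentation before relying on exact values. The function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Raw cosine similarity in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def cosine_score(a: list[float], b: list[float]) -> float:
    """Map cosine similarity onto [0, 1] (assumed normalization: (1 + cos) / 2)."""
    return (1 + cosine_similarity(a, b)) / 2


print(cosine_score([1.0, 0.0], [1.0, 0.0]))   # identical direction
print(cosine_score([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(cosine_score([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Under this assumption, unrelated (orthogonal) vectors score around 0.5 rather than 0, which is worth keeping in mind when choosing a score threshold.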
💡 Key Concept
Pre-filtering vs. post-filtering: Atlas Vector Search supports pre-filtering, which applies metadata filters before the HNSW search rather than after it. This matters for correctness: post-filtering (search all vectors, then discard non-matching results) can return fewer than limit results because some of the nearest vectors get filtered out. With pre-filtering, every candidate already satisfies the filter, so the query returns up to limit matching results whenever enough matching documents exist.
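The difference can be illustrated with a toy example. The sketch below uses exact brute-force search over six made-up two-dimensional vectors rather than HNSW, so it only demonstrates the filtering order, not Atlas internals; all data and helper names are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(docs, query, k):
    """Exact nearest neighbours by cosine similarity (stand-in for the ANN search)."""
    return sorted(docs, key=lambda d: cosine(d["embedding"], query), reverse=True)[:k]


docs = [
    {"title": "db-1", "category": "database", "embedding": [1.0, 0.0]},
    {"title": "db-2", "category": "database", "embedding": [0.9, 0.1]},
    {"title": "db-3", "category": "database", "embedding": [0.8, 0.2]},
    {"title": "ai-1", "category": "ai", "embedding": [0.85, 0.15]},
    {"title": "ai-2", "category": "ai", "embedding": [0.1, 0.9]},
    {"title": "ai-3", "category": "ai", "embedding": [0.0, 1.0]},
]
query = [1.0, 0.0]
limit = 3

# Post-filtering: search everything, then discard documents that fail the filter.
post_filtered = [d for d in top_k(docs, query, limit) if d["category"] == "ai"]

# Pre-filtering: restrict to matching documents first, then search.
pre_filtered = top_k([d for d in docs if d["category"] == "ai"], query, limit)

print("post-filter:", [d["title"] for d in post_filtered])
print("pre-filter: ", [d["title"] for d in pre_filtered])
```

Post-filtering keeps only the one "ai" document that happened to rank in the global top three, while pre-filtering returns all three matching documents.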
HNSW Index
Atlas Vector Search uses HNSW (Hierarchical Navigable Small World) graphs for approximate nearest neighbor search:
- Among the fastest algorithms for high-dimensional vector search
- Low query latency on typical datasets; actual latency depends on vector count, dimensionality, numCandidates, and cluster tier
- The numCandidates parameter controls the recall/speed tradeoff (higher = more accurate, slower)
- Supports billions of vectors with distributed Atlas clusters
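One way to handle the numCandidates tradeoff is to derive a default from limit and validate explicit values. In the sketch below, the 20x oversampling factor and the 10,000 cap follow guidance in MongoDB's documentation at the time of writing; treat both as assumptions to verify. The helper build_vector_search_stage is hypothetical.

```python
from typing import Any, Optional

MAX_NUM_CANDIDATES = 10_000  # assumed upper bound accepted by $vectorSearch
DEFAULT_OVERSAMPLING = 20    # assumed starting point: numCandidates = 20 x limit


def build_vector_search_stage(
    query_vector: list[float],
    limit: int,
    index: str = "vector_index",
    path: str = "embedding",
    num_candidates: Optional[int] = None,
    metadata_filter: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a $vectorSearch stage with a sanity-checked numCandidates."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if num_candidates is None:
        num_candidates = min(limit * DEFAULT_OVERSAMPLING, MAX_NUM_CANDIDATES)
    if not limit <= num_candidates <= MAX_NUM_CANDIDATES:
        raise ValueError("numCandidates must be between limit and the maximum")
    stage: dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": query_vector,
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if metadata_filter:
        stage["filter"] = metadata_filter
    return {"$vectorSearch": stage}


stage = build_vector_search_stage([0.1, 0.2, 0.3], limit=5)
print(stage["$vectorSearch"]["numCandidates"])
```

Raising num_candidates explicitly is the knob to turn when recall matters more than latency; lowering it toward limit trades accuracy for speed.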
Integration with MongoDB Ecosystem
Atlas Vector Search works naturally with the MongoDB ecosystem:
- Atlas App Services: Trigger embedding generation automatically on document insert/update
- Atlas Charts: Visualize vector search results and analytics
- Compass (GUI): Explore and test vector search queries in the MongoDB Compass desktop app
- Change Streams: React to document changes and update embeddings in real time
- Aggregation pipelines: Combine $vectorSearch with other pipeline stages such as $match, $lookup, and $group
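As an illustration of the Change Streams item, the sketch below re-embeds a document when its text changes. It assumes a PyMongo collection and an embed callable that you supply; needs_reembedding and sync_embeddings are hypothetical names. The check on updatedFields is there so that the embedding update itself does not trigger another round of embedding.

```python
from typing import Any, Callable


def needs_reembedding(change: dict[str, Any], text_field: str = "content") -> bool:
    """Decide from a change event whether the embedding is stale."""
    op = change.get("operationType")
    if op in ("insert", "replace"):
        return text_field in (change.get("fullDocument") or {})
    if op == "update":
        updated = change.get("updateDescription", {}).get("updatedFields", {})
        return text_field in updated
    return False


def sync_embeddings(
    collection,
    embed: Callable[[str], list[float]],
    text_field: str = "content",
) -> None:
    """Watch the collection and refresh embeddings (blocks until the stream closes)."""
    # full_document="updateLookup" asks the server to include the current document
    # on update events, so the new text is available without a second read.
    with collection.watch(full_document="updateLookup") as stream:
        for change in stream:
            doc = change.get("fullDocument")
            if doc is None or not needs_reembedding(change, text_field):
                continue
            collection.update_one(
                {"_id": doc["_id"]},
                {"$set": {"embedding": embed(doc[text_field])}},
            )


insert_event = {"operationType": "insert", "fullDocument": {"_id": 1, "content": "hello"}}
embedding_only_update = {
    "operationType": "update",
    "updateDescription": {"updatedFields": {"embedding": [0.1, 0.2]}},
}
print(needs_reembedding(insert_event), needs_reembedding(embedding_only_update))
```

In production you would likely add resume tokens and error handling; this sketch only shows the decision logic and the shape of the loop.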
LangChain and LlamaIndex Integration
Both major AI frameworks support Atlas Vector Search:
# LangChain
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings

vectorstore = MongoDBAtlasVectorSearch(
    collection=collection,
    embedding=OpenAIEmbeddings(),
    index_name="vector_index",
    text_key="content"
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
Pricing
Atlas Vector Search is available on all MongoDB Atlas tiers, including the free tier:
- Free tier (M0): 512MB storage, Vector Search available, ideal for development
- Entry dedicated tier: 10GB storage, dedicated cluster, production-ready
- Larger dedicated tier: 40GB storage, higher throughput, global clusters
- Usage-based tier: pay per operation, auto-scales, good for variable workloads
Strengths
- No separate service: If you use MongoDB Atlas, vector search is already available — no new infrastructure
- Document model: Embeddings live alongside all document fields — rich metadata available for filtering
- Pre-filtering: Accurate metadata filtering before HNSW search ensures correct result counts
- HNSW performance: Fast approximate search suitable for production workloads
- Familiar MongoDB operations: $vectorSearch in aggregation pipelines, with the same query interface as the rest of MongoDB
- Atlas ecosystem: Integrates with App Services, Charts, Change Streams, and Compass
- Global clusters: Multi-region deployment for low-latency queries worldwide
Limitations & Considerations
- Atlas-only: Only available in MongoDB Atlas (managed cloud); not available for self-hosted MongoDB Community Edition
- Embedding generation not included: You must generate embeddings externally (OpenAI, Cohere, etc.) — Atlas handles storage and search, not embedding generation
- Separate index management: Vector search indexes are not created with the regular createIndex call; use the Atlas UI, the Atlas Administration API or CLI, or the dedicated search index helpers (createSearchIndex in mongosh and recent drivers)
- Cost at scale: Atlas dedicated clusters can cost more than Supabase or open-source self-hosted options, which matters for budget-constrained teams
- Newer feature: Less community tooling and tutorials than Pinecone; documentation still maturing
Best Use Cases
| Task | Why Atlas Vector Search |
|---|---|
| Existing MongoDB Atlas applications | Add semantic search without new infrastructure |
| Rich document metadata filtering | Complex filter conditions on document fields pre-search |
| Real-time RAG with change streams | Update embeddings automatically as documents change |
| Multi-region applications | Atlas global clusters for geographically distributed search |
| MongoDB-native developer teams | No context switch — same driver, same query patterns |
When to choose alternatives:
- Don't use MongoDB → Supabase Vector (PostgreSQL) or Pinecone
- Need to self-host → Qdrant or Weaviate
- Local development only → Chroma
- Need 1 billion+ vectors at lowest cost → Pinecone serverless
Getting Started
from pymongo import MongoClient
from openai import OpenAI

client = MongoClient("mongodb+srv://...")
db = client["my_database"]
collection = db["documents"]
openai_client = OpenAI()

def get_embedding(text: str) -> list[float]:
    return openai_client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    ).data[0].embedding

# Insert a document with embedding
doc = {
    "title": "Getting started with MongoDB Vector Search",
    "content": "MongoDB Atlas Vector Search enables semantic similarity queries...",
    "category": "database",
    "embedding": get_embedding("Getting started with MongoDB Vector Search")
}
collection.insert_one(doc)

# Semantic search (after creating vector search index in Atlas UI)
query_embedding = get_embedding("How do I add vector search to MongoDB?")
results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_embedding,
            "numCandidates": 50,
            "limit": 5
        }
    },
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}}
])
for doc in results:
    print(doc)
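A newly created vector search index is built in the background, so a query issued immediately afterwards may return nothing. The sketch below polls until the index reports itself queryable. It assumes PyMongo's list_search_indexes and a queryable field in the returned index documents; wait_until_queryable is a hypothetical helper, and the timeout values are arbitrary.

```python
import time


def wait_until_queryable(
    collection,
    index_name: str = "vector_index",
    timeout_s: float = 120.0,
    poll_s: float = 5.0,
    sleep=time.sleep,
) -> bool:
    """Poll the search index until it is queryable; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


class FakeCollection:
    """Stand-in for a PyMongo collection: the index becomes queryable on the third poll."""

    def __init__(self) -> None:
        self.calls = 0

    def list_search_indexes(self, name):
        self.calls += 1
        return [{"name": name, "queryable": self.calls >= 3}]


fake = FakeCollection()
ready = wait_until_queryable(fake, poll_s=0, sleep=lambda _seconds: None)
print(ready, fake.calls)
```

With a real cluster you would pass the collection from the snippet above and call wait_until_queryable(collection) between index creation and the first $vectorSearch query.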
✅ Tip
For MongoDB teams: If your application already runs on MongoDB Atlas, Atlas Vector Search is the fastest path to adding RAG — no new service, no new SDK, no data synchronization. Create the vector search index in Atlas UI (takes under a minute), add an embedding field to your documents, and your existing aggregation pipelines can now include $vectorSearch. The ability to combine vector similarity with MongoDB's rich query expressions ($lookup, $match, $group) makes it more powerful than many standalone vector databases for complex retrieval scenarios.
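A sketch of what combining vector similarity with other stages could look like in Python. The pipeline finds similar documents, drops weak matches, and counts hits per author. The min_score threshold of 0.75 and the helper name build_retrieval_pipeline are illustrative assumptions; note that $vectorSearch has to be the first stage of the pipeline.

```python
from typing import Any


def build_retrieval_pipeline(
    query_vector: list[float],
    limit: int = 5,
    min_score: float = 0.75,  # arbitrary threshold, tune for your data
) -> list[dict[str, Any]]:
    """Vector search followed by ordinary aggregation stages."""
    return [
        # $vectorSearch must come first in the pipeline.
        {
            "$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": limit * 20,
                "limit": limit,
            }
        },
        # Expose the similarity score as a regular field.
        {"$project": {"title": 1, "author": 1, "score": {"$meta": "vectorSearchScore"}}},
        # Discard weak matches.
        {"$match": {"score": {"$gte": min_score}}},
        # Summarize the remaining hits per author.
        {"$group": {"_id": "$author", "hits": {"$sum": 1}, "best_score": {"$max": "$score"}}},
        {"$sort": {"best_score": -1}},
    ]


pipeline = build_retrieval_pipeline([0.1, 0.2, 0.3])
print([next(iter(stage)) for stage in pipeline])
```

The resulting list can be passed to collection.aggregate(pipeline) on a collection that has the vector_index index defined.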
Key Takeaways
- MongoDB Atlas Vector Search adds native vector similarity search to MongoDB Atlas via HNSW indexing and a new $vectorSearch aggregation stage
- Embeddings are stored as array fields in regular MongoDB documents: same collection, same infrastructure
- Pre-filtering support ensures accurate metadata-filtered vector search without result count issues
- Integrates with LangChain, LlamaIndex, and Atlas's own ecosystem (App Services, Charts, Change Streams)
- Best for teams already using MongoDB Atlas who want to add semantic search without adopting a separate vector database; not available for self-hosted MongoDB