Learning Objectives
- Understand what Chroma is and why it's a popular choice for local AI development and prototyping
- Identify Chroma's core modes: in-memory (ephemeral), persistent (local file), and client-server
- Evaluate when Chroma is the right tool vs. production-grade managed services like Pinecone
What Is Chroma?
Chroma is an open-source vector database built for AI application development. Unlike managed cloud services like Pinecone or Supabase Vector, Chroma is designed to run wherever your code runs — in memory for quick experiments, persisted to a local file for development projects, or as a server for team deployments. Its Python-first API is deliberately minimal: create a collection, add documents, query for similar results — in under 10 lines of code.
Chroma's defining characteristic is its focus on the development experience. It has built-in embedding functions (so you don't need to call a separate embedding API for prototyping), an extremely simple API that hides infrastructure complexity, and zero configuration required to get started. These traits make it a widely used vector database for learning, prototyping, and building the first version of an AI feature.
✅Tip
Try Chroma: Install with pip install chromadb. No account required, no cloud service, no API key for basic use. Full documentation at trychroma.com. Apache 2.0 license.
Core Modes
In-Memory (Default)
The simplest mode — no persistence, data is lost when the process ends:
import chromadb
client = chromadb.Client() # ephemeral, in-memory
collection = client.create_collection("my_docs")
collection.add(
documents=["AI is transforming software", "Vector databases store embeddings"],
ids=["doc1", "doc2"]
)
results = collection.query(query_texts=["how is AI changing software?"], n_results=2)
print(results["documents"])
No embedding API call needed: by default Chroma embeds text locally with the all-MiniLM-L6-v2 sentence-transformers model, which it downloads the first time it is used. After that one-time download, basic prototyping needs no external service or API key.
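To make "query for similar results" concrete, here is a minimal sketch of what a similarity query does conceptually: score every stored vector against the query vector and keep the best matches. The three-dimensional vectors and the helper names (`cosine_similarity`, `top_k`) are hypothetical. Chroma itself uses an approximate index and a configurable distance metric rather than this exhaustive loop.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k=2):
    """Return the k (id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings"; real models emit hundreds of dimensions.
stored = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.9, 0.2],
    "doc3": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

ranked = top_k(query, stored, k=2)
print([doc_id for doc_id, _ in ranked])  # ['doc1', 'doc2']
```

The exhaustive loop is fine for a handful of vectors. An index exists precisely so that large collections do not need a full scan on every query.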
Persistent (Local File)
Data persisted to a local directory, survives process restarts:
client = chromadb.PersistentClient(path="./chroma_storage")
One line changes from in-memory mode: swap chromadb.Client() for chromadb.PersistentClient(path=...). Ideal for development projects where you want to build the index once and query repeatedly without rebuilding.
Client-Server Mode
For team environments or when you need to share the vector store:
# Start the server: chroma run --host 0.0.0.0 --port 8000
# Connect from your application
client = chromadb.HttpClient(host="localhost", port=8000)
Docker deployment available for easy team sharing or staging environments.
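Since the three modes differ only in how the client is constructed, one way to keep application code unchanged is to choose the constructor from configuration. The sketch below is an assumption about how you might organize that. The settings keys (`mode`, `path`, `host`, `port`) and the helper names are hypothetical, while `Client`, `PersistentClient`, and `HttpClient` are the constructors shown above.

```python
def client_spec(settings):
    """Map a settings dict to (constructor_name, kwargs) for chromadb."""
    mode = settings.get("mode", "memory")
    if mode == "memory":
        return "Client", {}
    if mode == "persistent":
        return "PersistentClient", {"path": settings["path"]}
    if mode == "server":
        return "HttpClient", {"host": settings["host"], "port": settings["port"]}
    raise ValueError(f"unknown mode: {mode!r}")

def make_client(settings):
    """Build the client; chromadb is imported lazily so client_spec works without it."""
    import chromadb
    name, kwargs = client_spec(settings)
    return getattr(chromadb, name)(**kwargs)

print(client_spec({"mode": "persistent", "path": "./chroma_storage"}))
print(client_spec({"mode": "server", "host": "localhost", "port": 8000}))
```

Calling `make_client(settings)` would then return a client for whichever mode the settings name, and the rest of the application can treat all three the same way.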
Built-In Embedding Functions
Chroma ships with adapters for major embedding providers — no separate embedding code required:
from chromadb.utils import embedding_functions
# Use OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
api_key="YOUR_OPENAI_KEY",
model_name="text-embedding-3-small"
)
# Use Cohere embeddings
cohere_ef = embedding_functions.CohereEmbeddingFunction(api_key="YOUR_COHERE_KEY")
# Use a local sentence-transformers model (no API key needed)
st_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
model_name="all-MiniLM-L6-v2"
)
collection = client.create_collection("my_docs", embedding_function=openai_ef)
Chroma handles embedding generation automatically when you add and query documents — no separate embedding pipeline required for prototyping.
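An embedding function is, in essence, a callable that takes a list of texts and returns one fixed-length vector per text. The toy class below illustrates that contract with a word-hashing trick. `ToyHashEmbedding` is a hypothetical stand-in with no semantic understanding, and it is not one of Chroma's adapters. The parameter is named `input` on the assumption that it mirrors the argument name Chroma's embedding functions use.

```python
import hashlib
import math

class ToyHashEmbedding:
    """Hypothetical stand-in: hashes words into a small fixed-size vector."""

    def __init__(self, dimensions=8):
        self.dimensions = dimensions

    def __call__(self, input):
        # One vector per input text, in the same order.
        return [self._embed(text) for text in input]

    def _embed(self, text):
        vec = [0.0] * self.dimensions
        for word in text.lower().split():
            digest = hashlib.sha256(word.encode("utf-8")).digest()
            vec[digest[0] % self.dimensions] += 1.0  # bucket chosen by the word's hash
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]  # unit length, so scores are comparable

ef = ToyHashEmbedding()
vectors = ef(["vector databases store embeddings", "AI is transforming software"])
print(len(vectors), len(vectors[0]))  # 2 8
```

Real adapters follow the same shape but delegate the text-to-vector step to a trained model, which is what makes nearby vectors mean similar text.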
Documents, Embeddings, and Metadata
Chroma's data model has four fields per record. Only ids is required, and each record needs either documents or embeddings:
| Field | Type | Description |
|---|---|---|
| documents | string | The text content (Chroma embeds it for you) |
| embeddings | list[float] | Pre-computed embedding (if you manage embedding yourself) |
| metadatas | dict | Arbitrary key-value metadata for filtering |
| ids | string | Unique identifier (required) |
collection.add(
documents=["Machine learning enables pattern recognition", "LLMs are trained on text"],
metadatas=[{"source": "textbook", "chapter": 1}, {"source": "blog", "date": "2026-01"}],
ids=["chunk-1", "chunk-2"]
)
# Query with metadata filter
results = collection.query(
query_texts=["how does machine learning work?"],
n_results=3,
where={"source": "textbook"} # only return textbook sources
)
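The `where` argument acts as a predicate over each record's metadata. The sketch below imitates that behavior in plain Python for a small subset of filter forms: direct equality, comparison operators such as `$gte`, and `$and`/`$or` combinations. It is an illustration of the idea, not Chroma's implementation. The `matches` and `filter_ids` helpers and the extra `chunk-3` record are hypothetical.

```python
import operator

# Comparison operators handled by this toy matcher.
OPERATORS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
}

def matches(metadata, where):
    """Return True if metadata satisfies the where-style filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif key not in metadata:
            return False  # toy rule: a missing field never matches
        elif isinstance(condition, dict):
            for op, target in condition.items():
                if not OPERATORS[op](metadata[key], target):
                    return False
        elif metadata[key] != condition:
            return False
    return True

records = [
    {"id": "chunk-1", "metadata": {"source": "textbook", "chapter": 1}},
    {"id": "chunk-2", "metadata": {"source": "blog", "date": "2026-01"}},
    {"id": "chunk-3", "metadata": {"source": "textbook", "chapter": 4}},  # hypothetical
]

def filter_ids(where):
    return [r["id"] for r in records if matches(r["metadata"], where)]

print(filter_ids({"source": "textbook"}))  # ['chunk-1', 'chunk-3']
print(filter_ids({"$and": [{"source": "textbook"}, {"chapter": {"$gte": 2}}]}))  # ['chunk-3']
```

In a real query the filter and the similarity ranking work together: only records that pass the filter are candidates for the top `n_results`.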
💡Key Concept
When Chroma's simplicity is a feature, not a limitation: Chroma deliberately abstracts away index configuration, distance metric tuning, and cluster management. For the 80% of AI projects that need "add documents, search for similar documents" without high-throughput requirements, this abstraction is a feature — less code, faster iteration, easier onboarding. The limitation appears when you need the performance tuning, filtering complexity, or scale that managed vector databases provide.
LangChain Integration
Chroma is one of the most popular vector stores in LangChain tutorials:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
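# Load and split documents
# raw_documents is assumed to be a list of LangChain Document objects loaded elsewhere
# chunk_size and chunk_overlap are illustrative values, not recommendations
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = text_splitter.split_documents(raw_documents)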
# Create Chroma vector store from documents (embeds and stores automatically)
vectorstore = Chroma.from_documents(docs, OpenAIEmbeddings(), persist_directory="./chroma")
# Use as retriever in a RAG chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
Many LangChain RAG tutorials use Chroma as their vector store, which makes it a familiar starting point for newcomers.
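The splitting step matters because each chunk is embedded and retrieved on its own. As a rough illustration of chunking with overlap, here is a fixed-width splitter in plain Python. It is a deliberately simplified sketch: the `split_with_overlap` helper is hypothetical, and LangChain's RecursiveCharacterTextSplitter tries to break on natural separators rather than at fixed character offsets.

```python
def split_with_overlap(text, chunk_size=40, overlap=10):
    """Cut text into fixed-width chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

text = "Chroma stores embeddings. " * 4
chunks = split_with_overlap(text, chunk_size=40, overlap=10)

for chunk in chunks:
    print(repr(chunk))

# Neighboring chunks repeat the boundary text, so a sentence cut at a
# chunk edge still appears whole in at least one chunk when it is short enough.
print(chunks[0][-10:] == chunks[1][:10])  # True
```

Larger overlap preserves more context across boundaries at the cost of storing and embedding more duplicated text.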
Pricing
Chroma is completely free and open source (Apache 2.0):
- Self-hosted (local or server): Free forever — run in-process, locally, or on any server
- Chroma Cloud: Managed cloud offering (in development as of early 2026) — pricing TBD at release
- No usage limits, no vector count limits, no API keys required for local use
Strengths
- Zero friction setup: pip install chromadb and you're ready, with no account, no cloud, and no configuration
- Built-in embeddings: Ships with adapters for OpenAI, Cohere, and local sentence-transformers
- All modes in one library: In-memory, persistent, and client-server without changing your application code
- Python-first API: Minimal, clean, and Pythonic — smallest possible API surface for the core use case
- LangChain default: Used in nearly all LangChain tutorials; vast community resources
- Open source (Apache 2.0): No vendor lock-in; run anywhere, audit the code, modify as needed
Limitations & Considerations
- Not production-grade at scale: The single-node design limits throughput and capacity compared with distributed, dedicated vector databases, and performance degrades as collections grow; not recommended for more than a few million vectors in production
- Limited filtering: Metadata filtering is less powerful than Pinecone's or pgvector's query expressions
- Single-node: The open-source server is single-node; no built-in clustering or high availability
- No managed cloud (yet): Chroma Cloud is in development; until release, production deployments require self-managing the server
- Limited multi-tenancy: Less mature multi-tenant support compared to Pinecone's namespaces or pgvector's RLS
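A quick back-of-the-envelope calculation helps put the scale limits in context. The sketch below estimates the raw storage for vectors alone, assuming 4-byte (float32) values and ignoring index structures, documents, and metadata, all of which add to the total. The dimensions used are 384 for all-MiniLM-L6-v2 and 1536 for text-embedding-3-small. The `raw_vector_bytes` helper is hypothetical.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Bytes needed for the vectors alone, assuming float32 values."""
    return num_vectors * dimensions * bytes_per_value

models = [("all-MiniLM-L6-v2", 384), ("text-embedding-3-small", 1536)]

for label, dims in models:
    gib = raw_vector_bytes(1_000_000, dims) / 1024**3
    print(f"{label}: {gib:.2f} GiB for 1M vectors")
# all-MiniLM-L6-v2: 1.43 GiB for 1M vectors
# text-embedding-3-small: 5.72 GiB for 1M vectors
```

On a single node, figures like these compete with everything else on the machine for memory, which is one reason the vector count and the embedding dimension both matter when judging whether a deployment will fit.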
Best Use Cases
| Task | Why Chroma |
|---|---|
| Learning RAG and vector search | Fastest path from zero to working vector search |
| Prototyping AI features | Build and iterate on vector search without infrastructure overhead |
| Local development environments | In-process or persistent local mode; no cloud service needed |
| Small-scale production RAG | Works well for applications with under 1 million vectors and moderate query volume |
| LangChain and LlamaIndex tutorials | The de facto standard vector store in AI framework documentation |
When to choose alternatives:
- Production at scale (1 million+ vectors, high QPS) → Pinecone or Qdrant
- Already using Supabase or PostgreSQL → Supabase Vector
- Already using MongoDB Atlas → MongoDB Atlas Vector Search
- Need self-hosted production-grade performance → Qdrant or Weaviate
Getting Started
# Install first (shell command): pip install chromadb openai
import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
# Setup
client = chromadb.PersistentClient(path="./my_chroma_db")
ef = OpenAIEmbeddingFunction(api_key="YOUR_OPENAI_KEY", model_name="text-embedding-3-small")
collection = client.get_or_create_collection("knowledge_base", embedding_function=ef)
# Add documents (Chroma embeds them automatically)
collection.add(
documents=[
"Retrieval-Augmented Generation combines search with LLMs",
"Chroma is an open-source vector database for AI apps",
"Embeddings convert text to numerical vectors"
],
ids=["rag-1", "chroma-1", "embeddings-1"]
)
# Query (Chroma embeds the query and returns similar docs)
results = collection.query(
query_texts=["How do vector databases work?"],
n_results=2
)
print(results["documents"][0])
# Example output (the exact ranking depends on the embedding model):
# ['Embeddings convert text to numerical vectors', 'Chroma is an open-source vector database for AI apps']
That's a complete working vector search system in under 20 lines. Switch PersistentClient to Client() for in-memory mode.
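Query results come back as parallel lists grouped per query, which is why the example indexes `results["documents"][0]`. The helper below is a small sketch for turning one query's results into rows that are easier to loop over. `rows_for_query` is a hypothetical name, and `fake_results` is a hand-written stand-in with made-up distances, shaped on the assumption that `ids`, `documents`, and `distances` are each a list of lists.

```python
def rows_for_query(results, query_index=0):
    """Zip the parallel result lists for one query into a list of dicts."""
    ids = results["ids"][query_index]
    documents = results["documents"][query_index]
    distances = results["distances"][query_index]
    return [
        {"id": doc_id, "document": doc, "distance": dist}
        for doc_id, doc, dist in zip(ids, documents, distances)
    ]

# Hand-written stand-in for a query result; the distances are made up.
fake_results = {
    "ids": [["embeddings-1", "chroma-1"]],
    "documents": [[
        "Embeddings convert text to numerical vectors",
        "Chroma is an open-source vector database for AI apps",
    ]],
    "distances": [[0.42, 0.57]],
}

rows = rows_for_query(fake_results)
for row in rows:
    print(f'{row["distance"]:.2f}  {row["id"]}  {row["document"]}')
```

A smaller distance means a closer match, so the first row is the best hit for that query.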
✅Tip
For learners: Chroma is the ideal starting point for anyone learning RAG and vector search. The built-in embedding function means you can run working examples without an API key — using local sentence-transformers models. Once you understand the core concept (store documents, query by similarity), you can apply that knowledge to Pinecone, Supabase Vector, or any other vector database. The API concepts transfer directly.
Key Takeaways
- Chroma is an open-source vector database designed for fast setup and easy prototyping: pip install chromadb and you're running in minutes with no external dependencies
- Supports in-memory, persistent local, and client-server modes with the same API, so you can pick the right mode for your stage without changing application code
- Built-in embedding functions eliminate a separate embedding pipeline during prototyping — pass documents and Chroma handles embedding automatically
- The de facto standard vector store in LangChain and LlamaIndex tutorials; vast community resources for getting started
- Best for learning, prototyping, and small-scale production (under 1 million vectors); migrate to Pinecone, Qdrant, or Supabase Vector when production scale requirements grow