Name: Chroma
Author: Chroma

Learning Objectives

Understand what Chroma is and why it's a widely used choice for local AI development and prototyping
Identify Chroma's core modes: in-memory (ephemeral), persistent (local file), and client-server
Evaluate when Chroma is the right tool vs. production-grade managed services like Pinecone

What Is Chroma?

Chroma is an open-source vector database built for AI application development. Unlike managed cloud services like Pinecone or Supabase Vector, Chroma is designed to run wherever your code runs — in memory for quick experiments, persisted to a local file for development projects, or as a server for team deployments. Its Python-first API is deliberately minimal: create a collection, add documents, query for similar results — in under 10 lines of code.

Chroma's defining characteristic is its focus on the development experience. It has built-in embedding functions (so you don't need to call a separate embedding API for prototyping), a small API that hides infrastructure complexity, and no required configuration to get started. These traits make it one of the most popular vector databases for learning, prototyping, and building the first version of an AI feature.

✅Tip

Try Chroma: Install with pip install chromadb. No account required, no cloud service, no API key for basic use. Full documentation at trychroma.com. Apache 2.0 license.

Core Modes

In-Memory (Default)

The simplest mode — no persistence, data is lost when the process ends:

import chromadb

client = chromadb.Client()  # ephemeral, in-memory
collection = client.create_collection("my_docs")

collection.add(
    documents=["AI is transforming software", "Vector databases store embeddings"],
    ids=["doc1", "doc2"]
)

results = collection.query(query_texts=["how is AI changing software?"], n_results=2)
print(results["documents"])

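No embedding API call is needed: Chroma falls back to a built-in default embedding function, the all-MiniLM-L6-v2 sentence-transformers model run locally through ONNX Runtime. The model files are downloaded on first use, so the first call needs network access; after that, basic prototyping requires no external service.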
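
Conceptually, a query embeds the text and then ranks the stored vectors by how close they are to the query vector. The sketch below shows that ranking step in plain Python. It is an illustration only, not Chroma's implementation: the three-dimensional vectors are made up, the helper names are hypothetical, and cosine similarity is used for readability (the metric Chroma applies depends on the collection's configuration).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        stored,
        key=lambda doc_id: cosine_similarity(query_vector, stored[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy "embeddings"; real models produce hundreds of dimensions.
stored_vectors = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.8, 0.3],
    "doc3": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

print(top_k(query, stored_vectors, k=2))
```

A brute-force scan like this is fine for a handful of vectors; a vector database exists to make the same lookup fast over much larger collections.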

Persistent (Local File)

Data is persisted to a local directory and survives process restarts:

client = chromadb.PersistentClient(path="./chroma_storage")

One parameter change from in-memory mode. Ideal for development projects where you want to build the index once and query repeatedly without rebuilding.

Client-Server Mode

For team environments or when you need to share the vector store:

# Start the server: chroma run --host 0.0.0.0 --port 8000

# Connect from your application
client = chromadb.HttpClient(host="localhost", port=8000)

The server can also be run from the official Docker image, which makes it easy to share with a team or deploy to a staging environment.

Built-In Embedding Functions

Chroma ships with adapters for major embedding providers — no separate embedding code required:

from chromadb.utils import embedding_functions

# Use OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="YOUR_OPENAI_KEY",
    model_name="text-embedding-3-small"
)

# Use Cohere embeddings
cohere_ef = embedding_functions.CohereEmbeddingFunction(api_key="YOUR_COHERE_KEY")

# Use a local sentence-transformers model (no API key needed)
st_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.create_collection("my_docs", embedding_function=openai_ef)

Chroma handles embedding generation automatically when you add and query documents — no separate embedding pipeline required for prototyping.

Documents, Embeddings, and Metadata

Chroma's data model has four fields per record. Only ids is required, and each record needs either a document or a pre-computed embedding:

Field	Type	Description
`documents`	string	The text content (Chroma embeds it for you)
`embeddings`	list[float]	Pre-computed embedding (if you manage embedding yourself)
`metadatas`	dict	Arbitrary key-value metadata for filtering
`ids`	string	Unique identifier (required)

collection.add(
    documents=["Machine learning enables pattern recognition", "LLMs are trained on text"],
    metadatas=[{"source": "textbook", "chapter": 1}, {"source": "blog", "date": "2026-01"}],
    ids=["chunk-1", "chunk-2"]
)

# Query with metadata filter
results = collection.query(
    query_texts=["how does machine learning work?"],
    n_results=3,
    where={"source": "textbook"}  # only return textbook sources
)

💡Key Concept

When Chroma's simplicity is a feature, not a limitation: Chroma deliberately abstracts away index configuration, distance metric tuning, and cluster management. For the 80% of AI projects that need "add documents, search for similar documents" without high-throughput requirements, this abstraction is a feature — less code, faster iteration, easier onboarding. The limitation appears when you need the performance tuning, filtering complexity, or scale that managed vector databases provide.

LangChain Integration

Chroma is one of the most popular vector stores in LangChain tutorials:

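from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Placeholder input; in practice raw_documents comes from a document loader
raw_documents = [Document(page_content="Chroma is an open-source vector database for AI apps.")]

# Load and split documents (chunk sizes are illustrative, not recommendations)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(raw_documents)

# Create Chroma vector store from documents (embeds and stores automatically)
# OpenAIEmbeddings reads the OPENAI_API_KEY environment variable
vectorstore = Chroma.from_documents(docs, OpenAIEmbeddings(), persist_directory="./chroma")

# Use as retriever in a RAG chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})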

Many LangChain RAG tutorials use Chroma as their vector store, which makes it a common starting point and means community examples are easy to find.

Pricing

Chroma is completely free and open source (Apache 2.0):

Self-hosted (local or server): Free forever — run in-process, locally, or on any server
Chroma Cloud: Managed cloud offering; availability and pricing change over time, so check trychroma.com for the current status
No usage limits, no vector count limits, no API keys required for local use

Strengths

Zero friction setup: pip install chromadb and you're ready — no account, no cloud, no configuration
Built-in embeddings: Ships with adapters for OpenAI, Cohere, and local sentence-transformers
All modes in one library: In-memory, persistent, and client-server; switching between them only changes the client constructor, not the rest of your application code
Python-first API: Minimal, clean, and Pythonic — smallest possible API surface for the core use case
LangChain default: Used in many LangChain tutorials; vast community resources
Open source (Apache 2.0): No vendor lock-in; run anywhere, audit the code, modify as needed

Limitations & Considerations

Not production-grade at scale: Chroma builds an HNSW index on a single node, so performance degrades at large scale compared with distributed, dedicated vector databases; not recommended for more than a few million vectors in production
Limited filtering: Metadata filtering is less powerful than Pinecone's or pgvector's query expressions
Single-node: The open-source server is single-node; no built-in clustering or high availability
Managed cloud status: Check the current state of Chroma Cloud before relying on it; without a managed offering, production deployments require self-managing the server
Limited multi-tenancy: Less mature multi-tenant support compared to Pinecone's namespaces or pgvector's RLS

Best Use Cases

Task	Why Chroma
Learning RAG and vector search	Fastest path from zero to working vector search
Prototyping AI features	Build and iterate on vector search without infrastructure overhead
Local development environments	In-process or persistent local mode; no cloud service needed
Small-scale production RAG	Works well for applications with under 1 million vectors and moderate query volume
LangChain and LlamaIndex tutorials	The de facto standard vector store in AI framework documentation

When to choose alternatives:

Production at scale (1 million+ vectors, high QPS) → Pinecone or Qdrant
Already using Supabase or PostgreSQL → Supabase Vector
Already using MongoDB Atlas → MongoDB Atlas Vector Search
Need self-hosted production-grade performance → Qdrant or Weaviate
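
The guidance above can be condensed into a small decision helper. This is only a sketch of the rules as stated in this section: the function name, its parameters, and the stack labels are hypothetical, and the 1 million threshold is the rule of thumb quoted above rather than a hard limit.

```python
def suggest_vector_store(vector_count, high_qps=False, existing_stack=None, self_hosted=False):
    """Map the rules of thumb from this section to a suggestion string."""
    # An existing database platform takes priority over scale considerations.
    if existing_stack in ("supabase", "postgresql"):
        return "Supabase Vector"
    if existing_stack == "mongodb-atlas":
        return "MongoDB Atlas Vector Search"

    # Large collections or heavy query traffic call for a production-grade store.
    if vector_count >= 1_000_000 or high_qps:
        return "Qdrant or Weaviate" if self_hosted else "Pinecone or Qdrant"

    # Everything else: learning, prototyping, small-scale production.
    return "Chroma"


print(suggest_vector_store(50_000))
print(suggest_vector_store(5_000_000))
print(suggest_vector_store(5_000_000, self_hosted=True))
```

Real decisions involve more factors (budget, latency targets, operational experience), so treat the output as a starting point for discussion.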

Getting Started

pip install chromadb openai

import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

# Setup
client = chromadb.PersistentClient(path="./my_chroma_db")
ef = OpenAIEmbeddingFunction(api_key="YOUR_OPENAI_KEY", model_name="text-embedding-3-small")
collection = client.get_or_create_collection("knowledge_base", embedding_function=ef)

# Add documents (Chroma embeds them automatically)
collection.add(
    documents=[
        "Retrieval-Augmented Generation combines search with LLMs",
        "Chroma is an open-source vector database for AI apps",
        "Embeddings convert text to numerical vectors"
    ],
    ids=["rag-1", "chroma-1", "embeddings-1"]
)

# Query (Chroma embeds the query and returns similar docs)
results = collection.query(
    query_texts=["How do vector databases work?"],
    n_results=2
)
print(results["documents"][0])
# Prints the two closest documents; the exact ranking depends on the embedding model

That's a complete working vector search system in under 20 lines. Switch PersistentClient to Client() for in-memory mode.