Learning Objectives
- Understand what LlamaIndex is and how it differs from LangChain in its RAG focus
- Identify LlamaIndex's core components: data connectors, indexes, query engines, and agents
- Evaluate when LlamaIndex is the right choice for RAG and data-heavy AI applications
What Is LlamaIndex?
LlamaIndex (formerly GPT Index) is an open-source data framework for building LLM-powered applications over custom data, created by Jerry Liu in 2022. Where LangChain is a broad, general-purpose LLM application framework, LlamaIndex has a narrower mission: make it easy to connect LLMs to any data source and retrieve the right context from it.
LlamaIndex provides purpose-built abstractions for the complete retrieval-augmented generation (RAG) pipeline: ingesting data from 150+ sources, chunking and indexing it, retrieving the most relevant context, and synthesizing final answers. It offers more retrieval-focused features and optimizations than LangChain's equivalent components.
✅Tip
Try LlamaIndex: Open source at llamaindex.ai; install with pip install llama-index; managed hosting available through LlamaCloud; MIT license
Core Components
Data Connectors (LlamaHub)
LlamaHub is LlamaIndex's community library of 150+ data connectors:
- Documents: PDF, Word, PowerPoint, CSV, HTML, Markdown, JSON
- Databases: PostgreSQL, MySQL, MongoDB, Supabase, Snowflake
- Cloud services: Google Drive, OneDrive, Notion, Confluence, Slack, GitHub
- APIs: Wikipedia, ArXiv, PubMed, Twitter/X, YouTube
- Web: Firecrawl, Selenium WebDriver, sitemap crawlers
Data connectors standardize diverse data sources into Document objects that LlamaIndex can index.
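To make the idea of standardizing sources concrete, here is a minimal sketch in plain Python. It is not LlamaIndex code: `ToyDocument`, `from_csv`, and `from_json` are hypothetical names invented for illustration, and the JSON field names (`title`, `body`) are assumptions. The point is only that very different inputs end up as the same "text plus metadata" shape.

```python
from dataclasses import dataclass, field
import csv
import io
import json


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a framework document: text plus metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def from_csv(raw: str, source: str) -> list[ToyDocument]:
    """Turn each CSV row into one document, keeping the row number as metadata."""
    rows = csv.DictReader(io.StringIO(raw))
    return [
        ToyDocument(
            text=", ".join(f"{key}: {value}" for key, value in row.items()),
            metadata={"source": source, "row": i},
        )
        for i, row in enumerate(rows)
    ]


def from_json(raw: str, source: str) -> list[ToyDocument]:
    """Turn each object in a JSON array into one document (assumes title/body fields)."""
    return [
        ToyDocument(text=item["body"], metadata={"source": source, "title": item["title"]})
        for item in json.loads(raw)
    ]


csv_raw = "name,role\nAda,engineer\nLin,analyst\n"
json_raw = '[{"title": "Q1 report", "body": "Revenue grew."}]'

# Two unrelated formats become one uniform list that an index could consume.
docs = from_csv(csv_raw, "team.csv") + from_json(json_raw, "reports.json")
for doc in docs:
    print(doc.metadata["source"], "->", doc.text)
```

Keeping the source and position in metadata is what later makes metadata filtering and source attribution possible.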
Indexes — Intelligent Data Organization
LlamaIndex provides multiple indexing strategies for different access patterns:
- VectorStoreIndex: Embed documents as vectors; retrieve by semantic similarity — the most common choice for RAG
- SummaryIndex: Stores chunks as a simple sequence and reads across all of them at query time, which suits summarizing a full document set rather than retrieving a few chunks
- TreeIndex: Hierarchical index for large documents with multiple levels of summarization
- KeywordTableIndex: Build a keyword-to-document mapping for keyword-based retrieval
The choice of index significantly affects retrieval quality, and LlamaIndex offers more index types to choose from than LangChain's vector-store-centered approach.
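As a rough illustration of why the index type matters, the sketch below builds a toy keyword-to-document table in plain Python. It is not how KeywordTableIndex is implemented; the function names, stopword list, and sample documents are all made up for this example. It shows the access pattern: exact term lookup is cheap and precise, but it only finds documents that share literal words with the query, which is the gap that embedding-based indexes address.

```python
from collections import defaultdict
import re

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def extract_keywords(text: str) -> set[str]:
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word for word in words if word not in STOPWORDS}


def build_keyword_table(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each keyword to the set of document ids that contain it."""
    table: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for keyword in extract_keywords(text):
            table[keyword].add(doc_id)
    return dict(table)


def keyword_lookup(table: dict[str, set[str]], query: str) -> list[str]:
    """Rank document ids by how many query keywords they contain."""
    hits: dict[str, int] = defaultdict(int)
    for keyword in extract_keywords(query):
        for doc_id in table.get(keyword, ()):
            hits[doc_id] += 1
    # Most matching keywords first; ties broken by id for determinism.
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


docs = {
    "d1": "Invoice processing for the finance team",
    "d2": "Onboarding guide for new engineers",
    "d3": "Finance policy and invoice approval",
}
table = build_keyword_table(docs)
print(keyword_lookup(table, "invoice approval"))
```

A query such as "billing sign-off" would return nothing here even though d3 is relevant, which is the kind of case where a vector index tends to do better.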
Query Engines
Query engines translate user questions into retrieved context and synthesized answers:
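from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# Load every supported file under ./docs into Document objects
documents = SimpleDirectoryReader("./docs").load_data()
# Chunk, embed, and index the documents (the default setup uses OpenAI models, so OPENAI_API_KEY must be set)
index = VectorStoreIndex.from_documents(documents)
# Wrap the index in a query engine that retrieves context and asks the LLM to answer
query_engine = index.as_query_engine()
response = query_engine.query("What are the key findings from the research reports?")
print(response)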
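Conceptually, a query engine does two things: it retrieves the chunks most similar to the question, then it hands them to the LLM inside a prompt. The sketch below imitates both stages in plain Python so the flow is visible. It is a toy under stated assumptions: `embed` is a bag-of-words count rather than a neural embedding, the prompt wording is invented, and no LLM is actually called.

```python
from collections import Counter
import math
import re


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term count (real systems use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(chunks: list[str], question: str, top_k: int = 2) -> list[str]:
    """Stage 1: return the top_k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(embed(chunk), query_vector), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Stage 2: assemble the prompt that would be sent to the LLM for synthesis."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer using only the context."


chunks = [
    "The trial reported a 12 percent drop in readmissions.",
    "Office hours are Monday through Friday.",
    "Readmissions fell most among patients over 65.",
]
question = "What happened to readmissions?"
top = retrieve(chunks, question)
prompt = build_prompt(question, top)
print(prompt)
```

The unrelated office-hours chunk never reaches the prompt, which is the main job of the retrieval stage: keep the LLM's context small and on topic.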
Advanced Retrieval Techniques
LlamaIndex's retrieval features go beyond basic vector search:
- Sub-question query engine: Decompose complex questions into sub-questions, retrieve separately, then synthesize
- Recursive retrieval: Start with a high-level summary; drill down into details as needed
- Metadata filtering: Filter retrieved documents by date, source, author, or any custom metadata
- Hybrid search: Combine vector similarity and keyword search for better recall
- Re-ranking: Apply a cross-encoder model to re-rank retrieved chunks for better precision
- Sentence window retrieval: Match on individual sentences, then pass the surrounding window of sentences to the LLM as context
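Two of the techniques above, hybrid search and metadata filtering, can be sketched in a few lines of plain Python. The example merges a vector ranking and a keyword ranking with reciprocal rank fusion (RRF), which is one common fusion method and is used here as an assumption, not as a statement of what LlamaIndex does internally. The document ids, rankings, metadata, and function names are all hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def filter_by_metadata(doc_ids: list[str], metadata: dict[str, dict], **required) -> list[str]:
    """Keep only documents whose metadata matches every required key/value pair."""
    return [
        doc_id
        for doc_id in doc_ids
        if all(metadata[doc_id].get(key) == value for key, value in required.items())
    ]


# Hypothetical outputs of a vector retriever and a keyword retriever.
vector_ranking = ["d2", "d1", "d4"]
keyword_ranking = ["d1", "d3", "d2"]

metadata = {
    "d1": {"year": 2024, "source": "wiki"},
    "d2": {"year": 2023, "source": "wiki"},
    "d3": {"year": 2024, "source": "slack"},
    "d4": {"year": 2024, "source": "wiki"},
}

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
recent = filter_by_metadata(fused, metadata, year=2024)
print(fused)
print(recent)
```

Documents that appear in both rankings (d1 and d2) rise to the top, which is how hybrid search improves recall without discarding either signal. The filter is applied after fusion here for simplicity; production systems usually push filters into the retrieval step itself.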
💡Key Concept
Why LlamaIndex vs. LangChain for RAG: Both frameworks support RAG, but LlamaIndex was designed specifically for this use case. LlamaIndex provides more retrieval optimization options (sub-question decomposition, recursive retrieval, hybrid search, re-ranking), better metadata handling, and a larger selection of indexes. If your application is primarily about querying your own data, LlamaIndex's more specialized focus often produces better retrieval quality with less custom engineering.
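To show what sub-question decomposition buys you, here is a toy version in plain Python. In a real system an LLM writes the sub-questions; in this sketch `decompose` is a hard-coded regular expression that only handles "Compare X of A and B" questions, and `retrieve_best` is a simple word-overlap matcher. Both functions and the sample corpus are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def decompose(question: str) -> list[str]:
    """Stand-in for an LLM call: split 'Compare X of A and B' into one question per entity."""
    match = re.match(r"Compare (.+) of (.+) and (.+)", question)
    if not match:
        return [question]
    topic, first, second = match[1], match[2], match[3]
    return [f"What is the {topic} of {first}?", f"What is the {topic} of {second}?"]


def retrieve_best(corpus: dict[str, str], question: str) -> str:
    """Return the id of the passage sharing the most words with the question."""
    question_tokens = tokens(question)
    return max(sorted(corpus), key=lambda doc_id: len(tokens(corpus[doc_id]) & question_tokens))


corpus = {
    "acme_10k": "Acme revenue growth was 8 percent in 2023.",
    "globex_10k": "Globex revenue growth was 3 percent in 2023.",
    "hr_memo": "Holiday schedule for 2023.",
}

question = "Compare revenue growth of Acme and Globex"
sub_questions = decompose(question)
# Each sub-question is retrieved separately, so each entity gets its own best source.
plan = {sub_question: retrieve_best(corpus, sub_question) for sub_question in sub_questions}
for sub_question, doc_id in plan.items():
    print(sub_question, "->", doc_id)
```

A single retrieval for the original question would have to rank both filings against one mixed query; splitting first lets each sub-question target the one document that answers it, and a final synthesis step would then combine the two answers.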
Agents and Workflows
LlamaIndex has extended into agentic workflows:
- ReAct Agents: Tool-using agents with web search, API calls, code execution
- OpenAI Function Calling Agents: Structured tool use via function calling
- LlamaAgents: Distributed multi-agent framework for production workflows
- Workflows (new): Event-driven workflow system for complex multi-step pipelines with loops, branches, and human-in-the-loop steps
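The event-driven idea behind Workflows can be illustrated without the library. In the sketch below, each step receives an event and returns the next event, and a small dispatcher routes events by name until a stop event appears. This is a simplified model written for this lesson, not the LlamaIndex Workflows API; `Event`, `run_workflow`, `draft`, and `review` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    """A named message passed between steps, with an arbitrary payload."""
    name: str
    payload: dict


def run_workflow(
    steps: dict[str, Callable[[Event], Event]],
    start: Event,
    max_events: int = 20,
) -> Event:
    """Dispatch each event to the step registered under its name until a 'stop' event."""
    event = start
    for _ in range(max_events):
        if event.name == "stop":
            return event
        event = steps[event.name](event)
    raise RuntimeError("workflow did not stop within max_events")


def draft(event: Event) -> Event:
    """Produce a new draft and send it to review."""
    attempt = event.payload.get("attempt", 0) + 1
    return Event("review", {"attempt": attempt, "text": f"draft v{attempt}"})


def review(event: Event) -> Event:
    """Branch: accept on the second attempt, otherwise loop back to drafting."""
    if event.payload["attempt"] >= 2:
        return Event("stop", {"result": event.payload["text"]})
    return Event("draft", event.payload)


result = run_workflow({"draft": draft, "review": review}, Event("draft", {}))
print(result.payload["result"])
```

Because steps communicate only through events, a loop is just a step that emits an earlier event name, and a human-in-the-loop step is one that waits for a person's decision before emitting the next event. The `max_events` guard is a design choice in this sketch to keep an accidental infinite loop from running forever.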
LlamaCloud
LlamaIndex's managed cloud platform:
- LlamaParse: Highly accurate PDF parsing — better than standard PDF parsers for complex layouts, tables, and charts
- Managed indexes: Hosted vector indexes without self-managed infrastructure
- Data pipelines: Scheduled ingestion and index maintenance
Pricing
LlamaIndex is free and open source (MIT license). LlamaCloud pricing:
| Pages | Managed indexes | Best for |
|---|---|---|
| 1,000 pages/month | Limited | Evaluation |
| 3,000 pages | 1 managed index | Individual developers |
| 15,000 pages | 5 managed indexes | Production applications |
| Unlimited | | Large organizations |
Strengths
- Best-in-class RAG: Most retrieval optimization options; deepest focus on data ingestion quality
- LlamaParse: Significantly better PDF parsing than generic approaches — handles complex tables, multi-column layouts, charts
- 150+ data connectors: Among the broadest data source coverage of any RAG framework
- Advanced retrieval techniques: Sub-question decomposition, recursive retrieval, hybrid search, re-ranking
- Event-driven workflows: New Workflows system is well-designed for complex agent orchestration
- Production use cases: Well-documented patterns for enterprise RAG deployments
Limitations & Considerations
- Narrower than LangChain: Less suitable as a general-purpose LLM framework outside of data-heavy use cases
- Smaller ecosystem than LangChain: Fewer community contributions and third-party integrations
- Documentation fragmentation: Rapid development means some features are better documented than others
- LlamaCloud cost: LlamaParse and managed indexes have their own pricing on top of LLM costs
Best Use Cases
| Task | Why LlamaIndex |
|---|---|
| High-quality RAG over documents | Best retrieval optimization and indexing options |
| Complex PDF processing | LlamaParse handles tables, charts, and multi-column layouts |
| Enterprise knowledge base search | Metadata filtering; hybrid search; production deployment |
| Multi-source data retrieval | 150+ connectors; unified query interface |
| Research paper analysis | Academic database connectors + recursive retrieval |
| Sub-question complex queries | Sub-question query engine decomposes and synthesizes |
When to choose alternatives:
- General-purpose LLM applications (beyond RAG) → LangChain
- Multi-agent collaboration → CrewAI or AG2
- No-code agent building → Relevance AI
- Simple RAG prototype → LangChain or direct vector store SDK
Getting Started
pip install llama-index llama-index-llms-openai llama-index-embeddings-openai
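import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# Placeholder key for illustration; in practice, set OPENAI_API_KEY in your shell environment instead of hardcoding it
os.environ["OPENAI_API_KEY"] = "your-key"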
# Load documents from a folder
documents = SimpleDirectoryReader("./documents").load_data()
# Build index
index = VectorStoreIndex.from_documents(documents)
# Query
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the main findings across all documents")
print(response)
✅Tip
LlamaParse for complex PDFs: If your RAG application processes PDF documents with tables, figures, or complex multi-column layouts, try LlamaParse before defaulting to PyPDF2 or pdfminer. LlamaParse uses a vision model to understand the document structure and extracts content significantly more accurately — especially for financial reports, research papers, and technical documentation. The free tier (1,000 pages/month) is enough to evaluate the quality improvement for your document type.
Key Takeaways
- LlamaIndex is a specialized RAG framework with the most comprehensive retrieval optimization options — sub-question decomposition, recursive retrieval, hybrid search, re-ranking
- LlamaParse provides highly accurate PDF parsing for complex documents with tables and figures
- 150+ data connectors cover most data sources a RAG application is likely to need
- More focused than LangChain for data-intensive RAG applications; LangChain is broader for general LLM apps
- Free and open source; LlamaCloud managed services (LlamaParse, hosted indexes) have their own pricing