Learning Objectives
- Understand what Sarvam-105B is and why sovereign AI models matter
- Evaluate the model's multilingual capabilities across India's 22 official languages
- Assess the role of government support in building nation-specific AI infrastructure
What Is Sarvam-105B?
Sarvam-105B is India's first fully domestically trained 100+ billion parameter open-source language model. Released in beta on February 20, 2026, at the AI Impact Summit in New Delhi, it supports all 22 official Indian languages — from Hindi and Bengali to Tamil, Telugu, Kannada, and Malayalam.
Built by Sarvam AI (a Bengaluru-based startup founded by former AI4Bharat researchers at IIT Madras), the model was trained on 4,086 NVIDIA H100 GPUs provided through India's government-backed IndiaAI Mission — making it a prime example of how government investment can bootstrap a nation's AI capabilities.
💡Key Concept
Sovereign AI: AI models and infrastructure developed within a country's borders, using domestic compute resources, trained on local languages and cultural data. Sovereign AI ensures a nation is not entirely dependent on American or Chinese AI providers for critical language understanding. India, France, Japan, South Korea, and the UAE are all investing in sovereign AI models.
Architecture and Specifications
| Spec | Detail |
|---|---|
| Total Parameters | 105 billion (Mixture of Experts) |
| Active Parameters | ~9-10 billion per token (MoE efficiency) |
| Context Window | 128,000 tokens |
| Training Data | 12 trillion tokens (code, web, specialized knowledge, math, multilingual) |
| Languages | All 22 official Indian languages |
| License | Apache 2.0 (fully open source) |
| Companion Model | Sarvam-30B (released simultaneously) |
| Availability | Hugging Face (sarvamai/sarvam-105b) and India's AIKosh platform |
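The MoE architecture means Sarvam-105B activates only about 9-10 billion of its 105 billion parameters for each token, so per-token compute is closer to that of a much smaller dense model. All 105 billion parameters still have to be held in memory, so the saving is in compute rather than in memory footprint.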
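To make the idea of sparse activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration under stated assumptions, not Sarvam AI's implementation: the expert count (`NUM_EXPERTS`), the number of experts chosen per token (`TOP_K`), and the toy scalar "experts" are all hypothetical, since the source does not state the model's routing configuration. Only the 105 billion and ~10 billion parameter figures come from the spec table.

```python
import math
import random

TOTAL_PARAMS_B = 105.0   # from the spec table
ACTIVE_PARAMS_B = 10.0   # upper end of the "~9-10 billion" figure in the spec table

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_logits, k):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    out = 0.0
    for idx, weight in route_top_k(gate_logits, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy setup: 8 scalar "experts", 2 of which run per token.
NUM_EXPERTS = 8
TOP_K = 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

rng = random.Random(0)  # stand-in for a learned gating network
gate_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_top_k(gate_logits, TOP_K)
y = moe_forward(1.0, experts, gate_logits, TOP_K)

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"experts run for this token: {[i for i, _ in selected]}")
print(f"active share of parameters: {active_fraction:.1%}")
```

In a real MoE transformer the gate is a learned layer and routing happens independently in each MoE block, so different tokens can use different experts. With the figures from the table, roughly one parameter in ten participates in any single token's forward pass.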
Why It Matters
India has 1.4 billion people speaking 22 official languages (and hundreds of dialects). Until Sarvam-105B, no AI model adequately covered this linguistic diversity:
- GPT-5.5 and Claude — optimized primarily for English; limited Indian language support
- Llama and Mistral — open-source but trained predominantly on English and European language data
- Sarvam-105B — trained with substantial allocation to the 10 most-spoken Indian languages, with coverage of all 22 scheduled languages
This makes Sarvam-105B essential for Indian government services, education, healthcare, and commerce — where citizens interact in their native languages, not English.
Government Support
Sarvam AI was selected as one of 12 organizations under India's IndiaAI Mission:
- 4,086 NVIDIA H100 GPUs provided for a 6-month training period through public-private partnership
- 246.72 crore rupees (~$29 million) in government compute and financial support
- Open-source weights published on India's national AIKosh platform for other organizations to build upon
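The figures above can be sanity-checked with a few lines of arithmetic. The sketch below converts the rupee amount to dollars and computes a ceiling on available GPU-hours. Two inputs are assumptions rather than facts from the source: the exchange rate (`INR_PER_USD`, set to an illustrative 85) and the average month length (`HOURS_PER_MONTH`). The GPU-hour figure is an upper bound that assumes every GPU is in use for the entire period.

```python
CRORE = 10_000_000            # 1 crore = 10^7

grant_crore = 246.72          # from the text
gpus = 4_086                  # from the text
months = 6                    # from the text

INR_PER_USD = 85.0            # assumption: illustrative exchange rate, not from the source
HOURS_PER_MONTH = 730         # assumption: 8,760 hours per year / 12 months

grant_inr = grant_crore * CRORE
grant_usd_millions = grant_inr / INR_PER_USD / 1e6

# Upper bound: every GPU busy for the whole 6-month allocation.
max_gpu_hours = gpus * months * HOURS_PER_MONTH

print(f"support: {grant_inr:,.0f} INR, about ${grant_usd_millions:.1f} million")
print(f"GPU-hour ceiling: {max_gpu_hours:,}")
```

At the assumed rate the conversion comes out at roughly $29 million, consistent with the figure quoted above, and the allocation caps out at about 17.9 million H100 GPU-hours.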
Company Details
| Detail | Info |
|---|---|
| Company | Sarvam AI |
| Founded | August 2023 |
| CEO | Vivek Raghavan (co-founder; previously AI4Bharat at IIT Madras) |
| Headquarters | Bengaluru, Karnataka, India |
| Funding | ~$41 million (Seed + Series A; Lightspeed, Peak XV, Khosla Ventures) |
| Pending Round | $250 million from NVIDIA, Accel, and HCLTech at $1.5 billion valuation (reported March 2026; not yet confirmed) |
| Government Support | IndiaAI Mission — 4,086 H100 GPUs + ~$29 million in support |
| Website | sarvam.ai |
Strengths
- Coverage of all 22 official Indian languages: a breadth that, per the comparison above, general-purpose Western and Chinese models do not match
- Open source (Apache 2.0): freely available for Indian startups, government agencies, and enterprises to build upon
- MoE efficiency: 105 billion total parameters with only about 9-10 billion active per token, which keeps inference compute costs down
- Government backing: IndiaAI Mission support signals public-sector confidence in the model
- Potential unicorn: the reported round with NVIDIA, Accel, and HCLTech would value Sarvam AI at $1.5 billion, above the $1 billion unicorn threshold
Limitations and Considerations
- Beta status: released in beta in February 2026; the timeline for a production release is unclear
- Funding still pending: the $250 million round at a $1.5 billion valuation has been reported but not confirmed as closed
- Benchmark comparisons limited: Indian-language benchmarks are less standardized than English ones, so direct comparison with frontier models is difficult
- Limited resources: about $41 million raised before the pending round, a small budget next to well-funded competitors
- Temporary government GPU access: the 4,086 H100s were provided for a 6-month period, and the long-term compute strategy is unclear
Key Takeaways
- Sarvam-105B is India's first sovereign 100+ billion parameter AI model — open-source, supporting all 22 official Indian languages, trained on government-provided NVIDIA GPUs
- MoE architecture (105 billion total parameters, roughly 9-10 billion active per token) keeps per-token inference compute close to that of a much smaller dense model
- Critical for serving India's 1.4 billion people in their native languages across government, education, healthcare, and commerce
- Reportedly in talks to raise $250 million from NVIDIA, Accel, and HCLTech at a $1.5 billion valuation, which would make Sarvam AI a unicorn; the round is not yet confirmed