Name: Mistral Small 4
Availability: InStock
Author: Mistral AI

Learning Objectives

Understand Mistral Small 4's MoE architecture and efficiency advantages
Compare Mistral Small 4 against other small-to-mid-size open-source models
Evaluate deployment scenarios where Mistral Small 4 is the right choice

What Is Mistral Small 4?

Mistral Small 4 is Mistral AI's efficient Mixture-of-Experts (MoE) model, released on March 16, 2026 under the Apache 2.0 license — the most permissive license in the major model ecosystem.

The architecture uses 128 MoE experts with approximately 6.5 billion parameters active per token out of 119 billion total parameters. This means the model delivers quality comparable to much larger dense models while using only a fraction of the compute per inference.

✅Tip

Access Mistral Small 4: Download from mistral.ai or Hugging Face. Also available through Mistral's La Plateforme API and Le Chat.

Architecture

Specification	Value
Total parameters	119 billion
Active parameters per token	Approximately 6.5 billion
Number of experts	128 (4 active per token)
Context window	256,000 tokens
License	Apache 2.0
Released	March 16, 2026

The Mixture-of-Experts architecture is key to Mistral Small 4's efficiency: instead of activating all 119 billion parameters for every token, the model routes each token through only 4 of its 128 specialized experts (~6.5 billion parameters). This achieves quality close to a dense 119 billion parameter model at the inference cost of a 6.5 billion parameter model.

Mistral Small 4 vs. Other Models

Model	Parameters (Active)	Context	License	Key Strength
Mistral Small 4	6.5 billion (of 119 billion MoE)	256,000	Apache 2.0	Fully open; efficient MoE; long context
Llama 3.3 70 billion	70 billion (dense)	128,000	Meta Community	Most deployed open-weight; proven reliability
Phi-4 14 billion	14 billion (dense)	16,000	MIT	Small and fast; strong reasoning per parameter
Claude Haiku 4.5	Undisclosed	200,000	Closed API	Fastest Claude; sub-200ms; $0.80/$4 per million tokens

Key Advantages

Apache 2.0 License

Mistral Small 4 uses the Apache 2.0 license — the most permissive widely-used open-source license. Unlike Meta's community license (which restricts commercial use above 1 million monthly active users), Apache 2.0 has:

No usage restrictions at any scale
No commercial limitations
Freedom to modify, distribute, and build proprietary products
Full compatibility with enterprise legal requirements

256,000 Token Context Window

The 256,000 token context window (~192,000 words) is among the longest for an open-weight model of this efficiency class, enabling:

Full document analysis without chunking
Long conversation histories
Multi-file code understanding
Research paper processing in a single context

Efficient Inference

At approximately 6.5 billion active parameters per token, Mistral Small 4 can run on:

A single high-end consumer GPU (NVIDIA RTX 4090 or A100)
Moderate cloud instances without premium GPU allocation
Edge deployment scenarios with sufficient hardware

Strengths

Apache 2.0 — most permissive license; no commercial restrictions at any scale
Efficient MoE — 119 billion total but only 6.5 billion active per token; excellent quality-per-compute
256,000 token context — among the longest for open-weight models in this efficiency class
128 experts — high specialization across the expert pool
European AI — built by Mistral AI (Paris); may meet EU data sovereignty preferences
Self-hostable — full control over data and deployment

Limitations and Considerations

Not frontier-class — does not compete with Opus 4.7, GPT-5.5, or Gemini 3.1 Pro on the hardest benchmarks
MoE complexity — Mixture-of-Experts models can be harder to fine-tune and deploy compared to dense models
Memory requirements — while inference is efficient, loading 119 billion total parameters requires significant VRAM
Newer model — released March 2026; community tools and fine-tuned variants are still emerging
Mistral ecosystem — smaller community than Llama or OpenAI ecosystems

Company Details

Detail	Info
Developer	Mistral AI (Paris, France)
Released	March 16, 2026
License	Apache 2.0 (fully open-source)
Architecture	Mixture-of-Experts (128 experts, 4 active per token)
Total parameters	119 billion
Active per token	Approximately 6.5 billion
Context window	256,000 tokens
Website	mistral.ai

Mistral Large 3 — Mistral's flagship model (675 billion MoE)
Devstral — Mistral's coding-focused model
Voxtral TTS — Mistral's open-source text-to-speech model
Llama 3.3 70 billion — Meta's most deployed open-weight production model

Key Takeaways

Mistral Small 4 is an efficient MoE model — 119 billion total parameters with only 6.5 billion active per token across 128 experts, delivering strong quality at low inference cost
Released under Apache 2.0 — the most permissive license available, with no commercial restrictions at any scale
256,000 token context window enables full document analysis and long conversations without chunking
Runs on a single high-end GPU; suitable for self-hosted enterprise deployments with data sovereignty requirements
Not frontier-class — best suited for production applications where efficiency and openness matter more than maximum benchmark scores

Mistral Small 4

Audio & video lessons are paid features

Learning Objectives

What Is Mistral Small 4?

Architecture

Mistral Small 4 vs. Other Models

Key Advantages

Apache 2.0 License

256,000 Token Context Window

Efficient Inference

Strengths

Limitations and Considerations

Company Details

Key Takeaways

Save your progress & take the quiz

Audio & video lessons are paid features

Learning Objectives

What Is Mistral Small 4?

Architecture

Mistral Small 4 vs. Other Models

Key Advantages

Apache 2.0 License

256,000 Token Context Window

Efficient Inference

Strengths

Limitations and Considerations

Company Details

Related Tools

Key Takeaways

Save your progress & take the quiz