6 min read·Updated June 25, 2026

Mistral OCR 4 is a document-intelligence model that extracts structured text, tables, and page layout from scanned files across 170 languages, and it can run in a single container on your own servers. In Mistral's evaluation, independent annotators preferred it to every leading OCR system tested, and it leads the OlmOCRBench benchmark. API pricing is $4 per 1,000 pages.

Learning Objectives

  • Understand what document-intelligence OCR does and why it matters for AI pipelines
  • Explain how Mistral OCR 4 differs from traditional optical character recognition
  • Evaluate when a self-hostable, structure-aware OCR model is the right choice

What Is Mistral OCR 4?

Mistral OCR 4 is a document-intelligence model from French AI lab Mistral. Where traditional optical character recognition (OCR) just turns an image of text into a flat string of characters, Mistral OCR 4 reads a scanned or photographed document and returns its structure — the text, but also the tables, headings, reading order, and the position of each block on the page, with a confidence score attached to each piece.

That structure is the point. Most enterprise AI projects begin with a pile of documents — contracts, invoices, lab reports, forms, scanned archives — that need to become clean, machine-readable data before a model can do anything with them. Mistral OCR 4 is built to be the ingestion layer for those pipelines, turning messy real-world documents into structured output that a retrieval system or an AI agent can use directly.
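
To make the ingestion-layer idea concrete, here is a minimal sketch of how structured output might be grouped into retrieval-ready chunks. The `Block` shape (`kind`, `text`, `order`) and the `chunk_by_heading` helper are hypothetical stand-ins for illustration, not Mistral's actual response schema or API.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical extracted block; NOT Mistral's real response schema."""
    kind: str   # assumed labels, e.g. "heading", "paragraph", "table"
    text: str
    order: int  # assumed position of the block in reading order


def chunk_by_heading(blocks):
    """Group blocks into chunks, starting a new chunk at each heading."""
    chunks = []
    current = {"title": None, "body": []}
    # Sort by reading order so the chunks follow the document's flow.
    for block in sorted(blocks, key=lambda b: b.order):
        if block.kind == "heading":
            # Close the previous chunk if it holds anything.
            if current["title"] is not None or current["body"]:
                chunks.append(current)
            current = {"title": block.text, "body": []}
        else:
            current["body"].append(block.text)
    if current["title"] is not None or current["body"]:
        chunks.append(current)
    return chunks


# Invented sample data, deliberately out of order.
sample_blocks = [
    Block("paragraph", "Payment is due within 30 days.", 2),
    Block("heading", "Terms", 1),
    Block("heading", "Totals", 3),
    Block("table", "Item | Amount\nWidget | 40.00", 4),
]
chunks = chunk_by_heading(sample_blocks)
for chunk in chunks:
    print(chunk["title"], "->", len(chunk["body"]), "block(s)")
```

Each chunk keeps its heading as a title, which is the kind of context a retrieval system can index alongside the body text. Flat OCR output would not carry the heading and reading-order information this grouping relies on.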

💡 Key Concept

OCR versus document intelligence. Plain OCR answers "what characters are in this image?" Document intelligence answers "what is this document, and how is it organized?" — preserving tables as tables, keeping headings and reading order, and tagging where each element sits on the page. That difference is what makes the output usable for search and retrieval rather than a wall of undifferentiated text.

What Makes It Different

Three choices set Mistral OCR 4 apart from the OCR built into most cloud platforms.

It is structure-aware. The model returns bounding boxes, block classification (is this a heading, a table cell, a caption?), and inline confidence scores alongside the extracted text — so downstream systems can trust, route, or flag content based on how certain the model is.
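
As a rough illustration of the "trust, route, or flag" idea, the sketch below sorts extracted blocks into three queues by confidence. Everything here is an assumption made for the example: the `ExtractedBlock` field names, the 0.0 to 1.0 confidence scale, and both thresholds. None of it is taken from Mistral's documented output format.

```python
from dataclasses import dataclass


@dataclass
class ExtractedBlock:
    """Hypothetical block shape; field names are illustrative assumptions."""
    kind: str          # e.g. "heading", "table_cell", "caption"
    text: str
    bbox: tuple        # assumed (x0, y0, x1, y1) page coordinates
    confidence: float  # assumed to range from 0.0 to 1.0


# Assumed cutoffs; a real pipeline would tune these on its own documents.
ACCEPT_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60


def route(block):
    """Return the queue a block should go to, based on its confidence."""
    if block.confidence >= ACCEPT_THRESHOLD:
        return "accept"
    if block.confidence >= REVIEW_THRESHOLD:
        return "review"
    return "reject"


# Invented sample blocks.
blocks = [
    ExtractedBlock("heading", "Invoice 1042", (40, 30, 300, 60), 0.98),
    ExtractedBlock("table_cell", "1,2S0.00", (420, 210, 500, 230), 0.71),
    ExtractedBlock("caption", "~~~", (40, 700, 200, 720), 0.22),
]
routed = {"accept": [], "review": [], "reject": []}
for block in blocks:
    routed[route(block)].append(block.text)

print(routed)
```

A three-way split is only one possible design. A stricter pipeline might send every table cell below the accept threshold to human review, since a misread digit in an amount is more costly than a misread word in a caption.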

It is multilingual at scale. It supports 170 languages across ten language groups in a single model, rather than needing a different engine per script — useful for global enterprises and for archives that mix languages on one page.

It is self-hostable. The whole model runs in a single container on a company's own infrastructure, so sensitive documents never have to leave the building. For regulated industries — healthcare, legal, finance, government — that data-residency story is often the deciding factor.

Tip

Visit Mistral OCR 4: mistral.ai/news/ocr-4. It is available through the Mistral API and as Document AI inside Mistral Studio for no-code processing, and is also offered through Amazon SageMaker and Microsoft Foundry.

Performance

In Mistral's evaluation, independent annotators preferred OCR 4 to every leading OCR and document-AI system tested, with win rates averaging 72 percent. It also posts the top overall score on OlmOCRBench, a public benchmark for document extraction, at 85.2. Mistral additionally reports a meaningful speed advantage over competing systems on the same workloads.

Benchmarks are not the whole story for OCR — real documents are messier than test sets — but a consistent preference across independent raters, plus the leading public-benchmark score, is a strong signal that the model handles difficult layouts well.

Pricing

API: $4 per 1,000 pages
  • Full structure-aware extraction
  • 170 languages
  • Bounding boxes + confidence scores
Batch API: $2 per 1,000 pages
  • Same model, 50 percent discount
  • Best for large archives
  • Asynchronous processing
Self-Hosted: Enterprise pricing
  • Single-container deployment
  • Runs on your own servers
  • Data never leaves your infrastructure

At $4 per 1,000 pages through the API — halved to $2 per 1,000 pages with the Batch API for large jobs — Mistral OCR 4 is priced to make whole-archive digitization economical, not just one-off documents. The self-hosted option is the path for organizations with strict data-residency requirements.
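
The arithmetic for a whole-archive job can be sketched as follows. The two rates come from the pricing above. The assumption that billing is pro-rated per page, rather than rounded up to whole blocks of 1,000, is ours, and the 250,000-page archive is an invented example.

```python
from decimal import Decimal

# Rates quoted above, in US dollars per 1,000 pages.
API_RATE = Decimal("4.00")
BATCH_RATE = Decimal("2.00")


def estimate_cost(pages, batch=False):
    """Estimate the cost in dollars, assuming pro-rated per-page billing."""
    rate = BATCH_RATE if batch else API_RATE
    return (Decimal(pages) / Decimal(1000)) * rate


archive_pages = 250_000  # hypothetical archive size
print(f"API:   ${estimate_cost(archive_pages):,.2f}")
print(f"Batch: ${estimate_cost(archive_pages, batch=True):,.2f}")
```

Under these assumptions a 250,000-page archive comes to $1,000 through the standard API and $500 through the Batch API. Self-hosted deployments use enterprise pricing, so they are not covered by this estimate.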

Strengths

  • Structure, not just text — returns tables, layout, reading order, and per-block confidence, so output is usable for retrieval and agents without heavy post-processing
  • 170 languages in one model — handles multilingual and mixed-script documents without swapping engines
  • Self-hostable in a single container — sensitive documents can stay on-premises, a major draw for regulated industries
  • Benchmark-leading quality, with independent annotators preferring it over every leading system tested in Mistral's evaluation and the top overall score on OlmOCRBench
  • Priced for scale — $4 per 1,000 pages, or $2 with the Batch API, makes large-archive ingestion affordable

Limitations & Considerations

  • It is an ingestion layer, not an answer engine — OCR 4 structures documents; you still need a retrieval system or model on top to reason over them
  • Self-hosting requires infrastructure — running the container in-house means provisioning and maintaining GPU capacity
  • OCR is never perfect on the worst inputs — heavily degraded scans, handwriting, and unusual layouts still challenge any system; the confidence scores help flag these
  • Newest release — OCR 4 shipped in June 2026; independent, third-party benchmarks beyond Mistral's own evaluation are still accumulating

Key Takeaways

  • Mistral OCR 4 is a document-intelligence model that extracts structured text, tables, and layout — not just flat characters — from scanned files
  • It supports 170 languages in one model, returns bounding boxes and per-block confidence scores, and runs self-hosted in a single container so documents stay on-premises
  • In Mistral's evaluation, independent annotators preferred it over every leading OCR system tested, and it leads the OlmOCRBench benchmark
  • Priced at $4 per 1,000 pages (or $2 via the Batch API), it is built to be the ingestion layer for enterprise search, retrieval-augmented generation, and AI-agent pipelines
  • Best understood as the step that turns messy real-world documents into clean, machine-readable data — the unglamorous but essential front end of most enterprise AI projects
