Learning Objectives

Understand what document-intelligence OCR does and why it matters for AI pipelines
Explain how Mistral OCR 4 differs from traditional optical character recognition
Evaluate when a self-hostable, structure-aware OCR model is the right choice

What Is Mistral OCR 4?

Mistral OCR 4 is a document-intelligence model from the French AI lab Mistral AI. Traditional optical character recognition (OCR) turns an image of text into a flat string of characters. Mistral OCR 4 instead reads a scanned or photographed document and returns its structure: the text, plus the tables, headings, reading order, and the position of each block on the page, with a confidence score attached to each block.

That structure is the point. Most enterprise AI projects begin with a pile of documents — contracts, invoices, lab reports, forms, scanned archives — that need to become clean, machine-readable data before a model can do anything with them. Mistral OCR 4 is built to be the ingestion layer for those pipelines, turning messy real-world documents into structured output that a retrieval system or an AI agent can use directly.

💡Key Concept

OCR versus document intelligence. Plain OCR answers "what characters are in this image?" Document intelligence answers "what is this document, and how is it organized?" — preserving tables as tables, keeping headings and reading order, and tagging where each element sits on the page. That difference is what makes the output usable for search and retrieval rather than a wall of undifferentiated text.
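
To make the contrast concrete, here is a minimal sketch of what preserved headings and reading order buy a retrieval pipeline. The `Block` and `Chunk` types and their field names are hypothetical stand-ins, not Mistral's actual response schema. The sketch only assumes that each extracted element carries a type, its text, and a reading-order index.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One extracted element. Field names are hypothetical, not Mistral's schema."""
    kind: str    # e.g. "heading", "paragraph", "table"
    text: str
    order: int   # position in reading order


@dataclass
class Chunk:
    """A retrieval unit: one heading plus the content that follows it."""
    heading: str
    parts: list[str] = field(default_factory=list)


def chunk_by_heading(blocks: list[Block]) -> list[Chunk]:
    """Walk blocks in reading order and start a new chunk at every heading.

    Content before the first heading goes into a "(no heading)" chunk.
    Headings with no content under them are skipped.
    """
    chunks: list[Chunk] = []
    current = Chunk(heading="(no heading)")
    for block in sorted(blocks, key=lambda b: b.order):
        if block.kind == "heading":
            if current.parts:
                chunks.append(current)
            current = Chunk(heading=block.text)
        else:
            current.parts.append(block.text)
    if current.parts:
        chunks.append(current)
    return chunks
```

With flat OCR text this grouping would have to be guessed from line breaks and capitalization. With typed blocks it is a short loop, and each chunk can be indexed under its own heading.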

What Makes It Different

Three choices set Mistral OCR 4 apart from the OCR built into most cloud platforms.

It is structure-aware. The model returns bounding boxes, block classification (is this a heading, a table cell, a caption?), and inline confidence scores alongside the extracted text — so downstream systems can trust, route, or flag content based on how certain the model is.
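
The trust, route, or flag idea can be sketched in a few lines. Everything here is an assumption made for illustration: the `ExtractedBlock` shape, the premise that confidence is a number between 0 and 1, and the two thresholds, which a real pipeline would tune against its own documents.

```python
from dataclasses import dataclass


@dataclass
class ExtractedBlock:
    """Hypothetical shape for one block; not Mistral's actual response schema."""
    kind: str                                # "heading", "table_cell", "caption", ...
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) on the page
    confidence: float                        # assumed to lie in [0.0, 1.0]


def route(block: ExtractedBlock,
          accept_at: float = 0.90,
          review_at: float = 0.60) -> str:
    """Decide what happens to a block based on its confidence.

    The thresholds are illustrative defaults, not recommended values.
    """
    if block.confidence >= accept_at:
        return "accept"   # pass straight to the downstream system
    if block.confidence >= review_at:
        return "review"   # queue for a human to check
    return "reject"       # too uncertain to use; rescan or re-key


if __name__ == "__main__":
    cell = ExtractedBlock("table_cell", "1,2O0.00", (0.60, 0.40, 0.80, 0.43), 0.72)
    print(cell.kind, cell.text, "->", route(cell))
```

Because the bounding box travels with the block, a review tool could highlight the exact region of the page that needs a second look.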

It is multilingual at scale. It supports 170 languages across ten language groups in a single model, rather than needing a different engine per script — useful for global enterprises and for archives that mix languages on one page.

It is self-hostable. The whole model runs in a single container on a company's own infrastructure, so sensitive documents never have to leave the building. For regulated industries — healthcare, legal, finance, government — that data-residency story is often the deciding factor.

✅Tip

Visit Mistral OCR 4: mistral.ai/news/ocr-4. It is available through the Mistral API and as Document AI inside Mistral Studio for no-code processing, and is also offered through Amazon SageMaker and Microsoft Foundry.

Performance

In Mistral's evaluation, independent annotators preferred OCR 4 to every leading OCR and document-AI system tested, with win rates averaging 72 percent. It also posts the top overall score on OlmOCRBench, a public benchmark for document extraction, at 85.2. Mistral additionally reports a meaningful speed advantage over competing systems on the same workloads.

Benchmarks are not the whole story for OCR — real documents are messier than test sets — but a consistent preference across independent raters, plus the leading public-benchmark score, is a strong signal that the model handles difficult layouts well.

Pricing

Plan	Price	Features
API	$4 per 1,000 pages	Full structure-aware extraction; 170 languages; bounding boxes and confidence scores
Batch API	$2 per 1,000 pages	Same model at a 50 percent discount; best for large archives; asynchronous processing
Self-Hosted	Enterprise pricing	Single-container deployment; runs on your own servers; data never leaves your infrastructure

At $4 per 1,000 pages through the API — halved to $2 per 1,000 pages with the Batch API for large jobs — Mistral OCR 4 is priced to make whole-archive digitization economical, not just one-off documents. The self-hosted option is the path for organizations with strict data-residency requirements.
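
The arithmetic behind that claim is simple enough to sketch. The two rates come from the pricing table above. The function name is hypothetical, and the sketch assumes strictly linear per-page billing with no minimum charge, rounding, or volume tiers, none of which this lesson specifies.

```python
API_USD_PER_1000_PAGES = 4.00    # standard API rate from the pricing table
BATCH_USD_PER_1000_PAGES = 2.00  # Batch API rate from the pricing table


def estimate_cost(pages: int, batch: bool = False) -> float:
    """Estimate the processing cost in US dollars for a number of pages.

    Assumes linear per-page billing, which is an assumption of this sketch.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = BATCH_USD_PER_1000_PAGES if batch else API_USD_PER_1000_PAGES
    return pages * rate / 1000


if __name__ == "__main__":
    archive_pages = 2_500_000  # illustrative archive size, not a figure from Mistral
    print(f"API:   ${estimate_cost(archive_pages):,.2f}")
    print(f"Batch: ${estimate_cost(archive_pages, batch=True):,.2f}")
```

Under those assumptions, a 2.5-million-page archive comes to $10,000 through the standard API and $5,000 through the Batch API.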

Strengths

Structure, not just text: returns tables, layout, reading order, and per-block confidence, so output is usable for retrieval and agents without heavy post-processing
170 languages in one model: handles multilingual and mixed-script documents without swapping engines
Self-hostable in a single container: sensitive documents can stay on-premises, a major draw for regulated industries
Benchmark-leading quality: preferred by independent annotators over every system tested in Mistral's evaluation, and tops OlmOCRBench
Priced for scale: $4 per 1,000 pages, or $2 with the Batch API, makes large-archive ingestion affordable

Limitations & Considerations

It is an ingestion layer, not an answer engine — OCR 4 structures documents; you still need a retrieval system or model on top to reason over them
Self-hosting requires infrastructure — running the container in-house means provisioning and maintaining GPU capacity
OCR is never perfect on the worst inputs — heavily degraded scans, handwriting, and unusual layouts still challenge any system; the confidence scores help flag these
Newest release — OCR 4 shipped in June 2026; independent, third-party benchmarks beyond Mistral's own evaluation are still accumulating

Key Takeaways

Mistral OCR 4 is a document-intelligence model that extracts structured text, tables, and layout — not just flat characters — from scanned files
It supports 170 languages in one model, returns bounding boxes and per-block confidence scores, and runs self-hosted in a single container so documents stay on-premises
It was preferred over every leading OCR system by independent annotators and leads the OlmOCRBench benchmark
Priced at $4 per 1,000 pages (or $2 via the Batch API), it is built to be the ingestion layer for enterprise search, retrieval-augmented generation, and AI-agent pipelines
Best understood as the step that turns messy real-world documents into clean, machine-readable data — the unglamorous but essential front end of most enterprise AI projects
