5 min read·Updated June 23, 2026

VibeThinker-3B

By Weibo

VibeThinker-3B is a 3.1 billion-parameter open reasoning model from Weibo AI that reports frontier-level math and coding-reasoning scores while remaining small enough to run on a single consumer GPU. Its benchmark claims are strong but contested, making it a useful case study in small-model reasoning.

Learning Objectives

  • Understand what VibeThinker-3B is and why a 3 billion-parameter model drew outsized attention
  • Recognize the training ideas (curriculum fine-tuning plus reinforcement learning) behind its reasoning ability
  • Evaluate its benchmark claims with appropriate skepticism

What Is VibeThinker-3B?

VibeThinker-3B is an open-weights reasoning model released by Weibo AI in June 2026, described in the paper "VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models." It has just 3.1 billion parameters — small enough to run on a single consumer graphics card with about 6.7 gigabytes of memory — yet it posts math and coding-reasoning scores that rival models many times its size.
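
A back-of-the-envelope check on the memory figure can be sketched as parameter count times bytes per parameter. The storage formats below are assumptions for illustration; the lesson does not say which precision the 6.7 gigabyte figure refers to. At 16-bit precision the weights alone come to about 6.2 gigabytes, so one plausible reading is that the remainder covers activations, cache, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 3.1e9  # parameter count stated in the lesson

# Assumed storage formats; the lesson does not name the precision it has in mind.
FORMATS = [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]

for name, nbytes in FORMATS:
    print(f"{name:>9}: {weight_memory_gb(PARAMS, nbytes):.2f} GB")
```

This counts weights only. Real usage also depends on context length, batch size, and the inference runtime, so treat the result as a lower bound rather than a hardware requirement.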

The model is built on the Qwen2.5-Coder-3B base and released under the permissive MIT license, with weights and training code published on Hugging Face and GitHub. The headline idea is that careful post-training, not raw scale, can unlock strong step-by-step reasoning in a tiny model.

💡Key Concept

Why "small model" matters. Most frontier reasoning lives in models with hundreds of billions of parameters that need data-center hardware. A 3 billion-parameter model that reasons well can run on a laptop or a single GPU — cheaper to serve, easier to study, and far more accessible to students and independent developers.

How It Was Trained

VibeThinker-3B follows what its authors call the Spectrum-to-Signal Principle:

  • Curriculum fine-tuning — a two-stage supervised phase that starts with a broad spectrum of valid reasoning examples, then shifts to harder, longer problems.
  • Multi-domain reinforcement learning — a stage that amplifies correct reasoning using verifiable rewards, via a technique the team calls MaxEnt-Guided Policy Optimization (a variant of the GRPO objective used widely in reasoning models).
  • Offline self-distillation — a final pass that consolidates the model's best behaviors.

The bet is that diversity in the fine-tuning data plus reward-driven reinforcement can elicit large-model reasoning from a small base.
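
To make the reinforcement stage more concrete, here is a minimal sketch of the group-relative advantage that GRPO-style training uses: several answers are sampled for one problem, each gets a verifiable reward (1 if correct, 0 if not), and each reward is standardized against its own group. The lesson does not describe how MaxEnt-Guided Policy Optimization modifies this, so the `weighted_advantages` function is a hypothetical illustration of one way an "entropy-guided" weighting could work (favoring problems the model solves about half the time), not the authors' method.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantage: standardize each reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def binary_entropy(p: float) -> float:
    """Entropy in bits of a pass/fail outcome with pass probability p; peaks at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def weighted_advantages(rewards: list[float]) -> list[float]:
    """HYPOTHETICAL weighting: scale a group's advantages by the entropy of its pass rate."""
    weight = binary_entropy(mean(rewards))
    return [weight * a for a in group_relative_advantages(rewards)]


# Eight sampled answers to one problem, scored by a verifier (1 = correct, 0 = wrong).
print(weighted_advantages([1, 0, 1, 0, 1, 0, 1, 0]))  # uncertain problem: full weight
print(weighted_advantages([1, 1, 1, 1, 1, 1, 1, 1]))  # always solved: no learning signal
```

Under this assumed weighting, problems the model always solves or never solves contribute nothing, and training effort concentrates on problems at the edge of its ability.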

Benchmark Claims (and Why to Be Careful)

On paper, VibeThinker-3B's reported scores are remarkable for its size:

Benchmark                  | Reported score                     | What it measures
AIME 2026 (math)           | 94.3 (97.1 with test-time scaling) | Hard competition mathematics
LiveCodeBench v6           | 80.2 Pass@1                        | Recent competitive-programming problems
LeetCode (unseen contests) | 96.1% acceptance                   | Out-of-distribution code generalization
IFEval                     | 93.4                               | Instruction following

Those numbers put a 3 billion-parameter model in the range of far larger systems on specific reasoning tasks — which is exactly why they have been contested. Independent observers have questioned whether the evaluation setup, test-time scaling, and benchmark selection flatter the model, and small models often generalize worse outside the narrow tasks they were tuned for. Treat VibeThinker-3B as a striking research result and a great model to experiment with — not as proof that a tiny model matches a frontier flagship in general use.
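
One reason evaluation setup matters is that a score such as Pass@1 depends on how many answers are sampled per problem and how they are counted. The sketch below uses the widely cited unbiased pass@k estimator (Chen et al., 2021): given n sampled answers of which c are correct, it estimates the chance that at least one of k samples would pass. The sample counts are made up for illustration; the lesson does not state how many samples VibeThinker-3B's evaluations used or what form its test-time scaling takes.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, c of which passed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so every draw of k samples contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 samples for one problem, 3 of them correct.
for k in (1, 5, 10):
    print(f"pass@{k} = {pass_at_k(10, 3, k):.3f}")
```

The same set of samples gives a much higher figure at larger k, which is why a reported score is only comparable to another when both use the same sampling budget.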

Pricing

Open weights: Free
  • MIT license
  • Weights and training code on Hugging Face and GitHub
  • Self-host on a single consumer GPU

As an open-weights model, VibeThinker-3B is free to download, run, and modify. Your only cost is the hardware (or rented GPU time) you run it on — which, at this size, is minimal compared to frontier models.

Strengths

  • Runs anywhere — about 6.7 gigabytes of GPU memory is enough, so it works on a single consumer card or a modest cloud instance
  • Strong reasoning for its size — competitive math and coding-reasoning scores that are unusual at 3 billion parameters
  • Fully open — MIT-licensed weights plus published training code make it easy to study and build on
  • A clean case study — the curriculum-plus-reinforcement recipe is a clear example of how post-training, not just scale, drives reasoning

Limitations and Considerations

  • Contested benchmarks — the most eye-catching scores are debated; verify on your own tasks before trusting them
  • Narrow strengths — tuned for math and coding reasoning; general knowledge, writing, and broad chat are not its focus
  • Small-model limits — 3 billion parameters cannot hold the world knowledge of a frontier model, and reasoning can break down outside its training distribution
  • Research artifact — released by a research team as a demonstration, not a supported commercial product with guarantees
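
Verifying on your own tasks can start very small. The sketch below is a minimal exact-match harness, assuming you can wrap whatever runtime you use in a function that maps a prompt string to an answer string. The `stub_model` function and its canned answers are hypothetical stand-ins so the example runs without downloading anything; they are not VibeThinker-3B output.

```python
from typing import Callable, Iterable


def exact_match_accuracy(
    model: Callable[[str], str],
    cases: Iterable[tuple[str, str]],
) -> float:
    """Fraction of (prompt, expected) pairs the model answers exactly, ignoring outer whitespace."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(model(prompt).strip() == expected.strip() for prompt, expected in cases)
    return hits / len(cases)


def stub_model(prompt: str) -> str:
    """HYPOTHETICAL stand-in for a real model call; replace with your own inference code."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Lyon"}
    return canned.get(prompt, "")


my_cases = [("2 + 2 = ?", "4"), ("Capital of France?", "Paris")]
print(exact_match_accuracy(stub_model, my_cases))
```

Exact match is a deliberately strict metric. For math or code you would normally swap in an answer extractor or a test runner, but the structure stays the same: your own prompts, your own expected results, one number you can track.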

Company Details

Detail       | Info
Developer    | Weibo AI
Released     | June 2026
Parameters   | 3.1 billion (dense)
Base model   | Qwen2.5-Coder-3B
License      | MIT (open weights)
Availability | Hugging Face, GitHub

Key Takeaways

  • VibeThinker-3B is a 3.1 billion-parameter open reasoning model from Weibo AI that reports math and coding-reasoning scores in the range of much larger models
  • It is built on Qwen2.5-Coder-3B and MIT-licensed, runnable on a single consumer GPU with about 6.7 gigabytes of memory
  • Its training recipe — curriculum fine-tuning, reinforcement learning, and self-distillation — shows how post-training can unlock reasoning without massive scale
  • The headline benchmark claims are strong but contested; treat it as a research demonstration and verify on your own tasks before relying on it
