5.2 — China's Foundation Models

Learning Objectives

Identify the major Chinese AI companies and their leading foundation models
Explain why DeepSeek's efficiency breakthrough mattered so much to the global AI industry
Explain why China made open source a national strategy, and what that buys a country constrained on hardware
Assess how close open-weights models have come to the proprietary frontier — and where they still fall short
Understand the data privacy implications of using Chinese-hosted AI APIs vs. running Chinese open-source models locally

The Data Privacy Warning You Need to Read First

Before exploring Chinese AI models, there is an important distinction you must understand.

⚠️Warning

Data Privacy — Chinese-Hosted APIs: When you use a Chinese AI model through a hosted API — DeepSeek's API, Ernie Bot, Kimi API, etc. — your prompts and data are processed on servers in China. Under China's Cybersecurity Law and Data Security Law, Chinese companies may be required to provide data to government authorities. For personal use this may be acceptable. For business use involving proprietary information, client data, or sensitive content, this is a significant risk to evaluate carefully.

Security incidents reinforce this concern: In early 2025, security researchers at Wiz discovered a publicly accessible DeepSeek database containing over 1 million sensitive records — chat histories, API keys, and system logs — with zero authentication. This is a reminder that data security practices vary significantly across providers.

The alternative: Many Chinese models are also available as open-source weights. You can download and run DeepSeek, Qwen, and others locally — completely eliminating the data transmission concern. Running locally means your data never leaves your infrastructure.

This distinction matters enormously: the same model, run locally vs. through the Chinese API, has fundamentally different privacy properties.
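The local-versus-hosted distinction can be enforced in code rather than left to habit. The sketch below is a minimal illustration, not a security control: `is_local_endpoint`, `check_before_sending`, and the example URLs are hypothetical names chosen here. It only checks whether a base URL points at a loopback or private-network address before a prompt is sent.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(base_url: str) -> bool:
    """Return True when base_url points at loopback or a private network address.

    Hostnames other than "localhost" are treated as remote, because resolving
    them would require a DNS lookup that this sketch deliberately avoids.
    """
    host = urlparse(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a named host: assume it leaves your infrastructure
    return address.is_loopback or address.is_private


def check_before_sending(base_url: str, contains_sensitive_data: bool) -> None:
    """Raise if sensitive data is about to be sent to a non-local endpoint."""
    if contains_sensitive_data and not is_local_endpoint(base_url):
        raise PermissionError(f"Refusing to send sensitive data to {base_url}")


# Hypothetical endpoints, for illustration only.
print(is_local_endpoint("http://localhost:11434/v1"))
print(is_local_endpoint("https://api.example.com/v1"))
```

A guard like this is deliberately conservative: anything it cannot prove to be local is treated as remote, which matches the risk posture described in the warning above.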

Why Chinese AI Models Matter

Until late 2024, the assumption in the global AI industry was that frontier AI required American labs, American chips, American talent, and hundreds of millions of dollars in training compute. DeepSeek changed that assumption overnight.

China has an enormous AI ecosystem built across university research (Tsinghua, Peking University, Shanghai Jiao Tong), tech giants (Alibaba, Baidu, Tencent, ByteDance, Huawei), and well-funded startups. The models coming out of this ecosystem are not derivatives of US models — they are independently developed, often built under significant hardware constraints due to US chip export controls.

The result is a set of models that are competitive with US frontier models on key benchmarks, often with dramatically higher cost efficiency.

The Hardware Story Has Flipped — Inside China

The hardware-constraint framing also needs an update. In May 2026, NVIDIA CEO Jensen Huang publicly acknowledged that NVIDIA has "largely conceded" the Chinese AI accelerator market; the company's share inside China is now near zero. After several years of US export controls and a stalemate in which H200 customs clearance letters have sat unprocessed, Beijing's homegrown-stack push has paid off: Alibaba, ByteDance, and Tencent are all running production AI workloads on Huawei's Ascend silicon, and DeepSeek's V4 release in April 2026 was optimized for Ascend before CUDA. Huawei now expects roughly $12 billion in AI accelerator revenue in 2026, up from $7.5 billion in 2025, on the strength of those three customers alone. Inside China, the chip-export constraint that motivated DeepSeek's original efficiency push has effectively been replaced by a parallel domestic supply chain.

Open Source Is Now Chinese National Strategy

The most important development in this lesson is not a model. On July 17, 2026, Xi Jinping opened the World AI Conference in Shanghai and said China would seize a "rare historic opportunity" by "encouraging open source, openness, collaboration and sharing." It was the first time China's head of state framed open weights as national industrial policy rather than a choice individual labs were making.

Xi argued that AI development "should not be a solo performance by any single country but rather a symphony of global cooperation," and criticized broad national-security claims that let powerful nations restrict AI access for others — an unmistakable reference to US export controls. China also pledged 5,000 AI research and training placements for developing countries over the next five years.

Read the timing. Moonshot AI shipped Kimi K3 the day before — the largest open-weights model any Chinese lab has produced, and the first to beat a leading US proprietary flagship on most published coding benchmarks. The policy statement and the proof point landed within twenty-four hours of each other.

💡Key Concept

Why a country would make openness a strategy. Releasing frontier weights looks like giving away the asset. The strategic logic runs the other way. If the most capable freely-available models are Chinese, then developers, startups, and governments worldwide build on Chinese foundations — and standards, tooling, and dependencies follow. Open weights are also unblockable in a way that products are not: export controls can stop chips crossing a border, but not a file. For a country constrained on hardware and shut out of some markets, openness converts a weakness into distribution. Whether it works is an open question — but it explains why "open" here is a competitive posture, not an ideological one.

⚠️Warning

A speech is not a policy document. Xi articulated a direction; he did not publish enforceable rules, and Chinese labs remain subject to domestic content regulation and to a separate 2026 policy debate about restricting overseas access to top Chinese models. Hold both facts at once: the open-source push is real and evidenced by shipped models, and it coexists with a state apparatus that could constrain those same models tomorrow. Watch what ships, not only what is announced.

DeepSeek — The Efficiency Revolution

DeepSeek AI is a research lab founded by the Chinese hedge fund High-Flyer Capital Management, based in Hangzhou. It released a series of models beginning in 2024 that stunned the global AI community.

The DeepSeek Market Shock

On January 27, 2025, DeepSeek R1 triggered the largest single-day loss of market value by any company in US stock market history: NVIDIA lost $589 billion in a single day, more than double the previous record. The Nasdaq fell 3.1%, the S&P 500 dropped 1.5%, and approximately $1 trillion was wiped from US markets in total.

The trigger: DeepSeek demonstrated ChatGPT-level capabilities at a claimed training cost of $5.6 million, versus hundreds of millions for US competitors. The market question was existential: if frontier AI could be built cheaply, was the massive investment in AI infrastructure justified?

The answer, ultimately, was yes. Markets fully recovered. NVIDIA became the first company to reach a $5 trillion valuation by October 2025. Cheaper AI increased demand for AI compute rather than reducing it. But the DeepSeek shock permanently changed assumptions about the relationship between compute spending and AI capability.

DeepSeek V4 — Current Flagship (April 2026)

On April 24, 2026, DeepSeek released V4-Pro (1.6 trillion total parameters, 49 billion active, mixture-of-experts) and V4-Flash (284 billion total / 13 billion active), both MIT-licensed and shipping with a 1 million-token context window, making them the first DeepSeek models to match Claude and Gemini on context length. At release, V4-Pro was the largest open-weights model ever published; Kimi K3, covered later in this lesson, has since overtaken it. The accompanying paper claims V4-Pro uses approximately 27% of V3.2's FLOPs and 10% of the KV cache at 1 million-token context, which matters both for training cost and for on-device inference.
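The total-versus-active split is what makes a mixture-of-experts model cheaper to run than its headline size suggests. As a rough illustration using only the parameter counts quoted above (the helper name `active_fraction` is made up for this sketch), the share of weights used per token can be computed directly:

```python
def active_fraction(total_params_billions: float, active_params_billions: float) -> float:
    """Share of a mixture-of-experts model's parameters used for each token."""
    if total_params_billions <= 0:
        raise ValueError("total parameter count must be positive")
    if not 0 <= active_params_billions <= total_params_billions:
        raise ValueError("active parameters must be between 0 and the total")
    return active_params_billions / total_params_billions


# Parameter counts as quoted in the text, in billions: (total, active).
models = {
    "V4-Pro": (1600, 49),
    "V4-Flash": (284, 13),
}

for name, (total, active) in models.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active per token")
```

On these figures, V4-Pro touches roughly 3% of its weights per token and V4-Flash under 5%. Per-token compute tracks the active count, while the memory needed to hold the model still tracks the total.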

API pricing positions V4-Pro and V4-Flash significantly below US frontier rivals:

V4-Pro: $1.74 / $3.48 per million input/output tokens — undercuts Claude Sonnet and the larger GPT-5.4 tier
V4-Flash: $0.14 / $0.28 per million input/output tokens — cheapest frontier-adjacent model in the market
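To see what those per-million-token prices mean for a real bill, the arithmetic can be sketched as follows. The workload size is invented for illustration, `request_cost` is a hypothetical helper, and the calculation ignores any cache discounts or volume tiers a provider may apply.

```python
from decimal import Decimal

# USD per million tokens (input, output), as listed above.
PRICES = {
    "V4-Pro": (Decimal("1.74"), Decimal("3.48")),
    "V4-Flash": (Decimal("0.14"), Decimal("0.28")),
}
MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of one workload at the listed per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
for model in PRICES:
    print(model, request_cost(model, 2_000_000, 500_000))
```

`Decimal` is used instead of floating point so that currency amounts stay exact when many small requests are summed.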

Both are available via DeepSeek's API, on Hugging Face for self-hosting, and through third-party providers (Together.ai, Fireworks AI, Groq). V4-Flash scores 50 on the Artificial Analysis Intelligence Index, third among open-weights models — the clearest single anchor for where the open Chinese stack sits against the proprietary frontier.

DeepSeek R2 never shipped as a standalone model. The reasoning engine it was meant to carry landed inside V4 instead, so the lab's reasoning lineage runs R1 to R1-0528 to V4 rather than through a separate R-series flagship.

First Outside Venture Round — AGI Mandate (June 2026)

DeepSeek closed its first outside venture round in June 2026, raising approximately 50 billion yuan (about $7.4 billion) — the first external capital the lab had ever taken. Commercial investors were led by Tencent, at 10 billion yuan, and battery maker CATL, at 5 billion yuan, with NetEase and JD.com also participating. Founder Liang Wenfeng contributed another 20 billion yuan from his own holdings.

The terms are unusual. Commercial backers accepted five-year lockups and no voting rights, and the only investor granted governance rights and a direct stake was Beijing's National Artificial Intelligence Industry Investment Fund, China's state-backed strategic AI vehicle (distinct from the chip-focused "Big Fund"). Liang had not previously sought outside capital, and the round is framed as a way to offer employee equity and retain talent against intensifying domestic competition — without loosening his grip on direction. He has said publicly that the lab will keep developing open-source models and pursue artificial general intelligence as its core goal, resisting the usual pressure to chase near-term commercialization.

📝Note

What this means: The round places DeepSeek alongside frontier US labs by valuation, even though its training spend remains a fraction of theirs — a clearer pricing signal than the V3.2 efficiency narrative alone could provide. State-aligned capital plus an explicit open-source AGI mandate is a distinctive posture: most US frontier labs draw their largest checks from corporate cloud partners (OpenAI and Microsoft, Anthropic and Amazon) and treat AGI claims with strategic ambiguity. DeepSeek is telling its investors the opposite — and doing it with a cap table engineered so that the founder and the Chinese state, not the commercial backers, hold the votes.

DeepSeek V4-Pro

DeepSeek AI

Open Source

Strengths

Largest open-weights model at release; 1.6 trillion total / 49 billion active; 1 million context; ~27% of V3.2's FLOPs; MIT license

Context Window

1 million tokens

Pricing

$1.74/$3.48 per million tokens via API; free self-hosted

www.deepseek.com →

DeepSeek V4-Flash

DeepSeek AI

Open Source

Strengths

284 billion total / 13 billion active mixture-of-experts; 1 million context; cheapest frontier-adjacent API; MIT license

Context Window

1 million tokens

Pricing

$0.14/$0.28 per million tokens via API; free self-hosted

www.deepseek.com →

DeepSeek R1 — The Open-Source Reasoning Breakthrough

DeepSeek R1 was the first open-source reasoning model to match OpenAI's o1 on challenging benchmarks (AIME math, GPQA science).

Reasoning models spend more compute at inference time "thinking through" a problem before responding — similar to how a human might work through a math proof step by step rather than guessing immediately. DeepSeek R1 demonstrated that this "extended thinking" capability — previously exclusive to OpenAI's o1 — could be replicated and open-sourced.
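In practice, the openly released R1 checkpoints emit their reasoning between `<think>` and `</think>` tags ahead of the final answer, so applications usually separate the two before showing a response. A minimal sketch, with `split_reasoning` as a hypothetical helper name and the sample string invented for illustration:

```python
import re

# Matches the first <think>...</think> span, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Split raw model text into (reasoning, final_answer).

    If no <think> block is present, the reasoning is returned as an empty string.
    """
    match = THINK_BLOCK.search(raw_output)
    if match is None:
        return "", raw_output.strip()
    reasoning = match.group(1).strip()
    answer = (raw_output[: match.start()] + raw_output[match.end():]).strip()
    return reasoning, answer


sample = "<think>17 * 3 = 51, then add 4.</think>\nThe result is 55."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Keeping the reasoning separate also makes the extra inference-time compute visible: the text inside the tags is the "thinking" the model paid for before answering.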

R1-0528 (May 2025) was a major revision that improved results on logic and programming benchmarks and added JSON output and function-calling capabilities. Distilled versions (1.5 billion, 7 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters) allow the reasoning capability to run on modest hardware.

DeepSeek R1

DeepSeek AI

Open Source

Strengths

First open-source reasoning model matching OpenAI o1; distilled 1.5 billion-70 billion variants; R1-0528 adds function-calling; MIT license

Context Window

128K tokens

Pricing

Free (self-hosted); available via API

www.deepseek.com →

Previous Generation — V3.2 and V3.2-Speciale

V3.2 was DeepSeek's flagship before V4 shipped in April 2026. It is still widely deployed and remains available via API for cost-sensitive workloads, but new builds should target V4.

DeepSeek V3.2 is a 671-billion-parameter mixture-of-experts model released under the MIT license. What made it remarkable at release:

Training cost: Approximately $5.9 million in compute — compared to estimates of $100 million+ for comparable US models
Performance: At release it ranked as the #2 most intelligent open-weight model on the Artificial Analysis Intelligence Index, ahead of Grok 4 and Claude Sonnet 4.5 (Thinking) — a position it has since given up to newer open-weights releases, including DeepSeek's own V4 line
Coding: 40%+ improvement on SWE-bench Verified over V3.1
MIT license: Completely free for commercial and research use

V3.2-Speciale (December 2025) pushed even further: a high-compute variant that scored at gold-medal level on the 2025 International Mathematical Olympiad (35/42 points), placed 10th at the International Olympiad in Informatics, and scored 96.0% on AIME (vs. GPT-5-High's 94.6%). The Speciale API was only available until December 15, 2025, due to extreme compute costs.

💡Key Concept

The DeepSeek Efficiency Questions: The $5.9 million V3.2 figure refers to the compute cost using available (non-restricted) NVIDIA H800 chips. It does not include the full cost of prior research, failed experiments, engineering talent, and infrastructure. The number is real, but the total investment in building DeepSeek's capability was much larger. Still, the efficiency of the resulting model was genuine and significant, and V4-Pro's claim to use only about 27% of V3.2's FLOPs (a reduction of roughly 73%) extends the lineage.

DeepSeek V3.2 (previous generation)

DeepSeek AI

Open Source

Strengths

671 billion mixture-of-experts; matches GPT-4o+; IMO gold medal (Speciale variant); trained for ~$5.9 million; MIT license

Context Window

128K tokens

Pricing

Free (self-hosted); $0.27/$1.10 per million tokens via API

www.deepseek.com →

DeepSeek Bans and Restrictions

DeepSeek's success has been accompanied by significant regulatory pushback:

Italy blocked DeepSeek from app stores (January 2025) over GDPR failures
Banned on government devices in South Korea, Australia, and Taiwan
US restrictions: banned by NASA, the US Navy, the Pentagon, and on Texas state government devices; restricted in US House offices

These bans target the hosted API (where data flows to Chinese servers), not the open-source weights that can be run locally.

Alibaba — Qwen Series

Alibaba DAMO Academy has built one of the broadest open-source model portfolios in the world with the Qwen (通义千问, Tongyi Qianwen) series.

Qwen3.5

The Qwen3.5 family (February–March 2026) was Alibaba's early-2026 open-weight generation, spanning 0.8 billion to 397 billion parameters.

The flagship is Qwen3.5-397B-A17B, a mixture-of-experts (MoE) model with 397 billion total parameters and only 17 billion active per forward pass. It uses a novel Gated DeltaNet + MoE architecture that alternates linear and full attention in a 3:1 ratio for efficiency.
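A 3:1 alternation means three linear-attention layers for every full-attention layer. The sketch below only illustrates that ratio. The placement of the full-attention layer at the end of each group of four, the 12-layer depth, and the names `layer_schedule`, `"linear"`, and `"full"` are assumptions made here, not details taken from Alibaba's documentation.

```python
def layer_schedule(num_layers: int, linear_per_full: int = 3) -> list[str]:
    """Return the attention type of each layer for a linear:full ratio of N:1.

    Assumption: the full-attention layer closes each group of N + 1 layers.
    """
    if num_layers < 0 or linear_per_full < 0:
        raise ValueError("layer counts must be non-negative")
    period = linear_per_full + 1
    return [
        "full" if (index + 1) % period == 0 else "linear"
        for index in range(num_layers)
    ]


schedule = layer_schedule(12)
print(schedule)
print(schedule.count("linear"), "linear :", schedule.count("full"), "full")
```

The motivation for mixing the two is cost: linear attention scales linearly with sequence length, while full attention scales quadratically, so keeping full attention to one layer in four limits the expensive part at long context.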

Key characteristics:

262K native context window, extensible to 1 million tokens
100+ languages — broader multilingual coverage than most US models, with particular strength in Chinese, Japanese, Korean, Arabic, and less-resourced languages
The 9-billion-parameter model matches or surpasses GPT-OSS-120B (13 times its size) on the GPQA Diamond and MMMU-Pro benchmarks, an exceptional efficiency result
The 122B-A10B variant scores 72.2 on tool use benchmarks (vs. GPT-5 mini's 55.5)
Released under Apache 2.0 license

The Qwen3.5 family includes medium models (27 billion, 35 billion, 122 billion) and small models (0.8 billion, 2 billion, 4 billion, 9 billion), covering everything from on-device deployment to large-scale server inference.

Qwen3.6, 3.7, and the shift to closed flagships

Alibaba iterated fast after Qwen3.5, and in the process changed its release strategy. Qwen3.6 (April 2026, Apache 2.0) continued the open-weight line with a 27-billion-parameter dense model plus a 35B-A3B mixture-of-experts model (35 billion total, 3 billion active), and the compact 27-billion model matches or beats the far larger Qwen3.5 on agentic coding. But the top-tier flagships went proprietary: Qwen3.7 Max (a text reasoning and agentic model, May 2026) and Qwen3.7 Plus (a multimodal agent, generally available since June 2026) are API-only, with a 1 million token context window. In July 2026, Alibaba previewed Qwen3.8 Max, a reported 2.4 trillion parameter multimodal model it positions as second only to Anthropic's Claude Fable 5, though it shipped without benchmarks, without open weights (promised "soon"), and without disclosing its active-parameter count.

So the current picture is two-track: Qwen3.7 is the generally available flagship (proprietary, API-only), Qwen3.6 leads the open-weight line, and Qwen3.8 Max is a preview. This mirrors a broader move among Chinese labs to keep frontier flagships closed while still publishing strong open weights a tier below.

QwQ-32B

QwQ-32B is Alibaba's reasoning-specialized model at 32 billion parameters. It is designed for mathematical reasoning and step-by-step problem solving, competing directly with DeepSeek R1's smaller distilled variants.

Qwen Wins the Apple Slot (July 2026)

On July 15, 2026, China's Cyberspace Administration cleared Apple Intelligence for release in China, with Qwen as the model handling text and image understanding and generation across iOS, iPadOS, macOS, and visionOS for Chinese users. Baidu contributes a smaller model alongside it. Apple evaluated Baidu, DeepSeek, and ByteDance before settling on Alibaba.

This is the most consequential distribution win any Chinese model has achieved. Elsewhere in the world, the rebuilt Siri's cloud requests run on a custom Google Gemini model — but Chinese regulation requires a domestically approved model, so inside China the brain of Apple's AI is Qwen. That places Alibaba's models in front of one of the largest premium device populations on earth, in Apple's biggest market outside the United States.

💡Key Concept

Why this matters beyond Alibaba. The usual read on Chinese models is that they compete on price and openness while US labs hold the frontier. The Apple deal complicates that. A US company chose a Chinese model not because it was cheaper but because regulation made it the only viable option — and in doing so validated Qwen as production-grade for hundreds of millions of premium users. Regulatory geography, not benchmark scores, decided which model ships.

⚠️Warning

Approval is not a ship date. Regulatory clearance is a precondition. The rollout still depends on security review, engineering adaptation, and operating-system updates, and Apple has announced no date.

Qwen3.5 (397B-A17B)

Alibaba DAMO Academy

Open Source

Strengths

Apache 2.0; 100+ languages; Gated DeltaNet+MoE; 262K-1 million context; 0.8 billion-397 billion range

Context Window

262K-1 million tokens

Pricing

Free (self-hosted via Hugging Face or Ollama)

huggingface.co/Qwen →

Moonshot AI — Kimi

Moonshot AI is a Beijing-based startup founded in 2023, focused on long-context applications.

Kimi K2.5 (January 2026) was a major leap from the earlier K2:

1 trillion parameters MoE architecture with 32 billion active per forward pass
256K context window (doubled from K2's 128K)
Trained on 15 trillion mixed visual and text tokens — natively multimodal
Outperforms Gemini 3 Pro on SWE-Bench Verified and beats GPT-5.2 on SWE-Bench Multilingual
Beats GPT-5.2 and Claude Opus 4.5 on VideoMMMU (video understanding)

Moonshot also released Kimi Code — an open-source coding tool with integrations for terminals, VS Code, Cursor, and Zed.

Kimi K2.6 + $2 Billion Round (May 2026)

On May 7, 2026, Moonshot AI shipped Kimi K2.6 alongside a $2 billion funding round at a $20 billion valuation. The round was led by Meituan's Long-Z Investments arm, with Tsinghua Capital, China Mobile, and CPE Yuanfeng participating. Moonshot reported $200 million in annualized recurring revenue in April 2026 — the kind of consumption disclosure most Chinese labs do not make publicly.

The Moonshot round landed in the same stretch of weeks that DeepSeek was assembling its own first outside round (covered above), which closed the following month. Together, the two raises signalled that open-weights labs out of China are emerging as the primary source of cost pressure on US frontier vendors, with DeepSeek leading on raw scale and government-aligned capital and Moonshot leading on direct-API consumption volume and OpenRouter ranking.

K2.6 retains K2.5's 1 trillion parameter MoE architecture (32 billion active per token), extends the context window to 262K tokens, adds native INT4 quantization, and introduces an Agent Swarm system that scales to 300 sub-agents and 4,000 coordinated steps in 12-hour autonomous coding sessions. Open-weights landed on Hugging Face on April 20, 2026 under a Modified MIT License. See Kimi K2.6 (Moonshot AI) for the full architecture and Cursor's deliberate K2.5 + RL versus K2.6 generation-skip decision.

Kimi K3 — The Open Frontier Arrives (July 2026)

On July 16, 2026, Moonshot shipped Kimi K3 and K2.6 became the prior generation. K3 is the most consequential open-weights release to date, and it is worth being precise about why.

K3 is a 2.8 trillion parameter mixture-of-experts (MoE) model that activates just 16 of 896 experts per token — far sparser than the K2 family — with a 1 million token context window and native vision. It is built on two techniques Moonshot developed in-house: Kimi Delta Attention, the basis for the million-token context, and Attention Residuals, which retrieves representations selectively across model depth rather than accumulating them uniformly.
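Activating 16 of 896 experts per token is an instance of top-k routing: a small gating network scores every expert, and only the highest-scoring few are run. The following sketch shows the general technique in plain Python with random scores. It is not Moonshot's router; `route_token`, the random seed, and the choice to apply softmax over the selected experts only are assumptions made for illustration.

```python
import math
import random


def route_token(gate_logits: list[float], top_k: int) -> dict[int, float]:
    """Pick the top_k highest-scoring experts and normalise their weights.

    Returns a mapping from expert index to mixing weight; weights sum to 1.
    """
    if not 0 < top_k <= len(gate_logits):
        raise ValueError("top_k must be between 1 and the number of experts")
    ranked = sorted(range(len(gate_logits)), key=gate_logits.__getitem__, reverse=True)
    chosen = ranked[:top_k]
    # Softmax over the selected logits only (subtract the max for stability).
    peak = max(gate_logits[i] for i in chosen)
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}


NUM_EXPERTS, TOP_K = 896, 16  # figures quoted for K3 in the text
rng = random.Random(0)  # fixed seed so the illustration is repeatable
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
weights = route_token(logits, TOP_K)

print(len(weights), "experts active,", f"{TOP_K / NUM_EXPERTS:.2%} of the pool")
```

With 16 of 896 experts selected, under 2% of the expert pool runs for any given token, which is the sense in which K3 is sparser than the K2 family.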

The headline is not the size. It is that K3 beats Claude Opus 4.8 on most of the published coding and agentic suite — most dramatically on FrontierSWE, 81.2 against 66.7. For roughly two years, the open-weights argument carried an implicit concession: open models were cheaper and more private, but a generation behind. K3 is the strongest evidence yet that the concession is narrowing.

Two things keep this from being a clean victory lap, and both matter:

K3 does not beat the current frontier. Moonshot says so itself — K3 "still trails the most powerful proprietary models, Claude Fable 5 and GPT 5.6 Sol," and names user experience, not capability, as the remaining gap. On HLE-Full, a hard reasoning benchmark, K3 scores 43.5 against Fable 5's 53.3.
Open weights are not free to run. A 2.8 trillion parameter model is far beyond consumer hardware — the published checkpoint spans 96 shards and well over a terabyte. Most teams will consume K3 through a hosted endpoint anyway, which puts the data-residency question right back where it started.
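The "well over a terabyte" figure follows from simple arithmetic: weight storage is roughly the parameter count times the bytes used per parameter. The estimate below is a back-of-the-envelope sketch that ignores activations, the KV cache, and quantization metadata. The precisions listed are common choices, not a statement of the format the K3 checkpoint actually uses, and `weight_terabytes` is a hypothetical helper.

```python
BITS_PER_PARAM = {"bf16": 16, "int8": 8, "int4": 4}


def weight_terabytes(num_params: float, precision: str) -> float:
    """Approximate weight storage in terabytes (1 TB = 10**12 bytes)."""
    total_bytes = num_params * BITS_PER_PARAM[precision] / 8
    return total_bytes / 1e12


K3_PARAMS = 2.8e12  # 2.8 trillion, as quoted in the text

for precision in BITS_PER_PARAM:
    print(f"{precision}: ~{weight_terabytes(K3_PARAMS, precision):.1f} TB")
```

Even at 4 bits per weight the estimate stays above a terabyte, which is why self-hosting a model of this size means a multi-accelerator server rather than a workstation.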

K3's pricing signals the shift most clearly. At $15 per million output tokens, Moonshot is no longer undercutting the frontier the way earlier Chinese open models did. It is pricing like a frontier lab.

💡Key Concept

"Open weights" is not the same as "open source." Moonshot released K3's full weights on July 27, 2026 — meaning you can download the trained model, run it, and fine-tune it. That is not the same as open-source software: you do not get the training data, the training code, or the right to any and every use.

K3 makes the distinction unusually concrete. The K2 family shipped under a Modified MIT License, but K3 landed under a custom Kimi K3 License: running an inference service whose revenue exceeds $20 million over 12 consecutive months requires a separate agreement with Moonshot, and products above 100 million monthly active users must display "Kimi K3" in the interface. Downloadable is not the same as unrestricted — always read the license before you build a commercial plan on an open-weights model.
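License terms like these can be turned into an automated check so they are not forgotten once a product grows. The sketch below encodes only the two thresholds described above. The function name, the input shapes, and the reading of "12 consecutive months" as any rolling 12-month window are assumptions, and the license text itself remains the authority.

```python
REVENUE_LIMIT_USD = 20_000_000        # inference revenue over 12 consecutive months
MAU_BRANDING_THRESHOLD = 100_000_000  # monthly active users


def license_obligations(monthly_revenue_usd: list[float], monthly_active_users: int) -> list[str]:
    """Return the obligations triggered under the thresholds quoted in the text."""
    obligations = []
    # Check every rolling 12-month window of inference-service revenue.
    window_count = max(1, len(monthly_revenue_usd) - 11)
    window_totals = (
        sum(monthly_revenue_usd[start:start + 12]) for start in range(window_count)
    )
    if any(total > REVENUE_LIMIT_USD for total in window_totals):
        obligations.append("separate agreement with Moonshot required")
    if monthly_active_users > MAU_BRANDING_THRESHOLD:
        obligations.append('display "Kimi K3" in the interface')
    return obligations


# Hypothetical service: $2 million per month in revenue, 5 million monthly users.
print(license_obligations([2_000_000] * 12, 5_000_000))
```

Running a check like this in a finance or compliance dashboard turns the license from a document someone read once into a condition that is re-evaluated every month.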

Kimi K3

Moonshot AI

Open Source

Strengths

2.8 trillion MoE (16 of 896 experts active); 1 million context; native vision; beats Claude Opus 4.8 on most coding benchmarks; largest open-weights model from a Chinese lab

Context Window

1 million tokens

Pricing

Free tier on kimi.com; API at 30 cents per million cached input and $15 per million output; weights public since July 27, 2026 under the Kimi K3 License

Moonshot AI →

Baidu — ERNIE 5.0

Baidu is China's leading search engine company, analogous to Google. Its ERNIE (文心一言, Wenxin Yiyan) series has reached its fifth generation.

ERNIE 5.0 (November 2025, announced at Baidu World 2025) is a fundamental upgrade:

2.4 trillion parameters — a unified multimodal model integrating text, image, video, and audio in a single autoregressive framework
Comparable to Gemini-2.5-Pro and GPT-5-High on 40+ authoritative benchmarks
Dominant performance on Chinese-language benchmarks
Real-time search grounding via Baidu's search integration
Closed source; available via Baidu Cloud API

Baidu is also developing its own AI chips: Kunlunxin M100 (optimized for inference, releasing early 2026) and Kunlunxin M300 (for training/inference of ultra-large models, following in early 2027).

Best for: Applications where Chinese-language fluency is paramount and integration with Chinese-language web search adds value.

ERNIE 5.0

Baidu

Closed

Strengths

2.4 trillion params; unified multimodal (text/image/video/audio); Chinese-language leader; Baidu search integration

Context Window

Not disclosed

Pricing

Baidu Cloud API

Baidu →

Zhipu AI — GLM-5

Zhipu AI is a spinout from Tsinghua University's AI lab. In January 2026, Zhipu made history as the first publicly listed Chinese AI foundation model company, raising approximately $558 million in a Hong Kong IPO.

GLM-5 (February 2026) is a generational leap from GLM-4.5:

744 billion total parameters MoE, with 256 experts and 8 activated per token (~44 billion active)
Uses DeepSeek Sparse Attention (DSA) for efficiency
200K context window
MIT license (fully open source)
Claims to surpass Gemini 3 Pro on coding and agentic performance
Trained entirely on Huawei Ascend chips using MindSpore framework — zero NVIDIA dependency

GLM-5 is significant not just for its performance, but because it proves that frontier-class models can be trained without any NVIDIA hardware — a milestone for Chinese AI independence.

GLM-5-Turbo (March 2026) is optimized specifically for automated agent workflows.

GLM 5.2 (June 2026) extends the line with a 1 million token context window and a coding-first focus, rolling out across Zhipu's GLM Coding Plan tiers and pitched as a permissively licensed alternative to Claude Code and GPT-5.5 for the Asia-Pacific market. A standalone API and MIT-licensed open weights are slated to follow within days — though Zhipu released no benchmarks at launch, so its performance claims remain unverified for now.

GLM-5

Zhipu AI

Open Source

Strengths

744 billion MoE (44 billion active); 200K context; MIT license; trained entirely on Huawei Ascend; competitive with Gemini 3 Pro

Context Window

200K tokens

Pricing

Free (self-hosted); API available

Zhipu AI →

ByteDance — Doubao

ByteDance (the company behind TikTok) has quietly built the most-used AI chatbot in China with Doubao (豆包).

Doubao 2.0 / Seed 2.0 (February 2026) comes in four variants: Pro, Lite, Mini, and Code. The Pro variant delivers:

98.3% on AIME 2025 (math), a 3020 Codeforces rating (competitive programming), and 89.5 on VideoMME (video understanding)
Performance matching GPT-5.2 and Gemini 3 Pro at roughly one-tenth the cost

On the consumer side, Doubao has exceeded 100 million daily active users and 155 million weekly active users.

ByteDance's strategy is distinctive: rather than competing for frontier research prestige, it leverages its massive TikTok/Douyin distribution network to put AI in the hands of hundreds of millions of users at extremely low prices.

Doubao 2.0 Pro

ByteDance

Closed

Strengths

98.3% AIME; 100 million+ DAU; GPT-5.2-competitive at 1/10th cost; most-used AI chatbot in China

Context Window

Not disclosed

Pricing

API available; consumer app free

ByteDance →

MiniMax — Consumer AI Powerhouse

MiniMax is a Shanghai-based AI company that went public in Hong Kong in January 2026, raising approximately $620 million, with its stock surging 109% on debut. It is backed by Alibaba, Tencent, and the Abu Dhabi Investment Authority.

MiniMax M2.7 (March 2026) is its latest model. MiniMax is known for consumer-facing AI products including AI companion apps and creative tools, with over 100 million users.

MiniMax and Zhipu AI were among the first of China's "Six Tigers" — the six leading Chinese AI startups (Zhipu, Moonshot, MiniMax, Baichuan, StepFun, 01.AI) — to go public. Notably, 01.AI stopped pre-training large models in March 2025, pivoting to selling business solutions using DeepSeek's models — a significant strategic retreat that highlighted the brutal economics of competing in foundation models.

MiniMax M2.7

MiniMax

Closed

Strengths

Consumer AI focus; Hong Kong IPO ($620 million raised); 100 million+ users; backed by Alibaba/Tencent

Context Window

Not disclosed

Pricing

API available; consumer apps free

MiniMax →

Ant Group — InclusionAI / Ling Series

Ant Group is the financial-services affiliate of Alibaba (operator of Alipay) and one of the largest fintech companies in the world. Ant's AGI initiative, InclusionAI, sits inside the firm and ships open-weights frontier models alongside its applied financial-AI work — putting Ant in a different posture than DeepSeek (a hedge-fund spin-out) or Alibaba's Qwen team (a hyperscaler model lab).

In early May 2026, InclusionAI published Ling-2.6 — including a 1 trillion parameter flagship variant — on Hugging Face under the MIT license. The architecture is a hybrid of Multi-head Latent Attention and Linear Attention with a 262,144-token context window. Headline benchmark: 72.2 on SWE-bench Verified, among the strongest scores any open-weights model has posted on a coding evaluation. Inference requires tensor parallelism across 8 GPUs. A companion hosted-only sibling, Ring 2.6 at the same trillion-parameter scale, surfaced on OpenRouter at the same time.

Ling-2.6 strengthens the broader pattern from this module: Chinese labs are not just shipping models that compete on benchmarks — they're shipping them under permissive licenses that anyone can self-host and inspect. That's a different distribution strategy from US frontier labs, where flagship weights are closed and only research-grade or older-generation models go open. For developers and security-conscious enterprises that need to run models inside their own perimeter, the open-weights Chinese stack is increasingly the only option at the trillion-parameter scale.

Ling-2.6 (1T variant)

Ant Group

Open Source

Strengths

1 trillion parameter open weights under MIT license; 72.2 on SWE-bench Verified; 262,144-token context; hybrid MLA + Linear Attention; from Ant Group's InclusionAI lab

Context Window

262,144 tokens

Pricing

Free open weights; tensor parallelism across 8 GPUs required for inference

Ant Group →

The US Chip Export Controls Context

Understanding Chinese AI development requires understanding the constraint it operates under: US export controls.

Since 2022, the US Bureau of Industry and Security (BIS) has progressively restricted exports of advanced AI chips to China. The H100 and H200 were originally banned outright. In December 2025, the Trump administration partially reversed course by allowing H200 exports to China with a 25% surcharge (up to 80,000 chips), on the logic that it is better to sell older chips with surcharges than to have China build its own. Blackwell-class chips (B100, B200, GB200) and next-generation Rubin chips remain fully restricted for 18–24 months after domestic launch.

China has responded on multiple fronts:

Huawei Ascend chips are rapidly advancing:

Ascend 910C: 600,000 units planned for 2026 (double 2025 output), delivering approximately 60% of H100 inference performance
Ascend 920: 6nm process, exceeding 900 teraflops per card
Ascend 950PR (debuted March 2026): 1.56 petaflops FP4, featuring Huawei's in-house HBM memory and claiming 2.8-times H20 performance
Atlas 950 Supercomputer: 8,192 Ascend 950 processors, 16 exaflops performance
Huawei's roadmap targets 4 zettaflops FP4 performance by 2028
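The figures in this list mix peta-, exa-, and zetta- prefixes, so putting them on one scale makes them easier to compare. The sketch below uses only numbers quoted above. It assumes the Atlas 950 and 2028 roadmap figures are at a precision comparable to the per-card FP4 number, which the text does not confirm, so treat the ratios as orientation rather than benchmarks.

```python
# Put the quoted Huawei figures on a common scale (FLOPS).
# All inputs come from the list above; comparability of precision
# across the figures is an assumption, not something the text confirms.

PETA, EXA, ZETTA = 1e15, 1e18, 1e21

ascend_950pr_flops = 1.56 * PETA     # per card, FP4
atlas_950_flops = 16 * EXA           # whole supercomputer
atlas_950_chips = 8_192
roadmap_2028_flops = 4 * ZETTA       # FP4 target for 2028
ascend_910c_units_2026 = 600_000     # described as double 2025 output

# Implied per-chip throughput of the Atlas 950 system, in petaflops.
atlas_per_chip_pflops = atlas_950_flops / atlas_950_chips / PETA

# How many Atlas 950 systems the 2028 target would correspond to.
roadmap_vs_atlas = roadmap_2028_flops / atlas_950_flops

# 2025 Ascend 910C output implied by "double".
ascend_910c_units_2025 = ascend_910c_units_2026 // 2

if __name__ == "__main__":
    print(f"Atlas 950 per chip: {atlas_per_chip_pflops:.2f} PFLOPS")
    print(f"Ascend 950PR card:  {ascend_950pr_flops / PETA:.2f} PFLOPS (FP4)")
    print(f"2028 target vs one Atlas 950: {roadmap_vs_atlas:.0f}x")
    print(f"Implied 2025 Ascend 910C output: {ascend_910c_units_2025:,} units")
```

The implied per-chip figure for the Atlas 950 comes out somewhat higher than the 950PR's quoted 1.56 petaflops, which suggests the system number refers to a different chip variant or precision. The source does not say which.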

Beijing's reciprocal talent controls are mirroring US chip export logic in reverse:

State-secret travel restrictions originally placed on senior DeepSeek researchers have been extended to AI talent at Alibaba and other private firms, with Bloomberg first reporting the expansion in late May 2026
Some AI professionals working on strategically important projects now require official government approval before traveling abroad
The policy frames frontier-AI researchers themselves as restricted assets — a national-security treatment of human capital that parallels the US treatment of advanced chips
Practical implications include constraints on international conference attendance, lab visits, and cross-border collaboration: a soft decoupling at the talent layer that compounds the decoupling already under way at the supply layer

💡Key Concept

Two-way decoupling. Until recently, US chip export controls were the dominant story in AI trade controls between the two countries. The May 2026 talent-travel expansion makes the regime explicitly bidirectional: both governments now treat frontier-AI inputs (chips on one side, researchers on the other) as restricted strategic assets. For Chinese labs, the practical effect is fewer talks at international conferences, slower talent rotation through global labs, and a stronger pull toward keeping frontier work entirely inside the domestic ecosystem.

The semiconductor gap is real but narrowing. GLM-5's successful training entirely on Ascend hardware shows that Chinese labs can build frontier models without NVIDIA, and the DeepSeek efficiency story demonstrates that algorithmic innovation can partially compensate for hardware constraints. Beijing's reciprocal talent controls apply the same logic at the human-capital layer: what started as a one-way US export-control regime is hardening into a two-way decoupling, with chips restricted on one side and researchers on the other.

Key Takeaways

Chinese foundation models are genuinely competitive — not derivative of US models, independently built under hardware constraints
DeepSeek V4-Pro and V4-Flash are the current flagships (April 24, 2026, MIT-licensed, 1 million-token context); V4-Pro was the largest open-weights model at its release, a mixture-of-experts design with 1.6 trillion total and 49 billion active parameters, and V4-Flash scores 50 on the Artificial Analysis Intelligence Index, third among open-weights models. DeepSeek closed its first outside venture round in June 2026, raising roughly 50 billion yuan (about $7.4 billion) from Tencent, CATL, NetEase, and JD.com, with founder Liang Wenfeng contributing 20 billion yuan himself and publicly committing the lab to open-source AGI as its core goal. R2 never shipped as a standalone model; that reasoning work went into V4
DeepSeek V3.2 (MIT license, previous generation) was the model that first matched frontier performance and was trained for a fraction of comparable US model costs; its January 2025 release triggered a $589 billion single-day drop in NVIDIA's market value that permanently changed assumptions about AI economics
Data privacy is a critical distinction: Chinese-hosted APIs transmit data to Chinese servers subject to PRC law; running open-source weights locally eliminates this concern. Multiple countries have banned DeepSeek's hosted API on government devices
The Qwen family (100+ languages, 1 million-token context, open weights through Qwen3.6 plus proprietary Qwen3.7 and 3.8 Max flagships), Kimi K2.5 (1 trillion parameters), ERNIE 5.0 (2.4 trillion parameters, multimodal), and GLM-5 (744 billion parameters, trained entirely on Ascend hardware) represent rapid advancement across the Chinese AI ecosystem
China's Cyberspace Administration cleared Apple Intelligence on July 15, 2026 with Alibaba's Qwen as the model powering it for Chinese users and Baidu contributing a smaller model — the most consequential distribution win any Chinese model has achieved, and a case where regulatory geography rather than benchmark scores decided which model ships
Moonshot AI shipped Kimi K3 on July 16, 2026 — a 2.8 trillion parameter mixture-of-experts model activating 16 of 896 experts per token, with a 1 million token context and native vision. It is the largest open-weights model any lab has published, and it beats Claude Opus 4.8 on most published coding benchmarks (FrontierSWE: 81.2 against 66.7). The weights went public on July 27, 2026 under a custom Kimi K3 License that is notably more restrictive than the K2 family's Modified MIT — commercial inference hosting above $20 million in revenue requires a separate Moonshot agreement
The gap between open weights and the proprietary frontier is narrowing, not gone. Moonshot concedes K3 still trails Claude Fable 5 and GPT-5.6 Sol overall and names user experience as the main shortfall; on HLE-Full reasoning it scores 43.5 against Fable 5's 53.3. And at $15 per million output tokens, K3 is priced like a frontier model, not as a discount play
Xi Jinping made open source official national strategy at the World AI Conference in Shanghai on July 17, 2026 — the first time China's head of state framed open weights as industrial policy — one day after Kimi K3 shipped as the proof point. The strategic logic: if the most capable freely-available models are Chinese, the world's developers build on Chinese foundations, and weights cannot be stopped at a border the way chips can
Moonshot AI previously shipped Kimi K2.6 on May 7, 2026, alongside a $2 billion raise at a $20 billion valuation and $200 million in annualized recurring revenue. Combined with DeepSeek's own maiden round closing the following month, the two raises signalled that China's open-weights labs were becoming the primary source of cost pressure on US frontier API pricing
ByteDance Doubao is China's most-used AI chatbot, with more than 100 million daily active users, while MiniMax and Zhipu AI have both gone public in Hong Kong
Huawei Ascend chips are advancing rapidly toward NVIDIA parity, with a roadmap to 4 zettaflops by 2028
Beijing has extended state-secret travel restrictions originally placed on senior DeepSeek researchers to AI talent at Alibaba and other private firms (Bloomberg, late May 2026), requiring official approval before international travel — a reciprocal-controls signal that mirrors US chip export logic in reverse and hardens a two-way decoupling at the human-capital layer
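The mixture-of-experts figures in these takeaways are easier to interpret as ratios. The sketch below computes them from the numbers quoted above; `active_fraction` is a helper introduced here for illustration only. Note that the share of experts routed per token is not the same as the share of parameters used, because attention and any shared layers run for every token, and K3's active parameter count is not given here.

```python
# Mixture-of-experts sparsity from the figures quoted in the takeaways.


def active_fraction(active: float, total: float) -> float:
    """Share of the total that participates in each token's forward pass."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


# DeepSeek V4-Pro: 49 billion active of 1.6 trillion total parameters.
v4_pro_param_share = active_fraction(49e9, 1.6e12)

# Kimi K3: 16 of 896 experts routed per token (experts, not parameters).
k3_expert_share = active_fraction(16, 896)

if __name__ == "__main__":
    print(f"V4-Pro active parameters: {v4_pro_param_share:.1%}")
    print(f"Kimi K3 active experts:   {k3_expert_share:.1%}")
```

Both ratios are small, roughly 3% of parameters for V4-Pro and under 2% of experts for K3. That is the sense in which these very large models cost far less per token to run than their headline parameter counts suggest.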
