Liquid AI's open-weight LFM2.5 ships + Groq raises 650 million dollars
Liquid AI releases an open-weight 8 billion-parameter mixture-of-experts model trained on 38 trillion tokens. Groq lines up 650 million dollars to fund its inference-cloud pivot after Nvidia's talent deal. Plus 3 more stories.
Liquid AI extended its open-weight model line with an 8 billion-parameter mixture-of-experts trained on 38 trillion tokens and tuned for laptop and phone inference. Groq lined up 650 million dollars in new funding to refocus on its inference cloud, six months after Nvidia paid 20 billion dollars for senior Groq engineering talent. Mistral followed its Vibe launch with a Paris-area summit naming Airbus, BMW, and ASML as industrial customers for the new Physics AI line, and broke ground on a 10 megawatt inference data center south of Paris.
- 1
Liquid AI ships open-weight LFM2.5 model tuned for on-device inference
Liquid AI released LFM2.5-8B-A1B, a mixture-of-experts model with roughly 1 billion active parameters out of 8 billion total, pretrained on 38 trillion tokens and post-trained to a 128,000-token context window. The release is open-weight with no use restrictions and ships with day-one support for the llama.cpp, MLX, vLLM, SGLang, and ONNX runtimes. Liquid reports around 253 tokens per second on an Apple M5 Max, 146 on a Ryzen AI Max Plus, and roughly 30 on flagship smartphones. Those figures put credible laptop and phone inference in the same parameter band as the closed flagship-mini tier.
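To put the reported throughput in practical terms, here is a minimal back-of-the-envelope sketch. It uses only the approximate figures quoted above. The 500-token reply length and all function and variable names are assumptions made for illustration, and the estimate ignores prompt-processing time, so treat the results as rough lower bounds rather than measured latencies.

```python
# Approximate, vendor-reported figures as quoted in the brief.
TOTAL_PARAMS_B = 8.0    # total parameters, in billions
ACTIVE_PARAMS_B = 1.0   # parameters active per token, in billions
DECODE_TOKENS_PER_SECOND = {
    "Apple M5 Max": 253.0,
    "Ryzen AI Max Plus": 146.0,
    "flagship smartphone": 30.0,
}

# Hypothetical reply length, chosen only for illustration.
REPLY_TOKENS = 500


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each generated token."""
    return active_b / total_b


def decode_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` at a steady decode rate (prefill ignored)."""
    return tokens / tokens_per_second


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share of parameters per token: {fraction:.1%}")

for device, rate in DECODE_TOKENS_PER_SECOND.items():
    seconds = decode_seconds(REPLY_TOKENS, rate)
    print(f"{device}: about {seconds:.1f} s for {REPLY_TOKENS} tokens")
```

The active share is the reason a mixture-of-experts model of this size can run on consumer hardware: only about one eighth of the weights take part in computing each token, although all of them still have to fit in memory.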
- 2
Groq raises 650 million dollars to fund inference cloud after Nvidia talent deal
Groq is raising 650 million dollars to refocus the company on its inference neocloud (the on-demand cloud platform powered by Groq's own AI chips) after a December 2025 arrangement in which Nvidia paid 20 billion dollars for senior Groq engineering talent and a hardware license. Existing investors are leading the round, with Disruptive and Infinitium committed to taking up any unsubscribed shares. Adam Winter is now interim CEO and Matt Eng interim CFO. The company argues that inference is currently a much bigger market than training, and that it is the right wedge for the smaller team that remains.
- 3
Mistral details industrial AI partnerships with Airbus, BMW, and ASML at Paris summit
At the AI Now Summit, Mistral named Airbus (commercial aircraft, helicopters, defense, and space), BMW Group (the carmaker's Large Industry Model initiative for multimodal reasoning on engineering data), and ASML (semiconductor parts design and surrogate modeling) as production customers for its new Mistral Physics AI line. The lab also confirmed a new 10 megawatt data center in Les Ulis south of Paris, dedicated to inference and scheduled to open in the third quarter of 2026, alongside an existing 40 megawatt Paris site and a planned Swedish facility — Mistral's answer to the inference supply-chain risk that has loomed over European AI customers.
- 4
Scott Wu: Devin commits 89 percent of Cognition's own code; agents augment, not replace, humans
Following Cognition's 1 billion dollar raise at a 26 billion dollar valuation announced May 27, CEO Scott Wu publicly reframed Devin's positioning as augmentation rather than replacement: "We've never thought about it as replacing humans. We are all programmers ourselves." Wu disclosed that 89 percent of Cognition's own engineering output is committed by Devin — with Windsurf, the acquired coding competitor, also contributing. He characterized Devin's current capability as somewhere between a junior and a mid-level engineer depending on task complexity, deliberately avoiding higher-level replacement claims.
- 5
Engineers push back on MCP — tool definitions eat 10.5 percent of context window
A widely shared engineering post from Quandri by backend engineer Chloe Kim argues that the Model Context Protocol is overrated for most developer workflows. Kim instrumented the team's stack of MCP servers and found that their tool definitions consumed roughly 21,000 tokens, about 10.5 percent of Claude's 200,000-token context window, even when most tools sat unused. A single Linear-issue lookup ran around 12,957 tokens via MCP versus 200 tokens via the direct command-line equivalent, a roughly 65-fold difference. Kim recommends a Skills-style on-demand loading pattern paired with command-line-first integrations, reserving MCP for services without CLIs or where shared team authentication is essential.
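The two headline numbers in this story follow directly from the token counts quoted above. A minimal sketch of that arithmetic is below. The constant and function names are illustrative assumptions, not taken from the Quandri post, and the token counts are the approximate figures as reported.

```python
# Token counts as quoted in the brief (approximate, as reported).
CONTEXT_WINDOW_TOKENS = 200_000       # Claude context window cited in the post
MCP_TOOL_DEFINITION_TOKENS = 21_000   # tool definitions loaded up front
MCP_LOOKUP_TOKENS = 12_957            # one Linear-issue lookup via MCP
CLI_LOOKUP_TOKENS = 200               # same lookup via a command-line tool


def context_share(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window taken up by `used_tokens`."""
    return used_tokens / window_tokens


def cost_ratio(expensive_tokens: int, cheap_tokens: int) -> float:
    """How many times more tokens the expensive path uses."""
    return expensive_tokens / cheap_tokens


overhead_share = context_share(MCP_TOOL_DEFINITION_TOKENS, CONTEXT_WINDOW_TOKENS)
lookup_ratio = cost_ratio(MCP_LOOKUP_TOKENS, CLI_LOOKUP_TOKENS)

print(f"Tool definitions use {overhead_share:.1%} of the context window")  # 10.5%
print(f"MCP lookup costs {lookup_ratio:.1f}x the CLI lookup")              # 64.8x
```

The exact ratio is about 64.8, which the post rounds to 65. The overhead share is a fixed cost paid in every session, while the lookup ratio is paid again on every call.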
Sources
- 1. AI Now Summit 2026 · Mistral AI · May 28, 2026
- 2. LFM2.5-8B-A1B · Liquid AI · May 28, 2026
- 3. Notes from the Mistral AI Now Summit · Koen van Gilst · May 29, 2026
- 4. Cognition's Scott Wu says AI coding agents shouldn't replace humans · TechCrunch · May 29, 2026
- 5. After Nvidia's $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M · TechCrunch · May 29, 2026
- 6. MCP is dead? · Quandri Engineering · May 29, 2026
This brief was published on May 30, 2026. Cited URLs above point to third-party publishers and may move, paywall, or be retired over time. If a link no longer resolves, original article titles are preserved so you can recover them via search; the canonical web edition at aiproplaybook.com/top-ai-stories/2026-05-30 may carry updated source URLs.