📋About Groq
Updated June 15, 2026
Groq is an AI inference hardware and cloud company founded in 2016 by Jonathan Ross, who designed Google's first TPU. The company's custom Language Processing Units (LPUs) deliver inference speeds 10 to 18 times faster than GPU-based alternatives, powering real-time conversational and agentic AI through the GroqCloud inference platform.
In December 2025, NVIDIA entered a non-exclusive licensing agreement for Groq's inference technology valued at approximately $20 billion. Jonathan Ross and the bulk of Groq's senior chip-engineering leadership moved to NVIDIA, while GroqCloud was explicitly excluded from the deal and continues to operate independently. At GTC 2026, NVIDIA unveiled the Groq 3 LPU built on the licensed intellectual property. The result is a clean separation: the chip lineage is now at NVIDIA, while the cloud service remains under Groq.
What remains of Groq is now run as an inference-cloud company. Interim CEO Adam Winter and interim CFO Matt Eng (both formerly senior Groq finance and operations leaders) have refocused the company's roadmap on GroqCloud and the on-demand inference market that sits beneath the application layer. The pitch is straightforward: inference is now a much larger market than training, and Groq's existing chip fleet, together with the LPU-licensing royalty stream from NVIDIA, gives the smaller team a credible wedge in the rapidly commoditizing inference-cloud category.
The company is raising a round of roughly $650 million to fund this pivot. Existing investors are leading, with Disruptive and Infinitium committed to filling any unsubscribed shares. Cumulative funding now exceeds $2 billion, building on the $750 million round closed in September 2025 at a $6.9 billion valuation.
GroqCloud serves more than 2 million registered developers (roughly 360,000 of them active monthly) and counts 75% of the Fortune 100 as account holders. The platform's headline workloads (real-time voice, agentic browser control, and low-latency function calling) are exactly the categories where tokens-per-second economics matter most, and Groq's LPU advantage on those workloads remains intact after the NVIDIA deal. The longer-term question is whether Groq can hold the customer-facing inference category as NVIDIA, AWS Bedrock, Cerebras, and Together AI all push their own inference services using the same or similar chip generations.
🛠️Products & Tools (1)
Ultra-fast AI inference platform powered by custom LPU chips. Fastest token generation speeds in the industry for real-time applications. API access to major open-source models.
