About Modular
Updated June 15, 2026
Modular is an AI-native developer platform company founded in January 2022 by Chris Lattner and Tim Davis. Lattner created the LLVM compiler infrastructure and the Swift programming language, and previously led Google's TPU and ML platform work. Modular is his bid to fix the long-standing fragmentation between the languages developers want to write (mostly Python) and the heterogeneous hardware those workloads ultimately run on (CPUs, NVIDIA GPUs, AMD GPUs, and custom AI accelerators). The company has raised over $130 million from General Catalyst, Greylock, GV, and Scale Venture Partners, and is headquartered in Los Altos, California.
Modular's two flagship products are the Mojo programming language and the MAX inference engine. Mojo is a Python superset designed from the ground up for AI workloads: it preserves Python syntax and ecosystem interop while adding compile-time metaprogramming, manual memory management when you need it, and direct access to GPU and accelerator programming primitives. The tagline "write like Python, run like C++" captures the design intent. Developers can incrementally optimize hot paths in their existing Python code without a rewrite, then target the same source across CPUs, GPUs, and ASICs without vendor lock-in. MAX is the inference engine that runs models written in (or compiled to) Mojo on diverse hardware.
On May 7, 2026, Modular shipped Mojo 1.0.0b1, the first beta of the language, alongside a public commitment to open-source the compiler later in 2026. The standard library is already open source on GitHub. Beta status is the threshold many enterprises wait for before adopting a new language toolchain in production, which makes the 1.0 beta a meaningful inflection point: Mojo is moving from research curiosity toward production candidate for AI developer tooling. The strategic bet is that, as AI workloads continue to dominate compute spend, a language layer that lets a single team target multiple hardware backends (avoiding CUDA-only or accelerator-specific lock-in) becomes increasingly valuable to both hyperscalers and enterprise AI platform teams.
