📍 Shanghai, China · Est. 2023
Private Company

StepFun

Shanghai-based open-weights AI lab building the Step series of multimodal foundation models.

📋 About StepFun

Updated June 15, 2026

StepFun (also known as Stepfun) is a Shanghai-based AI lab founded in 2023, focused on open-weights multimodal foundation models. The lab is best known for its Step model series — Step-1, Step-2, Step-Audio, and the current flagship Step 3.7 Flash — released under the Apache 2.0 license alongside hosted inference on the StepFun Open Platform, OpenRouter, and NVIDIA NIM.

StepFun's design philosophy emphasizes practical agentic deployment: a sparse mixture-of-experts architecture that keeps active parameters small relative to total parameters, native vision-language capabilities, and strong tool-use reliability for coding and search workflows. The lab targets the open-weights tier alongside DeepSeek, Moonshot AI's Kimi, and Liquid AI rather than competing with frontier US labs on raw capability ceiling.

To broaden reach, StepFun partners with hosted inference providers (OpenRouter, NVIDIA NIM, DeepInfra, Fireworks AI, and Modal), so customers can use Step models without self-hosting and can still download the open weights for full control. The lab's headline 2026 release, Step 3.7 Flash, is a mixture-of-experts vision-language model with 198 billion total parameters, of which roughly 11 billion are active per token. It has a 256,000-token context window and a reported throughput of up to 400 tokens per second.
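To put the Step 3.7 Flash figures in proportion, here is a minimal arithmetic sketch in Python. It uses only the numbers quoted above. The function names and the 2,000-token response length are illustrative assumptions, and the 400 tokens-per-second figure is a reported peak, so real generation times will likely be longer.

```python
# Figures quoted in the profile above (reported, not independently measured).
TOTAL_PARAMS = 198e9           # total parameters
ACTIVE_PARAMS = 11e9           # approximate active parameters per token
PEAK_TOKENS_PER_SECOND = 400   # reported upper bound on throughput


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active_params / total_params


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Best-case generation time at a constant throughput."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {frac:.1%}")

    # Hypothetical 2,000-token response, chosen only for illustration.
    best_case = seconds_to_generate(2_000, PEAK_TOKENS_PER_SECOND)
    print(f"Best-case time for 2,000 tokens: {best_case:.1f} s")
```

Roughly one parameter in eighteen is active for a given token, which is what "keeps active parameters small relative to total parameters" means in practice.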

🛠️Products & Tools (1)

Step 3.7 Flash · Open Source · Foundation Models & Open Source

StepFun's flagship mixture-of-experts vision-language model, with 198 billion total parameters, a 256,000-token context window, reported throughput of up to 400 tokens per second, and open weights under the Apache 2.0 license.
