Updated July 1, 2026

Etched Sohu

By Etched

Etched Sohu is an inference chip purpose-built for the transformer architecture: an application-specific integrated circuit (ASIC) that hard-wires transformer inference on TSMC's 4-nanometer process, with the goal of higher speed and lower cost than general-purpose GPUs. Etched has booked more than $1 billion in orders and is backed by Jane Street, Hudson River Trading, and angels including Andrej Karpathy and Geoffrey Hinton.

Learning Objectives

  • Understand what makes Sohu different from a general-purpose GPU
  • Explain why specializing a chip for the transformer architecture can improve inference speed and cost
  • Identify where Etched fits among AI inference challengers to Nvidia

What Is Etched Sohu?

Etched Sohu is an AI inference chip built by Etched, an American semiconductor startup founded in 2022 by Harvard dropouts Gavin Uberti, Robert Wachen, and Chris Zhu. Where most AI chips — including Nvidia's GPUs — are general-purpose processors that can run any kind of model, Sohu makes a deliberate bet in the opposite direction: it is designed to run one thing extremely well.

That one thing is the transformer, the neural-network architecture behind virtually every modern large language model, from GPT and Claude to Gemini and Llama. Sohu is an application-specific integrated circuit (ASIC) — a chip whose logic is hard-wired for a fixed task rather than programmable for many. Etched hard-wires the matrix-multiplication patterns specific to transformer inference directly into silicon, fabricated on TSMC's 4-nanometer process.
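To make "matrix-multiplication patterns" concrete, the sketch below lists the matrix multiplications in one textbook transformer layer and estimates their cost. It assumes a standard layer (multi-head self-attention followed by a two-layer feed-forward block). The names `layer_matmuls` and `matmul_flops`, the parameter names, and the 4x feed-forward width are illustrative assumptions, not details of Sohu's design.

```python
from typing import List, Tuple

# (name, rows m, inner dim k, cols p, times it runs per layer)
Matmul = Tuple[str, int, int, int, int]


def layer_matmuls(seq_len: int, d_model: int, n_heads: int,
                  ffn_mult: int = 4) -> List[Matmul]:
    """List the matmuls in one textbook transformer layer (hypothetical helper)."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    n, d = seq_len, d_model
    d_head = d // n_heads        # width of each attention head
    d_ff = ffn_mult * d          # assumed feed-forward hidden width
    return [
        ("qkv_projection", n, d, d, 3),               # Q, K and V projections
        ("attention_scores", n, d_head, n, n_heads),  # Q @ K^T, once per head
        ("attention_values", n, n, d_head, n_heads),  # weights @ V, once per head
        ("output_projection", n, d, d, 1),
        ("ffn_up", n, d, d_ff, 1),
        ("ffn_down", n, d_ff, d, 1),
    ]


def matmul_flops(ops: List[Matmul]) -> int:
    """Total FLOPs, counting an (m x k) @ (k x p) product as about 2*m*k*p."""
    return sum(2 * m * k * p * count for _, m, k, p, count in ops)


if __name__ == "__main__":
    for name, m, k, p, count in layer_matmuls(seq_len=8, d_model=16, n_heads=4):
        print(f"{name}: ({m} x {k}) @ ({k} x {p}), runs {count}x")
```

Every shape in the list follows from a handful of hyperparameters, so the same few operations repeat layer after layer. That regularity is plausibly what makes the workload a candidate for fixed-function hardware, though the sketch says nothing about how Sohu actually implements it.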

💡Key Concept

Specialization versus flexibility: A GPU is a Swiss-army knife — it can train and run any model, but it spends transistors and energy on that flexibility. An ASIC like Sohu is a single-purpose tool: it can only run transformers, but because it does nothing else, far more of the chip is dedicated to the actual work. The bet is that the transformer has won decisively enough that specializing for it is worth giving up the flexibility.

Why a Transformer-Only Chip?

The economics of AI have shifted. Training a frontier model is largely an up-front cost paid once per model; inference (actually running the model to answer queries) happens billions of times and now dominates the ongoing cost of operating AI at scale. That makes inference efficiency the battleground.
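The arithmetic behind that shift can be sketched in a few lines of Python. The function names and every number passed to them are hypothetical placeholders chosen for illustration; they are not figures from Etched or from any model provider.

```python
import math


def breakeven_queries(training_cost_usd: float, cost_per_query_usd: float) -> int:
    """Smallest query count at which cumulative inference spend reaches training cost."""
    if training_cost_usd < 0 or cost_per_query_usd <= 0:
        raise ValueError("training cost must be >= 0 and per-query cost must be > 0")
    return math.ceil(training_cost_usd / cost_per_query_usd)


def inference_share(training_cost_usd: float, cost_per_query_usd: float,
                    queries: int) -> float:
    """Fraction of total spend (training plus inference) that goes to inference."""
    inference_cost = cost_per_query_usd * queries
    return inference_cost / (training_cost_usd + inference_cost)


if __name__ == "__main__":
    # Placeholder inputs only: a made-up training bill and per-query cost.
    print(breakeven_queries(training_cost_usd=100.0, cost_per_query_usd=0.5))
    print(inference_share(100.0, 0.5, queries=2_000))
```

Because training cost is fixed while inference cost grows with every query, the inference share approaches 100% as usage grows, whatever the starting numbers are.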

Etched's thesis is that once an architecture becomes as dominant as the transformer, the industry can afford to bake it into hardware. By removing the general-purpose overhead of a GPU, Sohu aims to deliver substantially more throughput per dollar and per watt on transformer inference specifically. Etched sells both the chips and full frontier inference clusters — turnkey systems that pair Sohu chips with custom racks and software so a customer can deploy inference capacity without assembling it piece by piece.
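"Throughput per dollar and per watt" can be written down as two simple ratios. The Python sketch below is one way to compare accelerators on those terms; the `Accelerator` class, its field names, and the sample values are assumptions made up for illustration and are not measurements of Sohu or of any GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical spec sheet; all fields are illustrative."""
    name: str
    tokens_per_second: float
    price_usd: float
    power_watts: float

    def tokens_per_second_per_dollar(self) -> float:
        # Throughput bought per dollar of hardware.
        return self.tokens_per_second / self.price_usd

    def tokens_per_joule(self) -> float:
        # A watt is one joule per second, so (tokens/s) / (J/s) = tokens/J.
        return self.tokens_per_second / self.power_watts


def relative_advantage(a: Accelerator, b: Accelerator) -> dict:
    """How many times better `a` is than `b` on each efficiency ratio."""
    return {
        "per_dollar": a.tokens_per_second_per_dollar() / b.tokens_per_second_per_dollar(),
        "per_watt": a.tokens_per_joule() / b.tokens_per_joule(),
    }


if __name__ == "__main__":
    # Placeholder values, not real benchmark data.
    chip_a = Accelerator("chip_a", tokens_per_second=4000.0, price_usd=20000.0, power_watts=500.0)
    chip_b = Accelerator("chip_b", tokens_per_second=1000.0, price_usd=20000.0, power_watts=500.0)
    print(relative_advantage(chip_a, chip_b))
```

Comparing ratios rather than raw throughput matters because a faster chip that costs or draws proportionally more delivers no efficiency gain.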

⚠️Warning

The risk of specialization: A transformer-only chip is a bet on the transformer staying dominant. If a fundamentally different architecture displaces it, a general-purpose GPU can adapt where a hard-wired ASIC cannot. Etched is wagering that the transformer's lead is durable enough to make that risk worth taking.

Traction and Backing

Etched has booked more than $1 billion in orders for its inference systems and is one of the most closely watched challengers in AI hardware. Its investors include quantitative-trading firms Jane Street, Hudson River Trading, and Two Sigma, along with Ribbit Capital, and its angel roster reads like a who's-who of AI: Andrej Karpathy, Geoffrey Hinton, and Peter Thiel among them.

It competes with Nvidia's inference GPUs as well as other specialized challengers such as Cerebras and Groq, and with the custom in-house chips being built by Amazon, Google, Microsoft, and OpenAI.

Pricing

Sohu chips: Custom / enterprise
  • Direct hardware purchase
  • Volume-based pricing
Frontier inference clusters: Custom / enterprise
  • Turnkey chip, rack, and software systems
  • Deployment and integration support

Etched sells to enterprises and AI infrastructure operators; there is no self-serve or consumer pricing. Access is arranged directly through the company.

  • Cerebras Inference — wafer-scale inference challenger with a different specialization approach
  • Groq Cloud — low-latency inference on a purpose-built LPU

Strengths

  • Purpose-built for the dominant workload — specializing for the transformer targets exactly where inference spending concentrates
  • Throughput and cost focus — the design goal is more transformer inference per dollar and per watt than a general-purpose GPU
  • Turnkey systems — frontier inference clusters let customers buy deployable capacity, not just chips
  • Strong backing and real demand — more than $1 billion booked and a top-tier investor and angel roster

Limitations and Considerations

  • Transformer-only — Sohu cannot run non-transformer models, so it is a bet on the architecture's continued dominance
  • Enterprise-only — no self-serve access; relevant to infrastructure operators, not individual developers
  • Young company — Etched is scaling from orders to at-volume delivery, and execution risk remains

Key Takeaways

  • Etched Sohu is an inference ASIC purpose-built for the transformer architecture, fabricated on TSMC's 4-nanometer process
  • Its thesis is that inference now dominates AI costs, so specializing hardware for the dominant architecture beats general-purpose flexibility
  • Etched has booked more than $1 billion in orders and is backed by Jane Street, Hudson River Trading, and angels including Andrej Karpathy and Geoffrey Hinton
  • The trade-off is flexibility: a transformer-only chip wins only as long as the transformer stays dominant
