Name: Etched Sohu
Author: Etched

Learning Objectives

Understand what makes Sohu different from a general-purpose GPU
Explain why specializing a chip for the transformer architecture can improve inference speed and cost
Identify where Etched fits among AI inference challengers to Nvidia

What Is Etched Sohu?

Etched Sohu is an AI inference chip built by Etched, an American semiconductor startup founded in 2022 by Harvard dropouts Gavin Uberti, Robert Wachen, and Chris Zhu. Where most AI chips — including Nvidia's GPUs — are general-purpose processors that can run any kind of model, Sohu makes a deliberate bet in the opposite direction: it is designed to run one thing extremely well.

That one thing is the transformer, the neural-network architecture behind virtually every modern large language model, from GPT and Claude to Gemini and Llama. Sohu is an application-specific integrated circuit (ASIC) — a chip whose logic is hard-wired for a fixed task rather than programmable for many. Etched hard-wires the matrix-multiplication patterns specific to transformer inference directly into silicon, fabricated on TSMC's 4-nanometer process.
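To make "matrix-multiplication patterns" concrete, here is a minimal sketch of scaled dot-product attention, the core operation of a transformer layer, written in plain Python. It is the textbook formula, not a description of how Sohu implements it; the function names and the tiny 2x2 inputs are illustrative assumptions. The point is that the work reduces to two matrix multiplications with a softmax in between, a fixed and predictable pattern.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])                                # width of each query/key vector
    k_t = [list(col) for col in zip(*k)]         # transpose K
    scores = matmul(q, k_t)                      # matrix multiplication 1: Q K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                    # matrix multiplication 2: weights V

# Tiny illustrative inputs (assumed values, two tokens with two features each).
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
print(out)
```

Each output row is a weighted average of the rows of v, so the first token's output sits closer to the first value row and the second token's output sits closer to the second.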

💡Key Concept

Specialization versus flexibility: A GPU is a Swiss-army knife — it can train and run any model, but it spends transistors and energy on that flexibility. An ASIC like Sohu is a single-purpose tool: it can only run transformers, but because it does nothing else, far more of the chip is dedicated to the actual work. The bet is that the transformer has won decisively enough that specializing for it is worth giving up the flexibility.

Why a Transformer-Only Chip?

The economics of AI have shifted. Training a frontier model is largely a one-time cost for each model. Inference, the work of actually running the model to answer queries, happens billions of times and now dominates the ongoing cost of operating AI at scale. That makes inference efficiency the battleground.
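The arithmetic behind that claim can be sketched in a few lines. Every number below is a hypothetical placeholder chosen to keep the math easy to follow, not a figure for any real model or vendor; the function names are likewise illustrative.

```python
def break_even_queries(training_cost, cost_per_query):
    """Queries at which cumulative inference cost equals the training cost."""
    return training_cost / cost_per_query

def inference_share(training_cost, cost_per_query, queries):
    """Fraction of total lifetime cost attributable to inference."""
    inference_cost = cost_per_query * queries
    return inference_cost / (training_cost + inference_cost)

# Hypothetical figures (assumptions, not real data).
TRAINING_COST = 100_000_000.0   # dollars, paid once
COST_PER_QUERY = 0.01           # dollars per query served

crossover = break_even_queries(TRAINING_COST, COST_PER_QUERY)
share_at_50b = inference_share(TRAINING_COST, COST_PER_QUERY, 50_000_000_000)

print(f"Inference spend matches training after about {crossover:,.0f} queries")
print(f"Inference share of total cost at 50 billion queries: {share_at_50b:.0%}")
```

Under these assumed inputs, inference spending catches up with the training bill at roughly 10 billion queries, and by 50 billion queries it accounts for about five-sixths of the total. The training cost stays fixed while the inference cost keeps growing with usage.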

Etched's thesis is that once an architecture becomes as dominant as the transformer, the industry can afford to bake it into hardware. By removing the general-purpose overhead of a GPU, Sohu aims to deliver substantially more throughput per dollar and per watt on transformer inference specifically. Etched sells both the chips and full frontier inference clusters — turnkey systems that pair Sohu chips with custom racks and software so a customer can deploy inference capacity without assembling it piece by piece.
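"Throughput per dollar and per watt" can be expressed as simple ratios. The sketch below compares two made-up accelerators; the class, its fields, and all of the numbers are assumptions for illustration and do not represent Sohu, any Nvidia GPU, or any published benchmark.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600

@dataclass(frozen=True)
class Accelerator:
    name: str
    tokens_per_second: float   # sustained inference throughput
    price_usd: float           # hardware purchase price
    power_watts: float         # power draw under load

    def throughput_per_dollar(self) -> float:
        """Tokens per second delivered per dollar of hardware."""
        return self.tokens_per_second / self.price_usd

    def throughput_per_watt(self) -> float:
        """Tokens per second delivered per watt drawn."""
        return self.tokens_per_second / self.power_watts

    def energy_cost_per_million_tokens(self, usd_per_kwh: float) -> float:
        """Electricity cost of generating one million tokens."""
        seconds = 1_000_000 / self.tokens_per_second
        kwh = self.power_watts * seconds / SECONDS_PER_HOUR / 1000
        return kwh * usd_per_kwh

# Hypothetical devices: same price and power, different throughput.
general = Accelerator("general-purpose (hypothetical)", 10_000, 30_000, 700)
special = Accelerator("specialized (hypothetical)", 50_000, 30_000, 700)

for chip in (general, special):
    print(chip.name,
          round(chip.throughput_per_dollar(), 3),
          round(chip.throughput_per_watt(), 2),
          round(chip.energy_cost_per_million_tokens(0.10), 5))
```

With price and power held equal, any throughput gain flows straight into both ratios, which is the kind of advantage a specialized design is meant to deliver. Real comparisons would also need to account for utilization, cooling, networking, and software costs, which this sketch leaves out.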

⚠️Warning

The risk of specialization: A transformer-only chip is a bet on the transformer staying dominant. If a fundamentally different architecture displaces it, a general-purpose GPU can adapt where a hard-wired ASIC cannot. Etched is wagering that the transformer's lead is durable enough to make that risk worth taking.

Traction and Backing

Etched has booked more than $1 billion in orders for its inference systems and is one of the most closely watched challengers in AI hardware. Its investors include quantitative-trading firms Jane Street, Hudson River Trading, and Two Sigma, along with Ribbit Capital, and its angel roster reads like a who's-who of AI: Andrej Karpathy, Geoffrey Hinton, and Peter Thiel among them.

It competes with Nvidia's inference GPUs as well as other specialized challengers such as Cerebras and Groq, and with the custom in-house chips being built by Amazon, Google, Microsoft, and OpenAI.

Pricing

Plan	Price	Features
Sohu chips	Custom / enterprise	Direct hardware purchase; volume-based pricing
Frontier inference clusters	Custom / enterprise	Turnkey chip, rack, and software systems; deployment and integration support

Etched sells to enterprises and AI infrastructure operators; there is no self-serve or consumer pricing. Access is arranged directly through the company.

Related Tools

Cerebras Inference — wafer-scale inference challenger with a different specialization approach
Groq Cloud — low-latency inference on a purpose-built LPU

Strengths

Purpose-built for the dominant workload — specializing for the transformer targets exactly where inference spending concentrates
Throughput and cost focus — the design goal is more transformer inference per dollar and per watt than a general-purpose GPU
Turnkey systems — frontier inference clusters let customers buy deployable capacity, not just chips
Strong backing and real demand — more than $1 billion booked and a top-tier investor and angel roster

Limitations and Considerations

Transformer-only — Sohu cannot run non-transformer models, so it is a bet on the architecture's continued dominance
Enterprise-only — no self-serve access; relevant to infrastructure operators, not individual developers
Young company — Etched is scaling from orders to at-volume delivery, and execution risk remains

Key Takeaways

Etched Sohu is an inference ASIC purpose-built for the transformer architecture, fabricated on TSMC's 4-nanometer process
Its thesis is that inference now dominates AI costs, so specializing hardware for the dominant architecture beats general-purpose flexibility
Etched has booked more than $1 billion in orders and is backed by Jane Street, Hudson River Trading, and angels including Andrej Karpathy and Geoffrey Hinton
The trade-off is flexibility: a transformer-only chip wins only as long as the transformer stays dominant
