Learning Objectives
- Understand what makes Sohu different from a general-purpose GPU
- Explain why specializing a chip for the transformer architecture can improve inference speed and cost
- Identify where Etched fits among AI inference challengers to Nvidia
What Is Etched Sohu?
Etched Sohu is an AI inference chip built by Etched, an American semiconductor startup founded in 2022 by Harvard dropouts Gavin Uberti, Robert Wachen, and Chris Zhu. Where most AI accelerators, including Nvidia's GPUs, are programmable processors that can run many kinds of models, Sohu makes a deliberate bet in the opposite direction: it is designed to run one thing extremely well.
That one thing is the transformer, the neural-network architecture behind virtually every modern large language model, from GPT and Claude to Gemini and Llama. Sohu is an application-specific integrated circuit (ASIC) — a chip whose logic is hard-wired for a fixed task rather than programmable for many. Etched hard-wires the matrix-multiplication patterns specific to transformer inference directly into silicon, fabricated on TSMC's 4-nanometer process.
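To make "matrix-multiplication patterns" concrete, here is a minimal pure-Python sketch of single-head scaled dot-product attention, the core operation of a transformer. It is illustrative only and says nothing about how Sohu is actually built; the function names are invented for this example, and masking, multiple heads, and batching are left out. The heavy lifting reduces to two matrix multiplications with a softmax in between, which is the kind of fixed pattern hardware can be specialized for.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Single-head scaled dot-product attention (no masking, no batching)."""
    d = len(q[0])                                  # width of each query/key vector
    k_t = [list(col) for col in zip(*k)]           # transpose K
    scores = matmul(q, k_t)                        # matmul 1: Q @ K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                      # matmul 2: weights @ V

# One query attending over two positions with identical keys, so the
# attention weights are uniform and the output is the average of the V rows.
print(attention([[1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[2.0, 4.0], [6.0, 8.0]]))
```

In a real model the same two multiplications, plus the projection and feed-forward multiplications around them, repeat for every layer and every generated token, which is why the shape of the computation is so predictable.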
💡Key Concept
Specialization versus flexibility: A GPU is a Swiss Army knife: it can train and run any model, but it spends transistors and energy on that flexibility. An ASIC like Sohu is a single-purpose tool: it can run only transformers, but because it does nothing else, far more of the chip is dedicated to the actual work. The bet is that the transformer has won decisively enough that specializing for it is worth giving up the flexibility.
Why a Transformer-Only Chip?
The economics of AI have shifted. Training a frontier model is largely an up-front cost paid once per model; inference, actually running the model to answer queries, recurs with every request and, at scale, comes to dominate the ongoing cost of operating AI. That makes inference efficiency the battleground.
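A rough back-of-the-envelope sketch shows how quickly a recurring inference bill can catch up with a one-time training bill. Every number below is hypothetical and chosen only to keep the arithmetic easy to follow; none comes from Etched or any real deployment, and the function name is made up for this example.

```python
def days_until_inference_matches_training(training_cost_usd,
                                          cost_per_1k_queries_usd,
                                          queries_per_day):
    """Days of serving after which cumulative inference spend equals the training bill."""
    daily_inference_cost = queries_per_day / 1000 * cost_per_1k_queries_usd
    return training_cost_usd / daily_inference_cost

# Hypothetical inputs: a $100M training run, $2 per thousand queries,
# and one billion queries served per day.
days = days_until_inference_matches_training(
    training_cost_usd=100_000_000,
    cost_per_1k_queries_usd=2.0,
    queries_per_day=1_000_000_000,
)
print(f"Inference spend matches training spend after {days:.0f} days")
```

Under these assumed figures the service spends $2 million a day on inference, so the training cost is matched in 50 days and every day after that adds to the inference side only.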
Etched's thesis is that once an architecture becomes as dominant as the transformer, the industry can afford to bake it into hardware. By removing the general-purpose overhead of a GPU, Sohu aims to deliver substantially more throughput per dollar and per watt on transformer inference specifically. Etched sells both the chips and full frontier inference clusters — turnkey systems that pair Sohu chips with custom racks and software so a customer can deploy inference capacity without assembling it piece by piece.
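"Throughput per dollar and per watt" can be computed directly once three figures are known for a chip: tokens per second, purchase price, and power draw. The sketch below compares two accelerators using placeholder numbers; they are hypothetical, not measured or published figures for Sohu or any GPU, and the `Accelerator` class is invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    tokens_per_second: float
    price_usd: float
    power_watts: float

    def throughput_per_dollar(self) -> float:
        """Tokens per second delivered for each dollar of purchase price."""
        return self.tokens_per_second / self.price_usd

    def tokens_per_joule(self) -> float:
        """Watts are joules per second, so (tokens/s) / (J/s) gives tokens per joule."""
        return self.tokens_per_second / self.power_watts

# Hypothetical figures for illustration only.
general = Accelerator("general-purpose (hypothetical)", 1_000.0, 20_000.0, 500.0)
specialized = Accelerator("specialized (hypothetical)", 4_000.0, 40_000.0, 1_000.0)

for chip in (general, specialized):
    print(f"{chip.name}: {chip.throughput_per_dollar():.3f} tokens/s per dollar, "
          f"{chip.tokens_per_joule():.1f} tokens per joule")
```

In this made-up comparison the specialized chip costs and draws twice as much but delivers four times the throughput, so it comes out ahead on both ratios. A real comparison would also need to account for utilization, cooling, networking, and software costs.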
⚠️Warning
The risk of specialization: A transformer-only chip is a bet on the transformer staying dominant. If a fundamentally different architecture displaces it, a general-purpose GPU can adapt where a hard-wired ASIC cannot. Etched is wagering that the transformer's lead is durable enough to make that risk worth taking.
Traction and Backing
Etched has booked more than $1 billion in orders for its inference systems and is one of the most closely watched challengers in AI hardware. Its investors include quantitative-trading firms Jane Street, Hudson River Trading, and Two Sigma, along with Ribbit Capital, and its angel investors include some of the best-known names in AI and technology, Andrej Karpathy, Geoffrey Hinton, and Peter Thiel among them.
It competes with Nvidia's inference GPUs as well as other specialized challengers such as Cerebras and Groq, and with the custom in-house chips being built by Amazon, Google, Microsoft, and OpenAI.
Pricing
- Direct hardware purchase
- Volume-based pricing
- Turnkey chip, rack, and software systems
- Deployment and integration support
Etched sells to enterprises and AI infrastructure operators; there is no self-serve or consumer pricing. Access is arranged directly through the company.
Related Tools
- Cerebras Inference — wafer-scale inference challenger with a different specialization approach
- Groq Cloud — low-latency inference on a purpose-built LPU
Strengths
- Purpose-built for the dominant workload — specializing for the transformer targets exactly where inference spending concentrates
- Throughput and cost focus — the design goal is more transformer inference per dollar and per watt than a general-purpose GPU
- Turnkey systems — frontier inference clusters let customers buy deployable capacity, not just chips
- Strong backing and real demand — more than $1 billion booked and a top-tier investor and angel roster
Limitations and Considerations
- Transformer-only — Sohu cannot run non-transformer models, so it is a bet on the architecture's continued dominance
- Enterprise-only — no self-serve access; relevant to infrastructure operators, not individual developers
- Young company — Etched is scaling from orders to at-volume delivery, and execution risk remains
Key Takeaways
- Etched Sohu is an inference ASIC purpose-built for the transformer architecture, fabricated on TSMC's 4-nanometer process
- Its thesis is that inference now dominates AI costs, so specializing hardware for the dominant architecture beats general-purpose flexibility
- Etched has booked more than $1 billion in orders and is backed by Jane Street, Hudson River Trading, and angels including Andrej Karpathy and Geoffrey Hinton
- The trade-off is flexibility: a transformer-only chip wins only as long as the transformer stays dominant