Learning Objectives

Understand where AMD Instinct fits in the datacenter AI accelerator market and how it compares to NVIDIA's H200 and Blackwell generation
Identify the MI355X's headline specs and the newly launched MI400 family, including the Helios rack-scale system and the memory advantage that drives AMD's competitive pitch
Evaluate the major 2025-2026 customer wins (Oracle, Microsoft Azure, OpenAI, Meta, Anthropic) and what they say about AMD's structural position

What Is AMD Instinct?

AMD Instinct is AMD's datacenter AI accelerator line, the company's direct response to NVIDIA's H100, H200, and Blackwell GPUs. It is the only at-scale alternative to NVIDIA shipping today in the datacenter AI training and inference market, and Instinct deployments carry most of AMD's case for being the credible number-two AI silicon vendor.

The current shipping flagship is the MI355X (4th-generation CDNA architecture, generally available Q3 2025). First unveiled at CES 2026 and formally launched at AMD's Advancing AI event in July 2026, the MI400 series — the MI430X, MI440X, and MI455X — is built on a new architecture and aimed squarely at NVIDIA's Vera Rubin generation. Its centerpiece is Helios, a full rack-scale system built around the MI455X that begins shipping in the second half of 2026; until Helios reaches customers, MI355X is what hyperscalers actually buy.

💡Key Concept

CDNA versus RDNA: AMD splits its GPU architectures into two families. CDNA (Compute DNA) is the datacenter-only AI training and inference architecture used in Instinct. RDNA (Radeon DNA) is the gaming and graphics architecture used in Radeon consumer GPUs. CDNA strips out the graphics-only logic and adds matrix-engine acceleration, larger HBM stacks, and Infinity Fabric scaling — all the pieces datacenter AI workloads need.

MI355X — Current Shipping Flagship

The MI355X is built on a 3nm process and ships with 288 GB of HBM3e memory — about 50% more capacity than NVIDIA's Blackwell B200 (192 GB) and the headline number AMD leads with in customer pitches. Memory capacity matters because the largest models (frontier 400-billion-plus parameter dense models, 1-trillion-plus MoE models) routinely overflow per-GPU memory on H100 and H200, forcing slower model-parallel splits.

Spec	MI355X	NVIDIA B200
Architecture	CDNA 4 (3nm)	Blackwell
HBM memory	288 GB HBM3e	192 GB HBM3e
Memory bandwidth	8 TB/sec	8 TB/sec
FP16 throughput	2.3 petaFLOPS	~5 petaFLOPS
FP8 throughput	4.6 petaFLOPS	~10 petaFLOPS
FP4 throughput	9.2 petaFLOPS	~20 petaFLOPS
GA timing	Q3 2025	H1 2025

NVIDIA still leads on raw throughput at every precision tier; AMD's pitch is the memory ceiling plus aggressive pricing on rack-scale deployments.
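
To make the memory-ceiling argument concrete, the sketch below estimates how many accelerators are needed just to hold a model's weights. It is a back-of-the-envelope calculation under loud assumptions: weights only (no KV cache, activations, or optimizer state), all HBM treated as usable, and decimal gigabytes. The function names and the 400-billion-parameter FP16 example are illustrative and do not come from AMD or NVIDIA material.

```python
import math

# Bytes per parameter for common precisions (assumption: no quantization overhead).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Weights-only footprint in decimal GB (1e9 params * bytes/param = GB)."""
    return params_billion * BYTES_PER_PARAM[precision]


def gpus_for_weights(params_billion: float, precision: str, hbm_gb: float) -> int:
    """Minimum GPU count whose combined HBM holds the weights.

    Assumes 100% of HBM is usable, which real deployments never achieve.
    """
    return math.ceil(weight_footprint_gb(params_billion, precision) / hbm_gb)


if __name__ == "__main__":
    # HBM capacities are the per-GPU figures from the spec table.
    for name, hbm_gb in [("MI355X", 288), ("B200", 192)]:
        count = gpus_for_weights(400, "fp16", hbm_gb)
        print(f"{name}: {count} GPUs for a 400B-parameter model in FP16")
```

Under these assumptions the 800 GB of FP16 weights fit in three 288 GB GPUs but need five 192 GB GPUs. Real deployments need additional headroom for the KV cache and runtime buffers, so actual counts will be higher on both sides.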

MI400 Series and the Helios Rack

First previewed at CES 2026 and formally launched at AMD's Advancing AI event in July 2026, the MI400 family comprises three datacenter SKUs led by the MI455X, for which AMD claims 432 GB of HBM4, 320 billion transistors, and up to 40 petaFLOPS of FP4 per GPU, roughly 4 times the platform peak of the MI300X family.

The launch's real headline, though, is Helios — AMD's first full rack-scale system and its most direct answer yet to NVIDIA's rack-scale dominance. A single double-wide Helios rack packs 72 MI455X GPUs, 18 sixth-generation EPYC "Venice" processors, and AMD Pensando networking into one unit, delivering up to 2.9 exaflops of FP4 inference and 1.4 exaflops of FP8 training performance. Rather than sell only individual accelerators, AMD is now shipping the whole rack — the same way NVIDIA sells its GB-class systems — so frontier operators can buy training-and-inference capacity by the rack.
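
The rack-level figures follow almost directly from the per-GPU claims. The sketch below multiplies AMD's quoted MI455X numbers by 72 as a consistency check. It assumes perfect linear scaling with no interconnect or utilization losses, and the aggregate HBM and per-GPU FP8 values are derived here rather than quoted from AMD.

```python
GPUS_PER_RACK = 72           # from the Helios description
FP4_PFLOPS_PER_GPU = 40.0    # AMD's "up to" MI455X figure
HBM4_GB_PER_GPU = 432.0      # AMD's MI455X figure
RACK_FP8_EXAFLOPS = 1.4      # AMD's rack-level FP8 training figure


def rack_fp4_exaflops(gpus: int = GPUS_PER_RACK,
                      pflops_per_gpu: float = FP4_PFLOPS_PER_GPU) -> float:
    """Aggregate FP4 throughput in exaflops, assuming ideal linear scaling."""
    return gpus * pflops_per_gpu / 1000


def rack_hbm_tb(gpus: int = GPUS_PER_RACK,
                gb_per_gpu: float = HBM4_GB_PER_GPU) -> float:
    """Aggregate HBM capacity in decimal TB (derived, not an AMD-quoted value)."""
    return gpus * gb_per_gpu / 1000


def implied_fp8_pflops_per_gpu(rack_exaflops: float = RACK_FP8_EXAFLOPS,
                               gpus: int = GPUS_PER_RACK) -> float:
    """Per-GPU FP8 throughput implied by the rack-level figure."""
    return rack_exaflops * 1000 / gpus


if __name__ == "__main__":
    print(f"FP4 per rack: {rack_fp4_exaflops():.2f} exaflops")
    print(f"HBM4 per rack: {rack_hbm_tb():.1f} TB")
    print(f"Implied FP8 per GPU: {implied_fp8_pflops_per_gpu():.1f} petaFLOPS")
```

The FP4 product comes to 2.88 exaflops, which matches the quoted "up to 2.9" after rounding, and the FP8 rack figure implies a little under 20 petaFLOPS per GPU, about half the FP4 rate.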

⚠️Warning

Launched, but not yet independently benchmarked. Helios and the MI455X have been formally launched, but the headline exaflops and memory numbers still come from AMD's own launch materials, and volume shipments do not begin until the second half of 2026. Treat the specs as vendor figures until Helios racks are measured in third-party hands.

Major 2025-2026 Customer Wins

The deployments below validate AMD as a serious second source rather than a bench warmer:

Oracle Cloud Infrastructure — General availability on MI355X via OCI, with single clusters of 130,000-plus MI355X GPUs (announced as the world's largest single-cluster Instinct deployment). Oracle has further committed to deploying 50,000 MI450 GPUs beginning Q3 2026.
Microsoft Azure — MI300X is already in production for select Azure AI inference workloads, and at the July 2026 launch Microsoft committed to deploy Helios on Azure to power frontier-model inference, with MI455X virtual machines arriving late 2026.
Anthropic — Named as a Helios launch customer, with plans to install up to two gigawatts of MI455X GPUs — one of the largest single commitments to AMD's rack-scale platform.
OpenAI — A signed multi-year supply deal in October 2025 for 6 gigawatts of AMD AI compute, with the first 1-gigawatt MI450 datacenter starting deployment in 2026.
Meta — Public commitment to MI350-class deployments for Llama-family training and inference, and named among the labs preparing Helios deployments.
xAI — Named as a Helios architecture customer for the MI400 era.

📝Note

The OpenAI deal is signed, not yet deployed. The 6-gigawatt headline number is the multi-year contracted capacity; actual silicon comes online in tranches starting 2026.

ROCm — The Software Side

Hardware does not run AI on its own. AMD's open software stack ROCm is what lets PyTorch, vLLM, and the broader open-source LLM ecosystem actually run on Instinct. The vLLM CI pass rate on Instinct jumped from 37 percent in November 2025 to 93 percent by January 2026 — the most-cited proof point that ROCm has closed the gap on NVIDIA's CUDA on the LLM-inference path. Strong ROCm availability is what makes Instinct a credible NVIDIA alternative rather than just a memory-ceiling differentiator.
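
As a small illustration of what "PyTorch runs on Instinct" means in practice: ROCm builds of PyTorch reuse the `torch.cuda` device API and report a HIP version in `torch.version.hip`, which is `None` on CUDA builds. The sketch below classifies the installed build on that basis. `classify_backend` and `detect_backend` are hypothetical helper names, and the snippet returns a placeholder string if PyTorch is not installed.

```python
from typing import Optional


def classify_backend(hip_version: Optional[str],
                     cuda_version: Optional[str]) -> str:
    """Map PyTorch's build-time version strings to a backend label."""
    if hip_version:
        return "rocm"   # ROCm/HIP build, the kind that targets Instinct GPUs
    if cuda_version:
        return "cuda"   # NVIDIA CUDA build
    return "cpu"        # neither toolkit compiled in


def detect_backend() -> str:
    """Inspect the locally installed PyTorch build, if there is one."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"
    return classify_backend(getattr(torch.version, "hip", None),
                            getattr(torch.version, "cuda", None))


if __name__ == "__main__":
    print("PyTorch build:", detect_backend())
```

Because ROCm builds keep the `torch.cuda` namespace, most model code that selects `device="cuda"` runs unchanged on Instinct. The porting effort concentrates in custom kernels and vendor-specific libraries.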

Pricing

Plan	Price	Features
Cloud (OCI, Azure, others)	Per-GPU-hour	Pay-as-you-go via hyperscaler; MI300X / MI355X; no upfront commitment
Direct purchase	Enterprise quote	Volume server OEM channel; Dell, HPE, Supermicro, Lenovo; multi-year support
Helios rack-scale	Enterprise quote	72 MI455X GPUs per rack via Helios; shipping H2 2026; for frontier-model operators

AMD does not publish list prices for Instinct accelerators; pricing is set per deal. Competitive pressure on NVIDIA pricing is widely reported as the reason hyperscalers add AMD as a second source.

Strengths

Memory capacity headline — 288 GB on MI355X versus 192 GB on B200; 432 GB on the newly launched MI455X. Frontier and trillion-parameter MoE models fit in fewer GPUs.
The memory pitch has independent third-party evidence: in July 2026 the inference company Wafer benchmarked Moonshot's 2.8-trillion-parameter Kimi K3 and found the MI355X served 952 tokens per second per node against 1,568 on NVIDIA's B300. AMD loses on raw throughput but is roughly 2.4 times cheaper per GPU, which inverts the result on the metric buyers actually budget against: 48 tokens per second per dollar versus 33. The 288 GB per GPU is what does the work, letting the model fit in a single node where the NVIDIA configuration needs two.
Rack-scale system, not just chips — Helios lets AMD sell a complete 72-GPU rack against NVIDIA's rack-scale platforms, rather than competing accelerator-by-accelerator.
Real hyperscaler footprint — OCI, Azure, OpenAI, Meta, xAI, and now Anthropic deployments are all public and large.
Open software stack (ROCm) — Permissive licensing, native consumer-GPU support, and rapidly improving framework parity with CUDA on the inference path.
Cross-stack vendor leverage — Customers running EPYC plus Instinct plus Pensando NICs get a single-vendor hardware stack and matching support contracts.
ACE standards play — AMD co-authored the new x86 AI Compute Extensions (ACE) standard with Intel in April 2026, signaling cross-vendor cooperation on the CPU side that complements Instinct's GPU position.
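
The per-dollar inversion in the Wafer benchmark item above can be checked from the quoted numbers alone. The sketch below derives the cost ratio implied by the throughput and tokens-per-second-per-dollar figures. The lesson does not state what the dollar denominator covers (purchase price, hourly rate, or something else), so the code treats it as an opaque cost unit. Function and variable names are illustrative.

```python
def implied_cost(tokens_per_sec: float, tokens_per_sec_per_dollar: float) -> float:
    """Cost (in the benchmark's unspecified dollar unit) implied by the two figures."""
    return tokens_per_sec / tokens_per_sec_per_dollar


# Figures quoted in the Wafer benchmark item.
AMD_TPS, AMD_TPS_PER_DOLLAR = 952, 48
NVIDIA_TPS, NVIDIA_TPS_PER_DOLLAR = 1568, 33

amd_cost = implied_cost(AMD_TPS, AMD_TPS_PER_DOLLAR)
nvidia_cost = implied_cost(NVIDIA_TPS, NVIDIA_TPS_PER_DOLLAR)

throughput_ratio = NVIDIA_TPS / AMD_TPS                  # NVIDIA's raw-speed lead
value_ratio = AMD_TPS_PER_DOLLAR / NVIDIA_TPS_PER_DOLLAR  # AMD's per-dollar lead
price_ratio = nvidia_cost / amd_cost                      # implied cost gap

if __name__ == "__main__":
    print(f"NVIDIA throughput lead: {throughput_ratio:.2f}x")
    print(f"AMD per-dollar lead:    {value_ratio:.2f}x")
    print(f"Implied cost ratio:     {price_ratio:.2f}x")
```

The implied cost ratio works out to about 2.4, consistent with the "roughly 2.4 times cheaper" claim: a 1.65x throughput deficit is more than offset by the cost gap, leaving AMD about 1.45x ahead per dollar.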

Limitations and Considerations

Lower raw throughput than B200 — At every precision tier, NVIDIA Blackwell still wins on FLOPS. Memory advantage matters more for inference of very large models than for training throughput.
CUDA ecosystem gap — Hyperscaler-grade open-source LLMs run well on ROCm in 2026; the long tail of research code, custom kernels, and vendor-specific libraries still favors NVIDIA. Migrating an existing CUDA-heavy training stack is a real engineering project.
MI400 is launched but not yet independently benchmarked: Helios and the MI455X have been formally launched, but volume arrives in the second half of 2026 and the headline exaflops figures are still AMD's own. Early buyers are committing partly on vendor numbers rather than third-party measurement.
Software toolchain maturity — Profilers, debuggers, and large-model training utilities (DeepSpeed, Megatron-LM) work, but the CUDA versions are typically the reference implementations.

Key Takeaways

AMD Instinct is the only credible at-scale NVIDIA alternative for datacenter AI training and inference — currently shipping MI355X with 288 GB of HBM3e memory, the highest per-GPU memory capacity in production
The MI400 series launched at AMD's July 2026 Advancing AI event alongside Helios, AMD's first rack-scale system: 72 MI455X GPUs plus 18 EPYC "Venice" CPUs per rack, delivering up to 2.9 exaflops of FP4 inference and 1.4 exaflops of FP8 training, with shipments beginning in the second half of 2026
Major 2025-2026 customer commitments — Oracle (130,000-plus GPU clusters and 50,000 MI450), Microsoft Azure (Helios deployment), Anthropic (up to two gigawatts of MI455X), OpenAI (6-gigawatt multi-year deal), Meta, and xAI — validate AMD as a structural second source rather than a niche alternative
AMD's pitch combines a memory-ceiling advantage on frontier models with the rapidly maturing open ROCm software stack: the vLLM CI pass rate on Instinct went from 37 percent in November 2025 to 93 percent by January 2026
