HBM — High Bandwidth Memory Chips

High Bandwidth Memory is the memory architecture that makes large-scale AI training possible at current GPU compute densities. By stacking multiple DRAM dies vertically, connecting them with through-silicon vias (TSVs), and integrating the stack onto the same silicon interposer as the GPU via TSMC's CoWoS 2.5D packaging, HBM delivers bandwidth that conventional DRAM interfaces cannot approach at acceptable power. An NVIDIA H100 SXM5 with five active HBM3 stacks (six sit on the package) achieves approximately 3.35 TB/s of memory bandwidth, roughly 40× the bandwidth of a dual-channel DDR5-4800 desktop system. Without HBM, the matrix multiplications that define transformer-based LLM training would be memory-bandwidth-starved regardless of GPU arithmetic throughput.
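
A back-of-envelope check of that arithmetic (the per-pin data rate, active stack count, and DDR5 channel math below are public ballpark figures, treated here as assumptions):

    # Rough HBM3 vs. desktop DDR5 bandwidth arithmetic.
    # Per-pin rates and stack counts are ballpark public figures, not
    # vendor-guaranteed specs.

    HBM3_IO_BITS = 1024        # bits per HBM3 stack interface
    H100_GBPS_PER_PIN = 5.2    # H100 runs HBM3 below its 6.4 Gb/s/pin maximum
    H100_ACTIVE_STACKS = 5     # six stacks on package, five enabled on the 80GB SKU

    per_stack_gbs = HBM3_IO_BITS * H100_GBPS_PER_PIN / 8    # Gb/s -> GB/s
    h100_tbs = per_stack_gbs * H100_ACTIVE_STACKS / 1000    # aggregate, TB/s

    # Dual-channel DDR5-4800 desktop: 2 channels x 8 bytes/transfer x 4800 MT/s
    ddr5_gbs = 2 * 8 * 4.8                                  # 76.8 GB/s

    print(f"H100 HBM3 aggregate: {h100_tbs:.2f} TB/s")      # ~3.33 TB/s
    print(f"DDR5 desktop:        {ddr5_gbs:.1f} GB/s")
    print(f"ratio:               {h100_tbs * 1000 / ddr5_gbs:.0f}x")  # ~43x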

HBM is also the single most acute supply chain chokepoint in AI infrastructure. SK Hynix is the dominant HBM3/HBM3E supplier. TSMC CoWoS packaging capacity is required to integrate HBM with GPU dies. Both are booked against AI GPU demand years in advance. The convergence of DRAM stacking yield, packaging capacity, and a single dominant customer relationship (NVIDIA-SK Hynix) at the same bottleneck makes HBM supply chain risk uniquely concentrated even within a semiconductor landscape defined by concentration.


HBM Chip Families — Generations, Specs & GPU Deployment

Generation | Flagship products & suppliers | Bandwidth per stack | Stack height / capacity | I/O width | Key GPU / accelerator deployment | Status
HBM2E | SK Hynix HBM2E; Samsung HBM2E; Micron HBM2E | ~460 GB/s | 8-high; up to 16GB per stack | 1,024-bit | NVIDIA A100 80GB (the 40GB variant used HBM2); AMD MI250X; Intel Ponte Vecchio (Xe HPC); Google TPU v4 | Legacy — still deployed in HPC and some inference; being replaced by HBM3 in new builds
HBM3 | SK Hynix HBM3 (NVIDIA H100 primary supplier); Samsung HBM3; Micron HBM3 | ~819 GB/s | 8–12-high; 16–24GB per stack | 1,024-bit | NVIDIA H100 SXM5 (5 active stacks × 16GB = 80GB); AMD MI300X (8 stacks × 24GB = 192GB); Google TPU v5p | Current mainstream for AI training; high-volume production at SK Hynix and Samsung
HBM3E | SK Hynix HBM3E (NVIDIA H200/B200 primary); Samsung HBM3E (qualifying); Micron HBM3E (third-source ramp) | ~1.15–1.2 TB/s | 8–12-high (24–36GB); 16-high on roadmap (48GB target) | 1,024-bit | NVIDIA H200 (6 stacks × 24GB; 141GB usable); NVIDIA B100/B200 Blackwell (8 stacks target); AMD MI325X; AWS Trainium2 | Ramping — primary constraint on Blackwell GPU shipment volumes; SK Hynix leads; Samsung and Micron qualifying
HBM4 | SK Hynix HBM4 (sampling 2025–2026); Samsung HBM4 (development); Micron HBM4 (development) | ~1.5–2.0 TB/s (projected) | 16-high target; 64GB+ per stack target | 2,048-bit (doubled vs HBM3) | NVIDIA Rubin (R100) target; AMD MI400 series (projected); Google TPU v6+ | Development / early sampling; logic base die on TSMC node under evaluation; volume target 2026+
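
The bandwidth-per-stack column falls straight out of I/O width times per-pin data rate. A short sketch reproducing the table's figures, where each per-pin rate is a representative value assumed for that generation rather than a guaranteed spec point:

    # Per-stack bandwidth = I/O width (bits) x per-pin rate (Gb/s) / 8.
    # Per-pin rates are assumed representative values for each generation.
    generations = {
        "HBM2E": (1024, 3.6),   # -> ~461 GB/s
        "HBM3":  (1024, 6.4),   # -> ~819 GB/s
        "HBM3E": (1024, 9.2),   # -> ~1.18 TB/s
        "HBM4":  (2048, 6.4),   # doubled bus; rate projected -> ~1.64 TB/s
    }

    for name, (io_bits, gbps_per_pin) in generations.items():
        gb_per_s = io_bits * gbps_per_pin / 8
        print(f"{name:5s}: {io_bits}-bit x {gbps_per_pin} Gb/s/pin"
              f" = {gb_per_s:7.1f} GB/s per stack")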

TSV Stack Architecture

An HBM stack begins with multiple DRAM dies fabricated on a standard DRAM process node (currently 1α/1β). Through-silicon vias — vertical electrical connections etched through the full thickness of each die — are formed and filled with copper. The dies are thinned to approximately 50 micrometers and stacked using thermal compression bonding, with micro-bump interconnects at each layer interface. A logic base die at the bottom handles memory controller functions, PHY circuitry, power management, and ECC. For HBM4, this base die may be fabricated on a leading-edge logic process at TSMC rather than on the same DRAM node as the memory dies, enabling higher controller performance and lower power.
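
The yield stakes of stack height are easy to see with a toy model: if every die-attach and TSV-connection layer must succeed independently, stack yield compounds multiplicatively. A minimal sketch with purely illustrative per-layer yields (real figures are closely held by the memory IDMs):

    # Compound stack yield under independent per-layer defects.
    # Per-layer yields below are illustrative, not real IDM data.

    def stack_yield(per_layer_yield: float, layers: int) -> float:
        """Probability that every bond/TSV layer in the stack succeeds."""
        return per_layer_yield ** layers

    for per_layer in (0.99, 0.98, 0.95):
        for layers in (8, 12, 16):
            print(f"per-layer {per_layer:.0%}, {layers:2d}-high: "
                  f"stack yield {stack_yield(per_layer, layers):5.1%}")

This compounding is why the bottleneck summary later on treats 16-high yield as a learning-curve problem rather than an absolute barrier: each added layer multiplies in another failure opportunity.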

Stack component | Spec / detail | Manufacturing challenge
DRAM core dies | 1α/1β node; ~50µm die thickness after thinning; 1,024 TSV connections per die (HBM3); 2,048 per die (HBM4 target) | Die thinning to 50µm requires careful handling to avoid fracture; TSV formation yield; DRAM process integration with TSV process steps
Through-silicon vias (TSVs) | Pitch ~40–55µm; diameter ~5–6µm; copper-filled; ~1,024–2,048 TSVs per die depending on generation | TSV yield loss scales with stack height; a defective TSV in any layer can fail the entire stack; isolation and fill uniformity across all TSVs
Logic base die | HBM3/3E: fabricated on DRAM node by same IDM; HBM4: option for TSMC leading-edge logic node (N3 or similar) for higher PHY performance | HBM4 logic base die on a different process from the memory dies introduces a supply chain dependency on TSMC for a component previously within IDM vertical integration
Silicon interposer (CoWoS-S) | TSMC-fabricated; places HBM stacks adjacent to the GPU die on a shared silicon platform; H100 interposer ~800mm²; GB200 interposer larger still | CoWoS fabrication is TSMC-exclusive; capacity is a separate bottleneck from both DRAM wafer starts and logic wafer starts; interposer area scales with GPU die area

Vendor Supply Position — The SK Hynix-NVIDIA Lock

SK Hynix established HBM3 production leadership ahead of Samsung and Micron, qualifying its product with NVIDIA for the H100 and securing supply allocations that gave it a near-exclusive position for the most strategically important AI GPU generation. This was not accidental: SK Hynix began HBM investment earlier and accepted near-term margin compression to build TSV and stacking process capability before demand materialized at scale. The NVIDIA supply relationship that resulted mirrors the structural lock-in seen in other high-specificity supply chains — NVIDIA's H100/H200/B200 GPU packages are physically designed around SK Hynix HBM geometry, and changing HBM suppliers requires full re-qualification of the packaged product.

Samsung's HBM3E qualification delays with NVIDIA — reportedly related to thermal performance under sustained AI workload conditions — extended SK Hynix's supply window through the H200 and into the Blackwell generation. Samsung has since qualified HBM3E and is competing for B200-generation allocation. Micron is positioning HBM3E as a supply diversification source for hyperscalers, particularly those seeking US domestic fab alternatives.

Vendor | HBM3E position | HBM4 status | Key customers | Strategic risk
SK Hynix | Dominant; >50% of HBM3E output; primary NVIDIA H200/B200 supplier | Sampling 2025–2026; evaluating TSMC logic node for base die; targeting Rubin R100 | NVIDIA (primary), AMD, Google TPU, AWS Trainium | NVIDIA single-customer concentration; Korea geographic concentration; HBM4 logic base die supply chain complexity if TSMC base die becomes standard
Samsung | Qualifying with NVIDIA post-delays; Samsung I-Cube packaging for own GPU customers (AMD primary) | Development in parallel with SK Hynix; Samsung has its own foundry capacity as a potential logic base die option | AMD MI300X/MI325X (primary HBM supplier); Google TPU; Microsoft Azure | NVIDIA qualification delay damage to competitive position; HBM capex competing with commodity DRAM recovery and foundry business investment
Micron | Third-source ramp; targeting hyperscaler diversification; smallest absolute output of the three | Development; US domestic fab footprint is a differentiator for US government and CHIPS Act-aligned customers | Microsoft Azure, Amazon AWS (diversification purchases); US government-aligned hyperscalers | Smallest HBM share; dependent on hyperscaler diversification appetite rather than an anchor customer like NVIDIA

CoWoS — The External Packaging Bottleneck

HBM production at the memory IDM is only half of the supply chain. Before an HBM stack can reach an AI GPU, it must be integrated with the GPU die on a common silicon interposer using TSMC's CoWoS process. CoWoS is a TSMC-exclusive advanced packaging technology; its capacity is entirely separate from TSMC's wafer fab capacity. A CoWoS line requires dedicated equipment and facility space, and expanding it follows the same multi-year lead time as any semiconductor manufacturing capacity addition.

During the 2023–2024 AI GPU demand surge, CoWoS became the binding supply constraint on NVIDIA GPU shipments — not TSMC N4 wafer starts, not SK Hynix HBM production, but the packaging step that joins them into a shippable product. TSMC has been aggressively expanding CoWoS capacity across the CoWoS-S (silicon interposer), CoWoS-R (RDL interposer, no silicon) and CoWoS-L (hybrid: local silicon interconnect bridges embedded in an RDL interposer) variants, but the expansion timeline means this remains a structural bottleneck through the Blackwell and early Rubin GPU generations. Full coverage of CoWoS architecture, capacity, and variants is on the Advanced Packaging page.
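
The "binding constraint" logic is just a min() across stages. A stylized sketch with invented capacity numbers, where packaged-GPU output is capped by whichever stage is scarcest:

    # Stylized bottleneck arithmetic: shippable GPUs per quarter are capped
    # by the scarcest stage. All capacity numbers are invented for illustration.

    gpu_dies_per_quarter    = 900_000    # logic wafer starts (invented)
    hbm_stacks_per_quarter  = 4_000_000  # memory IDM output (invented)
    cowos_slots_per_quarter = 350_000    # packaging capacity (invented)
    stacks_per_gpu = 8                   # e.g. a B200-class part

    shippable = min(
        gpu_dies_per_quarter,
        hbm_stacks_per_quarter // stacks_per_gpu,
        cowos_slots_per_quarter,
    )
    print(f"shippable GPUs: {shippable:,}")
    # 4,000,000 // 8 = 500,000 HBM-limited, 900,000 die-limited,
    # but only 350,000 CoWoS slots -- packaging binds first.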


HBM4 — Logic Base Die Transition

HBM4 introduces the possibility of fabricating the base die on a leading-edge logic process (TSMC N3 or similar) rather than on the same DRAM node as the memory dies. Using a logic node for the base die enables significantly more sophisticated memory controller logic, higher PHY speeds, and power management capabilities not achievable on a DRAM process. This is structurally significant: it would introduce TSMC as a supplier of an HBM component that was previously entirely within SK Hynix or Samsung vertical integration, adding a new external dependency and a new coordination requirement. Base die wafer starts at TSMC must align with DRAM die production at the IDM and with CoWoS packaging capacity at TSMC simultaneously.
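
A toy back-calculation makes that synchronization requirement concrete. All lead times here are hypothetical placeholders, not industry figures; the point is only that the longest-lead leg dictates when every other leg must commit:

    # Toy HBM4 supply-leg synchronization check. All lead times are
    # hypothetical illustrations, not actual industry figures.
    from datetime import date, timedelta

    target_assembly = date(2026, 6, 1)  # hypothetical CoWoS assembly start

    lead_time_weeks = {
        "DRAM core dies (memory IDM)":    20,
        "Logic base die (TSMC N3-class)": 16,
        "CoWoS capacity slot (TSMC)":     52,  # packaging books furthest ahead
    }

    for leg, weeks in lead_time_weeks.items():
        commit_by = target_assembly - timedelta(weeks=weeks)
        print(f"{leg:32s} commit by {commit_by}")
    # A slip in any one leg strands work-in-progress in the other two --
    # the coordination risk flagged in the bottleneck summary below.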


Supply Chain Bottleneck Summary

Bottleneck | Mechanism | Severity
CoWoS packaging capacity (TSMC) | CoWoS is TSMC-exclusive; capacity booked years forward; the gating constraint on AI GPU shipments during 2023–2024; expanding but structural through the Rubin generation | Critical
SK Hynix HBM3E concentration | SK Hynix >50% of HBM3E; NVIDIA primary supply lock; Samsung and Micron qualification lag means single-supplier risk persists through near-term GPU generations | High
TSV stacking yield at 16-high | Yield loss compounds at each stacking layer; 16-high HBM3E and HBM4 extend the stack height beyond the established 12-high yield learning curve | Medium — yield learning curve per generation; not an absolute barrier
HBM4 logic base die coordination | If a TSMC logic node base die becomes standard for HBM4, three separate supply chains (DRAM dies at the IDM, base die at TSMC, CoWoS at TSMC) must synchronize for each production run | Medium — new coordination risk not present in HBM3/3E
Korea geographic concentration | SK Hynix Icheon/Cheongju and Samsung Pyeongtaek are the dominant HBM production sites; both in Korea with no geographic redundancy | Structural

Related Coverage

Memory & Storage Overview | DRAM Supply Chain | NAND Flash Supply Chain | AI Inference & Edge Compute SoCs | AI Accelerators | CoWoS Advanced Packaging | Semiconductor Bottleneck Atlas | NVIDIA Spotlight


Cross-Network — ElectronsX Demand Side

Every HBM-equipped GPU shipped into a hyperscaler data center adds to the electrical load reshaping grid infrastructure planning, transformer procurement, and utility interconnection queues. The AI training cluster buildout driving HBM demand is the same infrastructure expansion driving data center power density and cooling demand covered across ElectronsX.

EX: ADAS/AV Compute Architecture | EX: Humanoid Robots | EX: Supply Chain Convergence Map