GPU & CPU Subsystem Boards



Subsystem boards are the integration layer where packaged CPUs, GPUs, and AI accelerators combine with memory, power delivery, and high-speed interconnects on a multi-layer printed circuit board to form a functional compute engine. The board is what actually plugs into a server, a robot chassis, or a vehicle. It is the assembled product that the downstream system integrator receives and connects into racks, enclosures, or platforms. Subsystem boards are structurally different from multi-chip modules because they operate at package-to-package scale on a PCB rather than die-to-die scale on a substrate. They are structurally different from finished server boards because they present a single coherent compute product — the "SXM module" or "OAM module" or "PCIe accelerator" — that later drops into a system motherboard.

The subsystem board layer is where AI accelerator product definitions happen. NVIDIA's H100 and H200 are not sold as bare chips; they are sold as SXM modules or PCIe cards that include the GPU, HBM, voltage regulators, and NVLink connectors. AMD's MI300 ships as OAM modules. Tesla's Dojo training tile is a subsystem board. The silicon is only the starting point; the subsystem board is what the customer buys and what the competitive architecture decisions are made about. Concentration at this layer tracks silicon concentration because the same three or four companies (NVIDIA, AMD, Intel, with Tesla in custom AI training) that define the silicon also define the boards. Contract manufacturers and ODMs (Foxconn, Quanta, Wistron, Inventec) do the physical build, but the architecture is owned by the silicon company.


What a Subsystem Board Contains

| Component Class | Function | Scale at Top-End AI Boards |
| --- | --- | --- |
| Compute package | CPU, GPU, or AI accelerator, often itself an MCM with advanced packaging | 700 W to 1000+ W TDP per package |
| Memory | HBM integrated on the compute package, or external memory modules on the same board | Multiple HBM stacks at 1+ TB/s per stack |
| Voltage regulator modules (VRMs) | Deliver hundreds of amperes of low-voltage current to the compute package | 1000+ A total; sub-millivolt regulation |
| Passives and clock management | Decoupling capacitors, inductors, clock generation, PLLs | Thousands of discrete passives per board |
| High-speed interconnects | PCIe Gen5/Gen6, NVLink, Infinity Fabric, CXL, OAM interconnect | Multi-terabit aggregate bandwidth per board |
| Thermal solution | Heat spreader, vapor chamber, or direct-to-die liquid cooling plate | Kilowatt-scale heat removal; liquid cooling standard for top-end AI |
| PCB | Multi-layer board with high-speed signal routing and current-carrying planes | 20+ layers with specialty materials (low-loss laminates, buried capacitance) |

Board Form Factors

Subsystem boards have standardized form factors that define mechanical dimensions, connector interfaces, and power/thermal interfaces. Form factor choice determines which systems a board can plug into and how it is cooled.

| Form Factor | Origin / Governance | Typical Use |
| --- | --- | --- |
| NVIDIA SXM | NVIDIA proprietary | NVIDIA data center GPUs (SXM5 for H100/H200, SXM-Next for Blackwell); up to 8 modules per server with NVLink |
| OCP Accelerator Module (OAM) | Open Compute Project standard | AMD Instinct MI300, Intel Gaudi, other non-NVIDIA accelerators; vendor-neutral alternative to SXM |
| PCIe accelerator card | PCI-SIG standard, various lengths (full-length, full-height) | Lower-TDP inference GPUs, networking accelerators, FPGAs; broader server compatibility |
| UBB (Universal Baseboard) | OCP standard baseboard for OAM | 8-OAM baseboards that form the accelerator complex in datacenter servers |
| HGX / MGX | NVIDIA reference designs | Multi-GPU baseboards for NVIDIA SXM modules (HGX); modular rack-level reference designs (MGX) |
| Custom AI training tiles | Vendor-specific (Tesla Dojo, Cerebras wafer-scale) | Custom AI training systems; tightly coupled silicon-and-system co-design |

Representative Boards

| Board | Owner | Composition |
| --- | --- | --- |
| NVIDIA H100 / H200 SXM | NVIDIA | Hopper GPU + HBM3/HBM3E stacks on CoWoS + VRMs + NVLink; 700 W TDP |
| NVIDIA Blackwell SXM | NVIDIA | Dual-die Blackwell GPU + HBM3E on CoWoS-L + higher-bandwidth NVLink; ~1000 W TDP |
| NVIDIA GB200 Superchip | NVIDIA | Grace CPU + 2× Blackwell GPU on one module; NVLink-C2C between CPU and GPUs |
| AMD Instinct MI300X OAM | AMD | 3D-stacked GPU compute chiplets on base I/O dies + HBM3; OAM form factor for 8-module servers |
| Intel Gaudi 3 OAM | Intel (Habana Labs) | Custom AI accelerator + HBM; targeting NVIDIA H100 competition at lower cost per FLOP |
| Intel Ponte Vecchio / Max GPU | Intel | Tile-based GPU using Foveros and EMIB + HBM; powers the Aurora supercomputer |
| Tesla Dojo Training Tile | Tesla | 25 custom D1 AI dies in a tile with integrated liquid cooling; 9 petaflops per tile |
| Cerebras Wafer-Scale Engine | Cerebras | Single wafer-scale AI chip (largest chip ever built at each generation); custom cooling and power delivery |

Power & Thermal: The Limiting Factor

The primary engineering constraint on modern AI and HPC subsystem boards is not silicon capability; it is power and thermal density. A top-end AI board draws over 1000 watts and dissipates it over a footprint smaller than a laptop. Three engineering challenges compound.
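The density claim is easy to sanity-check. A minimal sketch, assuming an SXM-class footprint of roughly 14 cm × 8 cm (an illustrative figure, not a vendor spec):

```python
# Average heat flux over an accelerator module footprint.
# Dimensions and power are illustrative assumptions, not vendor specifications.
def heat_flux_w_per_cm2(power_w: float, length_cm: float, width_cm: float) -> float:
    """Average heat flux (W/cm^2) over the module footprint."""
    return power_w / (length_cm * width_cm)

# ~1000 W over an assumed 14 cm x 8 cm module:
print(f"{heat_flux_w_per_cm2(1000, 14, 8):.1f} W/cm^2 average across the footprint")
```

The flux at the die itself is far higher; this footprint-level average is only the board-level budget that the cooling solution must carry away.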

First, power delivery: delivering 1000+ amps at sub-one-volt regulation requires a voltage regulator module (VRM) complex that itself consumes board area and generates substantial heat. VRM efficiency matters — every percent of inefficiency is tens of watts of additional heat to remove. VRM innovation (vertical power delivery, integrated voltage regulators on the compute package, multi-phase power management) is an active engineering frontier and increasingly a competitive differentiator. Specialty suppliers for high-end VRMs include Infineon, MPS (Monolithic Power Systems), Renesas (Intersil), Alpha and Omega, and Vicor, with growing integration of power management ICs into the accelerator package itself.
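The efficiency point can be made concrete. A minimal sketch (efficiency values are illustrative; real multi-phase VRM efficiency varies with load and switching frequency):

```python
# Heat dissipated by the VRM stage as a function of conversion efficiency.
def vrm_loss_w(load_w: float, efficiency: float) -> float:
    """Watts of heat the VRM itself dissipates while delivering load_w."""
    # Input power is load_w / efficiency; the difference is lost as heat.
    return load_w * (1.0 / efficiency - 1.0)

for eff in (0.95, 0.93, 0.90):
    print(f"{eff:.0%} efficient VRM feeding a 1000 W package: "
          f"{vrm_loss_w(1000, eff):.0f} W of VRM heat")
```

Dropping from 95% to 90% efficiency roughly doubles the VRM's own heat load, which is why shortening the high-current path via vertical power delivery is worth the engineering effort.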

Second, signal integrity: high-speed SerDes signaling at 100 Gbps per lane and above requires specialty PCB materials (low-loss laminates such as Megtron 7/8, buried capacitance layers), careful stackup design, and sometimes retimers or signal conditioners on the board. At the bandwidths modern AI boards require, a few millimeters of trace routing can determine whether a link closes reliably.
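A rough loss-budget sketch shows why laminate choice matters. The per-inch loss figures, the via/connector allowance, and the budget below are assumed illustrative values, not measured data for any specific material:

```python
# Back-of-envelope insertion-loss budget for a high-speed SerDes channel.
def channel_loss_db(trace_in: float, loss_db_per_in: float,
                    via_and_connector_db: float = 4.0) -> float:
    """Total channel insertion loss at the signaling Nyquist frequency."""
    return trace_in * loss_db_per_in + via_and_connector_db

BUDGET_DB = 36.0  # assumed long-reach channel budget

for material, loss_per_in in (("low-loss laminate", 1.0), ("standard FR-4", 2.0)):
    total = channel_loss_db(20.0, loss_per_in)
    verdict = "within" if total <= BUDGET_DB else "exceeds"
    print(f"20 in of {material}: {total:.0f} dB ({verdict} the {BUDGET_DB:.0f} dB budget)")
```

At twice the per-inch loss, the same 20-inch route blows the budget, forcing a retimer, a shorter route, or a better laminate.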

Third, cooling: conventional air cooling is near its limit at 700W per package and unworkable at 1000W. Direct-to-die liquid cooling, cold plates integrated into board design, and rack-level liquid distribution are becoming standard for top-end AI deployments. Immersion cooling, long a niche technology, is in deployment at multiple hyperscaler sites. Board design increasingly assumes liquid cooling rather than treating it as an upgrade option. Cooling integration pushes the board-level design further into system co-design — the board cannot be cooled in isolation from the chassis and the rack-level cooling infrastructure.
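A first-order flow calculation shows why direct liquid cooling is tractable where air is not. A minimal sketch assuming a water-like coolant; the allowed temperature rise is an assumption:

```python
# Coolant flow required to absorb a given heat load at a given temperature rise.
def flow_lpm(heat_w: float, delta_t_c: float,
             cp_j_per_kg_k: float = 4184.0, density_kg_per_l: float = 1.0) -> float:
    """Liters per minute of coolant to carry heat_w at a delta_t_c rise."""
    mass_flow_kg_s = heat_w / (cp_j_per_kg_k * delta_t_c)
    return mass_flow_kg_s / density_kg_per_l * 60.0

# 1000 W package, 10 degC allowed coolant rise (assumed):
print(f"{flow_lpm(1000, 10):.2f} L/min per package")
```

About a liter and a half of water per minute per package is modest plumbing; moving the same 1000 W in air at the same 10 degC rise would take roughly 5 m³ of airflow per minute, which is the practical gap this paragraph describes.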


Interconnects Beyond the Board

Subsystem boards must connect into larger topologies — multiple boards in a server, multiple servers in a rack, multiple racks in a cluster. The board-level interconnect determines what topologies are achievable.

| Interconnect | Role | Typical Reach |
| --- | --- | --- |
| PCIe Gen5 / Gen6 | General-purpose host-to-accelerator and accelerator-to-storage | Within-server; host CPU to accelerator, accelerator to storage and networking |
| NVLink / NVLink Switch | NVIDIA-proprietary high-bandwidth GPU-to-GPU fabric | Within-server (up to 8 GPUs); NVLink Switch extends to rack-scale |
| Infinity Fabric | AMD-proprietary on-module, socket-to-socket, and accelerator-to-accelerator fabric | Within-module and within-server |
| CXL (Compute Express Link) | Coherent memory and accelerator interconnect over PCIe PHY | Within-server; emerging rack-scale with CXL 3.0 switches |
| InfiniBand / Ethernet (RoCE, Ultra Ethernet) | Rack-to-rack and cluster-scale networking | Multi-rack; cluster-scale AI fabrics (NVIDIA Spectrum-X, Ultra Ethernet Consortium) |
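The bandwidth hierarchy behind these fabrics can be made concrete with approximate figures. These are rounded, publicly cited values; treat none of them as spec-exact:

```python
# Approximate per-GPU bandwidth of common board-level links (GB/s).
# Rounded, publicly cited figures; order of magnitude only.
LINKS_GB_S = {
    "PCIe Gen5 x16 (per direction)": 64,
    "PCIe Gen6 x16 (per direction)": 128,
    "NVLink, H100-class GPU (aggregate)": 900,
}

baseline = LINKS_GB_S["PCIe Gen5 x16 (per direction)"]
for name, bw in LINKS_GB_S.items():
    print(f"{name}: ~{bw} GB/s ({bw / baseline:.1f}x PCIe Gen5)")
```

The roughly 14x gap between the host link and the GPU fabric is why training topologies route gradient traffic over NVLink or Infinity Fabric and reserve PCIe for host, storage, and networking traffic.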

Beyond the Datacenter

Subsystem boards are not a datacenter-only category. The same integration layer — package + memory + power + thermal + interconnect on a PCB — appears in several other domains, with different constraints and different operators building the boards.

| Domain | Representative Boards | Primary Constraints |
| --- | --- | --- |
| Automotive inference | Tesla HW4 / AI5 FSD computer, NVIDIA Drive Thor, Mobileye EyeQ modules, Qualcomm Ride platform | Automotive functional safety (ISO 26262), thermal (no liquid cooling), lifetime reliability, harsh environment |
| Humanoid & robotic control | Tesla Optimus compute board, Figure 02 compute, NVIDIA Jetson Thor-based robot brains | Weight, battery power, real-time inference latency, vibration tolerance |
| Drone perception | NVIDIA Jetson Orin/Thor-based drone compute, custom defense UAV boards | Weight, power, ruggedization; wireless comm integration |
| Industrial edge & IIoT | Industrial AI gateways; edge inference boxes; factory floor compute | Extended temperature range, long product lifetime, industrial bus integration |

The cross-pillar relevance of subsystem boards reaches beyond SemiconductorX: DatacentersX covers the rack-to-cluster scale-out; ElectronsX covers the vehicle and robotics applications where automotive and robotics inference boards live as the compute brain; 5IREnterprise covers the industrial and energy-grid applications. Subsystem boards are the convergence point that connects silicon-level manufacturing to real-world intelligent systems.


Contract Manufacturing

The physical assembly of subsystem boards is largely done by contract manufacturers and ODMs rather than the silicon companies themselves. NVIDIA, AMD, and Intel define the board architecture, own the firmware, and sell under their brand; Foxconn (Hon Hai), Quanta, Wistron, Inventec, Pegatron, and Compal do the build. The high-complexity boards (NVIDIA SXM with advanced packaging integration, custom AI training tiles) often run through captive lines or select premier partners; higher-volume lower-complexity boards distribute across a broader ODM base. Taiwan dominates the contract manufacturing layer, with substantial overflow capacity in Mexico, Thailand, and Vietnam. This is the same ODM landscape that builds datacenter servers, smartphones, and consumer electronics.


Related Coverage

Parent: Module Integration

Sibling modules: Multi-Chip Modules (MCMs) · Memory Modules

Compute silicon consumers: AI Accelerators · GPUs · HBM

Cross-pillar relevance: Automotive & ADAS Chips · Humanoid Robot Semiconductors · Bottleneck Atlas