The foundation for next-gen AI hardware is being laid now, and it starts with memory. While NVIDIA and AMD prepare to ship their first HBM4-based accelerators this year, the industry is already racing toward HBM5 and HBM6, standards that promise double the bandwidth, denser stacks, and breakthroughs in power efficiency. A key milestone in this transition has emerged: Hanmi Semiconductor, a chip-packaging equipment maker, is readying its first Wide TC Bonder, a critical tool for mass-producing these advanced memory stacks.
The Wide TC Bonder replaces the troubled Hybrid Bonder (HB), which faced delays due to technical hurdles. Unlike its predecessor, this new system boosts production yields for HBM4, HBM4E, HBM5, and HBM6 while improving bond strength through fluxless bonding—a process that reduces oxide layers on chip surfaces, thinning stacks and enhancing reliability.
From HBM4 to HBM5: A Steady Climb in Performance
HBM5, slated for deployment around 2029, builds on HBM4's foundation with incremental but meaningful upgrades. The standard keeps an 8 Gbps data rate for its base variant but expands the I/O count to 4096 lanes, unlocking 4 TB/s of bandwidth per stack. With 16-high (16-Hi) stacks as the baseline and 40 Gb DRAM dies, a single HBM5 module can reach 80 GB of capacity, a roughly 43% jump over HBM4's 56 GB per stack. Power consumption per stack is capped at 100W, aligning with immersion cooling and thermal via (TTV) designs.
Key features of HBM5 include:
- Data Rate: 8 Gbps (Non-e variant)
- I/O Count: 4096
- Bandwidth: 4.0 TB/s per stack
- Stack Height: 16-Hi
- Die Capacity: 40 Gb
- Module Capacity: 80 GB
- Power: 100W per stack
- Packaging: Microbump (MR-MUF)
- Cooling: Immersion cooling, thermal via (TTV), thermal bonding
- Innovations: Dedicated decoupling capacitor die stack, 3D NMC-HBM, stacked cache with LPDDR+CXL in base die
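The headline numbers in the list above follow directly from the per-pin data rate, I/O width, and stack geometry. A minimal sketch of that arithmetic (Python; the helper names are illustrative, the inputs are the HBM5 figures listed above):

```python
def stack_bandwidth_tbps(data_rate_gbps: float, io_count: int) -> float:
    """Peak per-stack bandwidth: per-pin rate (Gb/s) times I/O lanes, in TB/s."""
    return data_rate_gbps * io_count / 8 / 1000  # Gb/s -> GB/s -> TB/s

def stack_capacity_gb(die_capacity_gbit: int, stack_height: int) -> float:
    """Per-stack capacity: DRAM die density (Gbit) times dies per stack, in GB."""
    return die_capacity_gbit * stack_height / 8  # Gbit -> GB

# HBM5 base variant: 8 Gbps across 4096 I/O, 16-Hi stacks of 40 Gb dies
print(stack_bandwidth_tbps(8, 4096))  # 4.096, quoted above as "4 TB/s"
print(stack_capacity_gb(40, 16))      # 80.0 GB
```

Note that the quoted 4 TB/s is the rounded decimal figure; the raw product of 8 Gbps across 4096 lanes is 4.096 TB/s.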
This leap positions HBM5 as the backbone for NVIDIA’s Feynman architecture and AMD’s Instinct MI500 series, expected to arrive in the late 2020s.
HBM6: The AI Memory Horizon—Double Bandwidth, Higher Stacks
Looking further ahead, HBM6 pushes boundaries with 16 Gbps data rates, doubling per-stack bandwidth to 8 TB/s. The standard introduces 20-high stacks, enabling 96–120 GB per module, a 20–50% increase over HBM5's 80 GB, while power rises modestly to 120W per stack. For the first time, HBM6 adopts bumpless Cu-Cu direct bonding, eliminating traditional microbumps to improve signal integrity and reduce stack thickness.
Beyond raw specs, HBM6 explores active/hybrid interposers (silicon+glass), onboard network switches, and bridge dies—features that could redefine how memory interfaces with accelerators. Cooling remains immersion-based, with multi-tower designs to handle the higher power densities.
Key features of HBM6 include:
- Data Rate: 16 Gbps
- I/O Count: 4096
- Bandwidth: 8.0 TB/s per stack
- Stack Height: 16/20-Hi
- Die Capacity: 48 Gb
- Module Capacity: 96–120 GB
- Power: 120W per stack
- Packaging: Bump-less Cu-Cu direct bonding
- Cooling: Immersion cooling, multi-tower HBM
- Architectural Innovations: Active/Hybrid interposer, network switch, bridge die, asymmetric TSV
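The HBM6 figures above are internally consistent, and the 16-Hi versus 20-Hi stack options explain the 96–120 GB capacity range. A self-contained check of the arithmetic (Python; helper names are illustrative, inputs are the figures listed above):

```python
def stack_bandwidth_tbps(data_rate_gbps: float, io_count: int) -> float:
    """Peak per-stack bandwidth in TB/s: per-pin rate (Gb/s) times I/O lanes."""
    return data_rate_gbps * io_count / 8 / 1000  # Gb/s -> GB/s -> TB/s

def stack_capacity_gb(die_capacity_gbit: int, stack_height: int) -> float:
    """Per-stack capacity in GB: die density (Gbit) times dies per stack."""
    return die_capacity_gbit * stack_height / 8  # Gbit -> GB

# HBM6: 16 Gbps across 4096 I/O; 48 Gb dies in 16-Hi and 20-Hi stacks
print(stack_bandwidth_tbps(16, 4096))  # 8.192, quoted above as "8 TB/s"
print(stack_capacity_gb(48, 16))       # 96.0 GB (16-Hi)
print(stack_capacity_gb(48, 20))       # 120.0 GB (20-Hi)
```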
These advancements reflect a shift toward asymmetric memory architectures, where base dies integrate LPDDR and CXL interfaces, blurring the lines between traditional HBM and system memory.
A Timeline Shaped by Memory
While HBM4 dominates 2026 with its 11.7 Gbps speeds and 56 GB stacks, the groundwork for HBM5 and HBM6 is already underway. The Wide TC Bonder’s debut at Semicon Korea 2026 signals a critical step in scaling production, ensuring these next-gen standards can meet the demands of AI accelerators like NVIDIA’s Feynman and beyond. The transition from HBM4 to HBM5 will mark a 2x bandwidth increase, while HBM6 will push the envelope further—setting the stage for exascale computing and beyond.
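The generational claims above can be sanity-checked numerically. A short sketch using base-variant figures (Python; HBM4's 2048-bit interface width is an assumption drawn from the JEDEC HBM4 baseline, not stated in this article):

```python
# Base-variant figures: per-pin rate (Gbps) x I/O lanes -> TB/s, plus top capacity (GB).
# HBM4's 2048 I/O count is an assumed JEDEC baseline; other numbers are from the text.
gens = {
    "HBM4": {"tbps": 8 * 2048 / 8 / 1000, "gb": 56},
    "HBM5": {"tbps": 8 * 4096 / 8 / 1000, "gb": 80},
    "HBM6": {"tbps": 16 * 4096 / 8 / 1000, "gb": 120},
}

names = list(gens)
for prev, nxt in zip(names, names[1:]):
    bw_ratio = gens[nxt]["tbps"] / gens[prev]["tbps"]
    cap_ratio = gens[nxt]["gb"] / gens[prev]["gb"]
    print(f"{prev} -> {nxt}: {bw_ratio:.1f}x bandwidth, {cap_ratio:.2f}x capacity")
```

Under these assumptions, each generation doubles bandwidth, with capacity growing by roughly 1.4x and 1.5x per step.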
The race for AI dominance hinges on memory innovation. With HBM5 and HBM6 on the horizon, the stage is set for a new era of computational power—one where bandwidth, capacity, and efficiency redefine what’s possible.
