NVIDIA’s push into next-generation AI infrastructure is set to reshape the NAND storage market in ways that could test the limits of current supply chains. At CES 2026, the company unveiled its Vera Rubin AI systems, which rely on a new architecture designed to handle the rapid growth in data demands from agentic AI workloads. Unlike previous generations, these systems will offload inference context, previously held in high-bandwidth memory (HBM), to external SSD storage via NVIDIA’s Inference Context Memory Storage (ICMS) platform.

This shift marks a departure from the status quo, in which key-value (KV) caches, the temporary context data a model accumulates during inference, were kept in on-board memory. With Vera Rubin, NVIDIA is effectively moving from a model constrained by HBM limits to one that requires massive SSD capacity per GPU. Industry estimates suggest a single NVL72 configuration could demand 1,152 terabytes of NAND, or 16 TB for each of its 72 GPUs, a storage footprint that dwarfs typical data center deployments.

If Vera Rubin shipments reach projections, with 30,000 units in 2026 scaling to 100,000 by 2027, NAND demand could exceed 115 million terabytes (100,000 units × 1,152 TB each). That would represent roughly 9.3% of the global NAND market’s projected output over the same period, a figure that industry analysts are only beginning to factor into their supply models. Such a rapid increase in demand raises concerns about whether the NAND ecosystem can adapt without triggering shortages reminiscent of the DRAM crunch seen in recent years.
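
The headline figure can be reproduced with simple arithmetic. The sketch below uses only the numbers quoted above and assumes the 100,000-unit figure is the total reached by 2027, which is the reading that yields roughly 115 million TB. The implied global output is back-calculated from the 9.3% share, not taken from an independent source.

```python
# Back-of-the-envelope check of the NAND demand figures quoted in the article.
TB_PER_NVL72 = 1_152        # estimated NAND per NVL72 configuration (TB)
UNITS_2026 = 30_000         # projected units in 2026
UNITS_BY_2027 = 100_000     # projected units by 2027 (assumed to be the total)
MARKET_SHARE = 0.093        # share of global NAND output cited in the article


def nand_demand_tb(units: int, tb_per_unit: int = TB_PER_NVL72) -> int:
    """Total NAND demand in terabytes for a given number of systems."""
    return units * tb_per_unit


demand_2026 = nand_demand_tb(UNITS_2026)
demand_by_2027 = nand_demand_tb(UNITS_BY_2027)

# Global output implied by the 9.3% share (derived, not an independent figure).
implied_global_output_tb = demand_by_2027 / MARKET_SHARE

print(f"2026 demand:           {demand_2026 / 1e6:.2f} million TB")     # 34.56
print(f"By-2027 demand:        {demand_by_2027 / 1e6:.2f} million TB")  # 115.20
print(f"Implied global output: {implied_global_output_tb / 1e6:.0f} million TB")  # 1239
```

If the two unit counts are instead read as separate annual shipments, the two-year total would be 130,000 systems, or about 149.76 million TB, so the 115 million figure is best treated as a floor under that interpretation.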

The implications extend beyond data centers. With NVIDIA positioning agentic AI as a cornerstone of its roadmap, the need for scalable storage solutions is becoming more urgent. Current NAND production lines, already under pressure from data center expansion and inference workloads, may struggle to meet this new wave of requirements. For consumers, the ripple effects could be felt in SSD availability and pricing, though the primary impact will likely be on enterprise-grade storage solutions.

NVIDIA’s investment in this direction is substantial. Reports indicate the company has allocated $100 billion toward AI infrastructure development, signaling a level of commitment that few competitors can match. The Vera Rubin platform, still in its early stages, promises to redefine how AI systems process and retain context—yet the storage demands it introduces may force the industry to accelerate innovation in NAND production or risk falling behind.

Storage demand (estimated):
  • 1,152 TB of NAND per NVL72 configuration
  • 30,000 units projected for 2026
  • 100,000 units projected by 2027
  • NAND demand at 100,000 units: ~115 million TB (about 9.3% of projected global output)

The shift to ICMS represents a fundamental change in how AI systems are architected. HBM, though far faster, is limited by physical constraints: package size, power consumption, and cost per gigabyte. SSDs, while slower and higher in latency, offer far greater capacity at a fraction of the cost per gigabyte. This trade-off lets Vera Rubin retain much more context than HBM alone could hold, but it also means that storage bottlenecks could become a defining challenge for AI development in the coming years.
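
The general idea of spilling context from a small fast tier to a large slow tier can be illustrated with a toy two-tier cache. This is a minimal sketch of tiered caching in general, not NVIDIA’s ICMS design: the class name `TieredKVCache`, the least-recently-used eviction policy, and the promote-on-access behavior are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a small fast tier that spills to a large slow tier."""

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for HBM (small, fast)
        self.slow = {}             # stands in for SSD-backed storage (large, slow)

    def put(self, key, value):
        # Insert or refresh in the fast tier, marking it most recently used.
        self.slow.pop(key, None)
        self.fast[key] = value
        self.fast.move_to_end(key)
        # Spill least recently used entries once the fast tier is over capacity.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.slow:
            # Promote on access: pay the slow-tier cost once, then serve from fast.
            value = self.slow.pop(key)
            self.put(key, value)
            return value
        return None


cache = TieredKVCache(fast_capacity=2)
for session in ("a", "b", "c"):
    cache.put(session, f"context-{session}")

print(sorted(cache.fast))  # ['b', 'c']
print(sorted(cache.slow))  # ['a']
print(cache.get("a"))      # context-a (promoted; 'b' spills to the slow tier)
print(sorted(cache.slow))  # ['b']
```

The fast tier stays fixed in size while the slow tier grows with the number of sessions, which mirrors why capacity demand shifts toward NAND as retained context increases.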

Whether the NAND industry can scale in time remains an open question. If history is any guide, the transition to new technologies often outpaces supply chain adjustments, leading to temporary shortages. For NVIDIA and its partners, navigating this landscape will be critical—not just for meeting their own roadmap targets, but for ensuring that the AI revolution doesn’t stall before it begins.