How Certified Storage is Reshaping the AI Factory Floor


From DGX to GB200 and Beyond
AI factories are evolving at an unprecedented pace. What was once considered cutting-edge, like the NVIDIA DGX SuperPOD™ with NVIDIA H100 Tensor Core GPUs, has already given way to a new class of infrastructure powered by the NVIDIA Grace Blackwell Superchip and NVIDIA HGX™ systems with NVIDIA H200, B200, and B100 GPUs. These aren’t just more powerful servers; they represent a shift toward composable, rack-scale AI architectures designed to process trillions of tokens, power intelligent agents, and accelerate generative AI adoption across the enterprise.
But here’s the truth: No matter how fast your GPUs are, if your storage can’t keep up, your AI factory grinds to a halt.
That’s why WEKA’s certification for two new NVIDIA Cloud Partner (NCP) High-Performance Storage (HPS) reference architectures—GB200 NVL72 and HGX H100/H200/B200 systems—is a big deal.
From Proof-of-Concept to Production-Grade AI Factories
In the early days, AI infrastructure was a bit artisanal. Teams cobbled together a few DGX nodes, some fast local NVMe, and crossed their fingers. It was good enough for proof-of-concept training and benchmark runs—but nowhere near the scale needed to fuel the AI wave we’re in now.
Today, that artisanal approach has given way to something industrial.
Massive clusters. Token-scale pipelines. Petabytes of training data. Millisecond-latency demands. And zero tolerance for bottlenecks.
NVIDIA’s Cloud Partner program and the NCP Reference Architectures reflect this shift—raising the bar for what it takes to run AI at scale. And WEKA’s certified architectures provide the data layer to match.
The New Bar for Performance: 1 GB/s Per GPU
Why is 1 GB/s per GPU such a big deal?
Because it’s not just a benchmark—it’s a design principle. A single Blackwell GPU can process and move enormous amounts of data. Starve it of input or slow down output, and you’re wasting not just money, but energy, time, and opportunity.
WEKA delivers one of the few storage systems that can consistently sustain 1.0 GB/s or more per GPU, even at the massive scale today's clusters demand; a quick sizing sketch follows the list:
- 1,152 Blackwell GPUs (one scalable unit, or SU) → 1.2 TB/s read throughput
- 4,608 GPUs → 4.6 TB/s read throughput
- 18,432 GPUs → 18.4 TB/s read throughput and 9.2 TB/s write throughput
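To make the arithmetic concrete, here's a minimal sizing sketch in Python. The 1.0 GB/s-per-GPU read target comes straight from the reference architectures; the 0.5 GB/s-per-GPU write figure is inferred from the 18,432-GPU numbers (9.2 TB/s ÷ 18,432), and the helper name is ours, not WEKA's:

```python
GB = 1e9  # decimal gigabyte; storage throughput is quoted in decimal units

def aggregate_tput_tbps(num_gpus: int, per_gpu_gbps: float = 1.0) -> float:
    """Aggregate throughput (TB/s) needed to feed num_gpus at per_gpu_gbps GB/s each."""
    return num_gpus * per_gpu_gbps * GB / 1e12

for gpus in (1_152, 4_608, 18_432):
    read = aggregate_tput_tbps(gpus)        # 1.0 GB/s read target per GPU
    write = aggregate_tput_tbps(gpus, 0.5)  # 0.5 GB/s write, inferred from the 18,432-GPU figures
    print(f"{gpus:>6} GPUs -> {read:4.1f} TB/s read, {write:4.1f} TB/s write")
```

Run it and the cluster sizes above reproduce the published targets: 1,152 GPUs needs 1.2 TB/s of read throughput, 18,432 GPUs needs 18.4 TB/s read and 9.2 TB/s write.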
That performance is made possible by NeuralMesh™, built on a containerized, service-oriented microservices architecture that combines NVMe flash, a user-space DPDK networking stack, and a parallel file system with distributed metadata. The result? Lightning-fast performance at microsecond latencies, whether you're slinging small random 4KB I/O or giant multimodal checkpoints.
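WEKA doesn't detail NeuralMesh internals in this post, but the core idea behind distributed metadata can be illustrated with a simple hash-based placement sketch. Everything here (the shard count, the hashing scheme, the function names) is our illustration, not WEKA's design:

```python
import hashlib

# Illustrative only: spread filesystem metadata ownership across many
# independent metadata services by hashing the file path. Real parallel
# file systems use far more sophisticated schemes; this just shows why
# no single metadata server becomes the bottleneck at scale.

NUM_METADATA_SERVICES = 64  # assumed shard count for the sketch

def metadata_owner(path: str, shards: int = NUM_METADATA_SERVICES) -> int:
    """Map a file path to the metadata service responsible for it."""
    digest = hashlib.blake2b(path.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % shards

# Lookups for different files land on different services, so metadata
# operations scale out with the cluster instead of serializing.
for p in ("/data/tokens/shard-0001.bin",
          "/ckpt/step-120000/model.pt",
          "/data/tokens/shard-0002.bin"):
    print(p, "->", f"mds-{metadata_owner(p):02d}")
```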
Designed for the New Era of AI
WEKA’s storage reference architectures aren’t just validated—they’re precision-tuned for the demands of modern AI workloads:
- Composable storage clusters that scale linearly with compute, from a few HGX boxes to 18,000+ Blackwell GPUs
- Multitenancy with physical isolation enabling cloud providers to deliver AI-as-a-Service without compromise
- Certified configurations for both GB200 NVL72 and HGX H100/H200/B200 platforms that deploy easily, with cabling, rack-unit, and thermal guidance included
The new AI factory isn’t a monolith. It’s a dynamic, software-defined system that needs performance, flexibility, and rock-solid reliability at every layer. That includes the data layer.
High-Performance NVMe with Micron 9550 SSD
An essential element of delivering this level of performance is the right NVMe storage. The Micron 9550 SSD (PCIe Gen5) offers exceptional performance and power efficiency, making it a strong fit for NeuralMesh deployments with NCPs and other large-scale AI environments. Its outstanding sequential and random read/write speeds ensure rapid data movement to and from GPUs, maximizing utilization and throughput across complex AI workloads.
The Micron 9550’s power efficiency also helps reduce energy consumption and operating costs—critical advantages for hyperscale AI factories. Further validating its role in next-generation AI infrastructure, the Micron 9550 SSD has been qualified by NVIDIA for its Recommended Vendor List (RVL) for local storage on GB200 NVL72 systems.
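As a back-of-the-envelope exercise, here's a sketch of how many Gen5 drives it takes to hit the read targets above, before any filesystem, protection, or network overhead. The ~14 GB/s per-drive figure reflects Micron's published 9550 sequential-read spec; the 70% derating is our assumption, not a WEKA or Micron number:

```python
import math

DRIVE_SEQ_READ_GBPS = 14.0  # Micron 9550 spec-sheet sequential read (Gen5)
EFFICIENCY = 0.70           # assumed derating for protection/network overhead

def drives_needed(target_tbps: float) -> int:
    """Rough floor on drive count to sustain an aggregate read target."""
    usable_per_drive = DRIVE_SEQ_READ_GBPS * EFFICIENCY  # GB/s per drive
    return math.ceil(target_tbps * 1000 / usable_per_drive)

for target in (1.2, 4.6, 18.4):  # SU read targets from earlier in the post
    print(f"{target:5.1f} TB/s -> ~{drives_needed(target)} drives (raw floor)")
```

Even hitting 18.4 TB/s takes on the order of a couple thousand drives under these assumptions, which is why drive-level efficiency and density matter so much at AI-factory scale.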
Why This Matters
If you’re building AI infrastructure today, the playbook is changing fast. GB200 isn’t just another generation; it’s the start of a new era of unified memory, CPU-GPU fabric, and hyperscale inference. But none of that matters if your data can’t keep up.
With WEKA’s NCP-certified storage solutions, you can:
- Keep your GPUs fed with ultra-low-latency data access
- Hit and sustain 1 GB/s per GPU performance
- Scale from hundreds to tens of thousands of GPUs
- Do it all with a single, high-performance storage system purpose-built for AI
Welcome to the new AI factory floor. It’s faster, smarter, and now—certifiably better.
Ready to See What Certified Performance Looks Like? Dive into the full reference architectures for the GB200 NVL72 and HGX H100/H200/B200 systems to see how WEKA delivers AI-native storage at scale, validated by NVIDIA.