Learn about the HPC storage requirements for accelerating production AI workloads on distributed AI servers. This paper presents results from a variety of benchmarks run on 1 to 32 GPUs across up to 4 server nodes, using flash-based WekaIO storage. See how GPU performance within a single server compares with a clustered configuration using the same number of GPUs, and how performance scales from 1 to 32 GPUs. Discover the storage bandwidth and throughput requirements for common benchmarks such as ResNet-50, VGG16, and Inception-v4. The information in this paper can help you plan and optimize your resources for production AI.
This paper was created in partnership with HPE, NVIDIA, WekaIO, and Mellanox.