Today, we are excited to announce a new set of SPECstorage™ Solution 2020 benchmark results showcasing the WEKA® Data Platform’s exceptional performance capabilities in hyperscale cloud environments.

As noted in a previous blog, WEKA’s philosophy is to be as transparent as possible. We use highly structured and audited benchmarks whenever possible and include detailed configuration information when providing internal testing results so customers can expect repeatable results if they deploy the same configurations. SPEC benchmarks provide the details to allow this repeatability.

Unprecedented Performance: The Record-Breaking Benchmark Results

First up, a new SPEC_ai_image result focused on value. A recent vendor submission in this category using a cloud hyperscaler showed a result with a load point of 400 jobs and an Overall Response Time (ORT) of 1.22ms. WEKA produced 700 jobs at an ORT of 0.85ms.

Using the pricing calculators for Amazon Web Services and Azure, we also discovered that WEKA’s infrastructure cost was significantly less than that of the benchmarked competitor.

Benchmark | Infrastructure | Results
SPEC_ai_image | Azure | WEKA running in Microsoft Azure did 175% the number of AI_jobs at 64% of the infrastructure cost, resulting in a 2.7x lower cost per job based on infrastructure costs alone
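The cost-per-job figure follows directly from the two ratios quoted above. As a minimal sketch of that arithmetic, using only the numbers given in this post (700 vs. 400 jobs, and 64% of the infrastructure cost), the following Python snippet reproduces it. The function name and the choice to express cost as a ratio rather than in dollars are illustrative assumptions, not part of the SPEC submission:

```python
def relative_cost_per_job(jobs_a, jobs_b, cost_ratio_a_to_b):
    """Compare system A against system B.

    jobs_a, jobs_b: load points (jobs) reported for each system.
    cost_ratio_a_to_b: A's infrastructure cost as a fraction of B's.
    Returns (job ratio, A's cost per job as a fraction of B's).
    """
    job_ratio = jobs_a / jobs_b
    # Cost per job scales with total cost and inversely with jobs completed.
    cost_per_job_ratio = cost_ratio_a_to_b / job_ratio
    return job_ratio, cost_per_job_ratio


# Figures quoted in this post: 700 vs. 400 jobs, 64% of the infrastructure cost.
job_ratio, cpj_ratio = relative_cost_per_job(700, 400, 0.64)
print(f"jobs: {job_ratio:.0%} of the comparison result")
print(f"cost per job: {1 / cpj_ratio:.1f}x lower")
```

With these inputs the job ratio is 1.75 (175%) and the cost per job works out to roughly 2.7x lower, matching the table entry.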

While cost efficiency is a valuable measure, sometimes outright speed is needed. WEKA then deployed a larger cluster, with zero tuning changes between the different benchmarks, to showcase lightning-fast performance across all IO profiles. The results speak for themselves.

Benchmark | Infrastructure | Results
SPEC_ai_image | AWS | #1 position. WEKA delivers a 6x higher load count (2400) than another cloud-based benchmark at only 3/4 the cost per job. Net: Big or small, in multiple clouds, WEKA is faster and has a better cost per job.
SPEC_vda | AWS | #1 position. WEKA already held the #1 spot from two years ago with a compact on-prem system (8000 streams). We beat it with 12000 streams in the cloud. Net: On-prem or in the cloud, WEKA has the highest-performing video benchmark result.
SPEC_genomics | AWS | #1 position. WEKA took the #1 spot from a small, specialist vendor in this benchmark, achieving 2200 jobs.
SPEC_eda_blended | AWS | #1 position. WEKA delivered 6310 jobs at 0.87ms ORT, a 60% lower ORT in the cloud vs. an on-prem NVMe-based all-flash system.
SPEC_sw_build | AWS | #2 position. WEKA achieved 3500 builds with an ORT of 0.74ms, compared to an on-prem NVMe-based all-flash system that achieved 6120 builds at 1.58ms ORT. WEKA notably maintained sub-0.9ms latency at all load points.

What makes these results so interesting isn’t just that WEKA is either the #1 or #2 result in all the SPECstorage 2020 benchmarks – it’s the impact of being able to handle any IO profile with zero tuning changes between benchmarks. Read, write, metadata-intensive, IOPS-driven and throughput-driven IO are all handled equally efficiently. For customers, this advantage in handling any IO profile shows up as real wall-clock reductions in the time-to-completion of data pipelines. It enables AI workloads to checkpoint often without stalling model training and increases utilization rates of GPUs attached to WEKA.

Conclusions:
WEKA is highly performant whether in the cloud or on-prem. With the hyperscalers continually improving the capabilities of their back-end infrastructure, the public cloud can now provide comparable results to on-prem for storage. This gives customers the confidence to deploy their applications and workflows in the infrastructure that best meets their needs without compromising performance.

Everyone wins when the WEKA Data Platform’s performance breaks records. It inspires companies to challenge themselves to ‘up their game’ while removing legacy constraints that have bottlenecked the most performant AI workloads. Technologies get upgraded, organizations have more options, and big problems have the potential to get solved faster than ever before. But we’re not just talking about infrastructure solutions. At WEKA, we’re committed to eliminating the compromises of the past by redefining modern storage performance in the cloud and AI era. Watch this space as we keep breaking records while building the AI-native data platform of the future. Here’s to breaking down barriers and shattering records for many years to come.

UPDATE: In a prior version of this blog, we went beyond what is allowed in the SPECstorage™ 2020 reporting rules. For SPECstorage 2020, the only items that may be compared are the load point value and ORT. Anything else, such as applying latency as an extrapolation, is an estimation and should not be used. SPEC specifically calls this out at https://www.spec.org/fairuse.html. With our sincere apologies, we have updated the blog on the recent WEKA results.
