The Fast and The Spurious
Andy Watson, Chief Technology Officer at WekaIO. November 18, 2019
Recently Panasas, a decades-old file system company, caught our attention at WekaIO by claiming to have the “fastest parallel file system at any price point.” Their supporting materials for the newly-fledged PanFS™ provided comparisons with BeeGFS, IBM ESS, Lustre, and their own 20-year-old Panasas ActiveStor.
Hey — now, wait just a minute! At WekaIO, we happen to have the world’s fastest file system. And yes, it’s a parallel file system, too. I could throw a few other adjectives around to describe its wonderfulness, but my point here is that Panasas was out of line to: (1) claim to have the fastest file system; and (2) conveniently leave us off that list.
Panasas ought to know better. Unlike their approach to making grand claims, Weka always documents our performance leadership with industry-standard benchmarks. For example, we’ve published many results carefully reviewed by SPEC, STAC, and the Virtual Institute for I/O (which presides over the IO-500 supercomputing-oriented file service benchmark). In stark contrast, Panasas has exactly zero industry-standard benchmark results to substantiate their marketing-driven assertions.
But I suppose it’s understandable not to want to compare their new PanFS in any disadvantageous context. Their old Panasas ActiveStor Ultra was getting long in the tooth, and something had to be done. Their new architecture is formulaic: small files are placed on ordinary (SAS?) SSDs; large files go on high-capacity HDDs; a smattering of somewhat more expensive NVMe SSDs is reserved for metadata; and lastly, there is an NVDIMM-based intent log that will inevitably remind a great many people of the good old days at NetApp or Nimble. There isn’t a single new idea there. And what’s worse, it reflects a mindset oriented toward speeds-and-feeds derived from hardware specs.
If only we lived in a world where you could expect to get what the hardware is capable of delivering!
And that’s exactly why the architects of benchmarks like SPEC SFS, the IO500, and STAC put so much time and effort into simulating the real-world characteristics of application workloads. Whether you are running one intense workload over and over, or a mixed workload with many applications competing for resources, the result arriving at your file storage infrastructure inevitably will frustrate simplistic arithmetic based on the transfer rates for Reads and Writes from storage media.
PanFS is said to have been designed to marshal its multiple layers of storage media, with their different performance and capacity characteristics, for different purposes. What we do not yet know is what happens when a workload hammers away at a zillion files of mixed sizes all in the same directory, or sprinkled randomly across a zillion subdirectories (not created during a contiguous time interval) at multiple depths of a deep directory tree spanning the cluster. (I’m using “zillion” here to represent a suitably large number of indefinite scale, depending on your environment. Lately at WekaIO, we regularly encounter customers putting many millions of files into the same directory, and occasionally even upwards of a billion. But we know that may be extreme in Panasas’s experience of the market.)
Let’s take a look at some of the numbers Panasas has provided with their announcement. More complete details were available in the comparison made to IBM ESS (including a few specs for the IBM system we had to find elsewhere), so that is the one excerpted here.
You can see below that these are two very different hardware platforms. IBM is a dense HDD-based system, while Panasas is a hybrid system with a higher proportion of SSD: each 4U “Ultra” building block = 24 HDD + 8 SSD. If the analysis goal was to selectively compare the metric of performance per HDD, which looks favorable to Panasas here, then consider the converse: performance per SSD shows IBM coming out significantly ahead. But it’s all a little silly given that we don’t even know whether the HDDs were 5,400 rpm or 15,000 rpm, or what block sizes were used. Further, we all know that for dense platforms the network connectivity ends up being the bottleneck. Panasas has 8x25Gbit of network connectivity (as much as 25 GB/s of raw pipe) for every 24 HDD + 8 SSD, so the fact that they are driving 4 GB/s per chassis is hardly something to crow about. Simply taking aggregate performance and dividing by the number of drives in a super-dense box is not a meaningful performance claim. Maybe if IBM had only half-filled their 336-drive system they would still have hit 24 GB/s, putting them ahead of Panasas.
Consider also that HPC buyers have many other metrics. Density, for example, is very important for HPC environments, and you would need a lot more rack space for the Panasas system. Furthermore, the IBM system is all-inclusive, while Panasas requires four additional Director nodes, bringing their full footprint to 18U for 96 HDDs.
| System | HDD | SSD | Ratio HDD:SSD | Rack U | Perf Claims (MB/s) | Perf per HDD (MB/s) | Perf per SSD (MB/s) | Perf per Rack U (MB/s) |
|---|---|---|---|---|---|---|---|---|
| IBM GL4S | 334 | 2 | 167:1 | 20 U | 24,000 | 72 | 12,000 | 1,200 |
| Panasas Ultra w/ PanFS | 96 | 32 | 3:1 | 16 U | 12,465 | 130 | 390 | 779 |
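To make the metric-slicing point concrete, here is a minimal sketch (in Python, not from the original post) that derives the per-HDD, per-SSD, and per-rack-U figures from nothing but the drive counts, rack units, and aggregate throughput claims. The system names and numbers come from the table above; everything else is illustrative.

```python
# Derive per-component throughput metrics from aggregate claims.
# Drive counts, rack units, and MB/s figures are taken from the table above.
systems = {
    "IBM GL4S": {"hdd": 334, "ssd": 2, "rack_u": 20, "mb_per_s": 24_000},
    "Panasas Ultra w/ PanFS": {"hdd": 96, "ssd": 32, "rack_u": 16, "mb_per_s": 12_465},
}

for name, s in systems.items():
    per_hdd = s["mb_per_s"] / s["hdd"]     # looks good for Panasas
    per_ssd = s["mb_per_s"] / s["ssd"]     # looks good for IBM
    per_u = s["mb_per_s"] / s["rack_u"]    # looks good for IBM
    print(f"{name}: {per_hdd:.0f} MB/s per HDD, "
          f"{per_ssd:.0f} MB/s per SSD, {per_u:.0f} MB/s per rack U")
```

The same aggregate number flatters whichever denominator you pick, which is exactly why dividing by a single component count is not a meaningful performance claim.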
Ultimately, though, these comparisons and arbitrary performance claims are not helpful to anybody. Without an understanding of the workload, the clients running the applications that create it, the file layout, file system administrative decisions (e.g., one big communal file system or many separate mount points?), and the configuration of the network, these aren’t useful metrics. They’re just numbers.
Bottom line: it’s all well and good to recite the performance numbers for the storage media layers in a system architecture, but until there are some legitimate, published benchmarks of merit, the claim of being the “fastest” is simply bogus.
To learn more about WekaIO’s performance, visit the WekaIO website and search on “performance” in the “white papers” asset category.