BeeGFS Parallel File System Explained

August 25, 2020

What is BeeGFS?
Why BeeGFS?
Disadvantages of BeeGFS
BeeGFS & AI
Comparing Parallel File Systems BeeGFS vs. WEKA

What is BeeGFS?

BeeGFS is a parallel cluster file system developed with a strong focus on performance and designed for easy installation and management. It originated in 2005 as an internal project at the Fraunhofer Center for HPC and was originally known as the Fraunhofer file system (FhGFS).

If I/O-intensive workloads are your problem, BeeGFS is often proposed as a solution because of its parallelism. A BeeGFS-based storage system is currently ranked #7 on the IO500 list, behind systems based on Lustre, WEKA, and Intel DAOS.

Why BeeGFS?

BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can scale the performance and capacity of the file system to the level you need, from small clusters up to enterprise-class systems with thousands of nodes. Like the Lustre file system, BeeGFS separates data services from metadata services. Once a client has received the metadata for a file from the metadata servers, it can access the data directly on the storage servers. Because data traffic does not have to pass through a single file server, this provides higher performance than traditional NAS systems.
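The idea of spreading one file across several servers can be illustrated with a simplified round-robin striping model. This is only a sketch of the general technique, not BeeGFS's actual implementation: the function names, the target names, the 512 KiB chunk size, and the four-target layout are all assumptions chosen for illustration.

```python
from collections import defaultdict


def target_for_offset(offset, chunk_size, targets):
    """Return the (hypothetical) storage target holding the byte at `offset`.

    Chunks are assigned to targets in simple round-robin order.
    """
    chunk_index = offset // chunk_size
    return targets[chunk_index % len(targets)]


def bytes_per_target(file_size, chunk_size, targets):
    """Count how many bytes of a file of `file_size` bytes land on each target."""
    totals = defaultdict(int)
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        totals[target_for_offset(offset, chunk_size, targets)] += length
        offset += length
    return dict(totals)


if __name__ == "__main__":
    # Illustrative values only; real deployments configure these per directory or file.
    targets = ["storage-1", "storage-2", "storage-3", "storage-4"]
    chunk = 512 * 1024

    # A 4 MiB file is split into 8 chunks, 2 per target.
    print(bytes_per_target(4 * 1024 * 1024, chunk, targets))
    # A 4 KiB file fits in one chunk and touches a single target.
    print(bytes_per_target(4 * 1024, chunk, targets))
```

In this model a large file can be read from all four targets in parallel, while a file smaller than one chunk lives on a single target and gains nothing from striping.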

Disadvantages of BeeGFS

BeeGFS is an open source project designed to cater to academic HPC environments, and it lacks many of the features required in an enterprise environment. BeeGFS suffers from the following limitations:

Does not support any kind of data protection such as erasure coding or distributed RAID.
Does not have file encryption, either at rest or in flight.
Has no native support for NVMe-over-Fabrics; a third-party NVMe-over-Fabrics layer costs extra.
Needs separate management and metadata servers.
Is limited by legacy storage interfaces such as SAS, SATA, and FC.
Does not support enterprise features such as snapshots, backup, and data tiering.
Does not support enterprise protocols such as Network File System (NFS) or SMB (these require separate services).

BeeGFS & AI

As noted previously, BeeGFS splits data and metadata into separate services, allowing HPC clients to communicate directly with the storage servers. This was common practice for parallel file systems developed in the past and is similar to both Lustre and IBM Spectrum Scale (GPFS). While separating data and metadata services was a significant improvement for large-file I/O, it created a scenario in which the metadata services became the bottleneck. Newer workloads in AI and machine learning (ML) are very demanding on metadata services, and many of their files are tiny (4KB or below). Consequently, the metadata server is often the performance bottleneck, and users do not enjoy the design benefits of a parallel file system like BeeGFS. The IO500 numbers for BeeGFS reflect this: it could not reach high IOPS, achieving a lower score on the md (metadata) test than on the bw (bandwidth) test.
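A rough back-of-the-envelope calculation shows why small files shift the load onto metadata services. The sketch below assumes, purely for illustration, that reading a file costs at least one metadata operation (`ops_per_file=1`); real file systems may need several, and the function name is hypothetical.

```python
GIB = 1024 ** 3


def metadata_ops_per_gib(file_size_bytes, ops_per_file=1):
    """Metadata operations needed to read 1 GiB stored as equal-sized files.

    `ops_per_file` is an assumed cost (one lookup/open per file), not a
    measured property of any particular file system.
    """
    files_per_gib = GIB // file_size_bytes
    return files_per_gib * ops_per_file


if __name__ == "__main__":
    for label, size in [("4 KiB", 4 * 1024), ("1 MiB", 1024 ** 2), ("1 GiB", GIB)]:
        ops = metadata_ops_per_gib(size)
        print(f"{label:>6} files: {ops:>7} metadata ops per GiB")
```

Under this assumption, the same gigabyte of data costs 262,144 metadata operations when stored as 4 KiB files but only one when stored as a single file, so the metadata server saturates long before the storage servers do.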

AI and ML workloads also require small-file access with extremely low latency. Unfortunately, BeeGFS does not support newer network protocols such as NVMe-over-Fabrics or NVIDIA® GPUDirect® Storage, which deliver extremely low latency to GPU-based systems. The result is that expensive GPU resources are starved of I/O, leading to long epoch times and poor GPU utilization.

Additionally, most mainstream enterprise customers expect a level of data protection that BeeGFS was never designed for. BeeGFS is commonly referred to as a scratch-space file system, which means that after a major crash the analysis is simply restarted, with no consideration for data protection. For many ML use cases, the cost of data acquisition is so high that the data has to be fully protected. Imagine if the entire training set for an autonomous vehicle were lost: it would take millions of dollars and many person-years to replace. Consequently, enterprise customers look for table-stakes features that BeeGFS does not offer.

Some common enterprise tasks that are not possible with BeeGFS are:

User authentication – Guard against destructive actions, such as a disgruntled employee deleting a whole training set (it happens)
Snapshots – Commonly used to save specific training runs for comparison with others
Backup – Immutable copies of data that can be retrieved at a later date
Disaster recovery – Saving data from a major disaster and ensuring it can be recovered
Encryption – Protect sensitive data (such as patient MRI or X-ray images) from threats or rogue actors
Containerization – Integrate with container services for stateful storage
Quotas – Ensure groups do not consume excessive storage due to bad practices

Bottom line: BeeGFS was designed as a file system for research environments, and it does not scale to the needs of commercial high performance computing (HPC), which includes AI and ML.

Comparing Parallel File Systems BeeGFS vs. WEKA

Architecture	ThinkParQ (BeeGFS)	WEKA
Small Footprint Configuration	5 servers in 9RU	8 servers in 4RU
# of Server Nodes	2 to hundreds	8 to Thousands
Supported Storage Interfaces	Legacy SAS, SATA, FC	Natively NVMe
NVMe over Fabric	3rd-Party Add-on	Built-In
Optimized for Mixed Workloads	No	Yes
Protocol Support
POSIX	Yes	Yes
GPUDirect Storage	No	Yes
NFS	No	Yes
SMB	No	Yes, SMB 2.1
S3	No	Yes
Filesystem
Directories per Directory	No Data from Vendor	6.4T
Files per Directory	No Data from Vendor	6.4B
File Size	No Data from Vendor	4PB
Filesystem Size	No Data from Vendor	8EB (512PB on Flash)
Snapshots	No Data from Vendor	Thousands
CSI Plugin for Kubernetes	No	Yes
Security
Data Encryption	No	At-Rest and In-Flight
Performance
Read Throughput	25.2GB/s, 20 servers	56GB/s, 8 Servers
Write Throughput	24.8GB/s, 20 servers	20GB/s, 8 Servers
Read IOPS	No Data from Vendor	5.8M
Write IOPS	No Data from Vendor	1.6M
Single Mount Point, Full Coherency	No Data from Vendor	82GB/s
#1 on IO500 and SPEC	No	Yes
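Because the two throughput figures in the table above were reported on different server counts, it can help to normalize them per server. The sketch below does only that arithmetic on the vendor-reported numbers from the table; the dictionary layout and function name are illustrative, and a per-server average ignores differences in hardware, network, and test method.

```python
# Vendor-reported figures copied from the table above: (GB/s, server count).
REPORTED = {
    "BeeGFS": {"read": (25.2, 20), "write": (24.8, 20)},
    "WEKA": {"read": (56.0, 8), "write": (20.0, 8)},
}


def per_server_throughput(total_gb_s, servers):
    """Average throughput per server in GB/s, rounded to two decimals."""
    return round(total_gb_s / servers, 2)


if __name__ == "__main__":
    for system, figures in REPORTED.items():
        for op, (total, servers) in figures.items():
            rate = per_server_throughput(total, servers)
            print(f"{system:>6} {op:<5}: {rate} GB/s per server")
```

On these figures, reads work out to 1.26 GB/s per server for BeeGFS and 7.0 GB/s per server for WEKA, and writes to 1.24 and 2.5 GB/s per server respectively.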
