HPC Cloud (What Is It & What Do You Need To Know About It?)

Rahul Patwardhan. September 24, 2021

Wondering about high-performance computing (HPC) and cloud computing? We explain how high-performance computing works in the cloud.

How do HPC and cloud computing work together? High-performance computing refers to networking several computers together in a cluster and aggregating their computational power to perform complex calculations at high speeds. Cloud computing gives organizations the ability to scale their HPC applications.

What Is High-Performance Computing on the Cloud?

The rise of modern cloud computing has coincided with advances in many technologies that were once considered out of reach. Applications in areas such as machine learning, artificial intelligence (AI), analytics, and storage have advanced rapidly thanks to cloud computing, which in turn has driven demand for faster, more efficient, and more comprehensive cloud environments.

Enter high-performance computing (HPC). HPC is the practice of combining computational power across different computers, servers, or clusters to run high-demand workloads that would be impossible to manage on traditional technology. Unlike a single supercomputer or hardware-accelerated machine, high-performance computing specifically emphasizes distributed resources: it pools storage, applications, computational power, and network resources to accomplish tasks that no single system could handle.
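
The idea of splitting one large job across many workers and then combining the partial results can be shown with a small scatter-gather sketch. This is only an illustration under stated assumptions: the function names are hypothetical, a local thread pool stands in for the nodes of a cluster, and a real HPC system would use a job scheduler or a message-passing library instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, remainder = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `remainder` chunks each take one extra item.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done by one worker (a stand-in for one cluster node)."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(items, n_workers=4):
    """Scatter the chunks to workers, then gather and combine the results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(distributed_sum_of_squares(list(range(10)), n_workers=3))
```

The same pattern (partition the work, compute independently, aggregate) is what lets a cluster handle workloads too large for any one machine.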

In many ways, a cloud-based HPC system can be treated like any other technology stack. It includes the following components:

  • High-Performance Hardware: The underlying hardware of a cloud environment is critical to the success of HPC. Fast-access storage, optimized network channels, accelerated processing, and targeted hardware implementations (e.g., GPU clusters for massively parallel computation) fuel HPC.
  • Flexible Cloud Environments: Cloud infrastructure is notable for its scalability and responsiveness, particularly to rapidly changing conditions or resiliency needs. HPC as a practice is predicated on optimized systems, and a cloud environment that supports changing workloads is critical for success.

    This is especially true when implementing particular cloud configurations. For example, most of us are familiar with public cloud environments and how rapidly they can scale compared to a more expensive, dedicated, and powerful private cloud service. Modern hybrid environments that combine aspects of public and private clouds can power HPC systems even more effectively and efficiently.

  • Buying vs. Building Infrastructure: Building IT infrastructure is costly, time-consuming, and fraught with long-term obligations (compliance, security, maintenance, and continuous monitoring). These burdens grow considerably when you are trying to field a system suitable for HPC. A cloud environment from a third-party provider can offload those challenges without forcing you to compromise on performance or security.

Considering these factors, it is easy to see why so many HPC deployments rely on the cloud.

The Benefits and Challenges of HPC on the Cloud

Many benefits of deploying cloud-based HPC may seem obvious, but the most significant advantages are worth spelling out:

  • Capacity for Heavy Workloads: Modern high-performance computing emphasizes the most computationally challenging tasks like running machine learning algorithms or neural networks, performing genomic sequencing, or analyzing terabytes of data for analytics. Cloud-based HPC supports the computational workloads that make such applications possible.
  • Optimal Cloud Bursting: Not all workloads are created equal, and not all tasks require the same resources from day to day or even hour to hour. Hybrid cloud environments allow for the rapid provisioning of cloud resources through a process of cloud “bursting.” Using bursts, you can fuel your HPC workloads with expanded resources as you need them and then throttle back when those resources are not required.
  • Dedicated Hardware Configurations: HPC cloud providers should offer dedicated hardware configurations to ensure performance. These configurations can include several kinds of hardware: high-speed network controllers, GPU-accelerated processing, or dedicated fiber connectivity for ultra-low latency. HPC cloud providers should give you the opportunity to purchase the infrastructure that you need, not simply what is available.
  • Backups, Recovery, and Resiliency: Data is an important part of an HPC workload. An HPC cloud environment can support impressive performance during computation, backup, and recovery. For example, you can purchase cloud resources that automatically back up to hot and cold storage for long-term security. In the case of hot backups, HPC clouds can support immediate recovery in the event of an emergency or catastrophe.
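
The cloud bursting behavior described in this list can be sketched as a simple sizing rule: run on local nodes first, rent cloud nodes only for the overflow, and release them when the queue shrinks. This is a minimal sketch under assumptions; the function `plan_burst` and all of its parameters are hypothetical and do not correspond to any provider's API.

```python
import math


def plan_burst(queued_jobs, jobs_per_node, on_prem_nodes, max_cloud_nodes):
    """Return how many cloud nodes to run for the current job queue.

    All parameters are illustrative assumptions:
      queued_jobs     -- jobs currently waiting
      jobs_per_node   -- jobs one node can handle at a time
      on_prem_nodes   -- nodes available locally (private side of the hybrid cloud)
      max_cloud_nodes -- budget cap on rented public cloud nodes
    """
    nodes_needed = math.ceil(queued_jobs / jobs_per_node)
    overflow = max(0, nodes_needed - on_prem_nodes)  # burst only past local capacity
    return min(overflow, max_cloud_nodes)            # never exceed the budget cap


if __name__ == "__main__":
    for queue in (50, 100, 1000):
        print(queue, plan_burst(queue, jobs_per_node=10, on_prem_nodes=8, max_cloud_nodes=5))
```

Calling the function again as the queue drains returns a smaller number, which is the "throttle back" half of bursting.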

As with any IT system, there are always challenges as well. Some of these challenges include the following:

  • Cost: HPC cloud infrastructure is an investment. While it can prove more cost-effective in the long run, there is no way around committing significant resources to cloud HPC for your organization. Of course, by the time you are considering cloud HPC, you are probably already aware of this and willing to treat it as an investment.
  • Data Egress: Even with high-performance systems, moving data to and from the cloud takes time. Using hybrid or private cloud environments can mitigate some of this challenge.
  • Organization of Cloud Resources: While streamlining cloud file systems or applications evenly across private and public clouds seems appealing, actually configuring those systems to function effectively is complicated. You or your provider should have a handle on the ins and outs of your system and how to allocate resources and responsibilities to different cloud nodes or services.
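
To get a feel for the data movement challenge, a back-of-the-envelope estimate helps. The sketch below assumes decimal units (1 TB = 10^12 bytes), a hypothetical link speed, and an assumed efficiency factor for protocol overhead; none of these numbers come from a specific provider.

```python
def transfer_hours(data_terabytes, link_gbps, efficiency=0.8):
    """Estimate hours needed to move data over a network link.

    data_terabytes -- dataset size in decimal terabytes (assumption: 1 TB = 1e12 bytes)
    link_gbps      -- nominal link speed in gigabits per second
    efficiency     -- assumed fraction of the nominal speed actually achieved
    """
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # 10 TB over a 1 Gbps link at 80% efficiency
    print(round(transfer_hours(10, 1), 1), "hours")
```

Even at full nominal speed, 10 TB over a 1 Gbps link takes roughly 22 hours, which is why keeping data close to the compute matters.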

Most organizations look to established cloud providers like Google, Microsoft, or Amazon for general-purpose HPC. These environments can support many workloads for things like analytics and data analysis. However, specific applications will often call for tailored solutions with purpose-built software to facilitate complex processing tasks.

Power Your Innovative Research Using HPC Cloud Systems from WEKA

Not all high-performance computing systems are created equal. While general-purpose vendors might work well for general-purpose applications, hundreds of research groups and businesses are working at the cutting edge of innovation. Applications in life sciences, genomic sequencing, machine learning, AI, and advanced analytics need an expertly designed HPC cloud system that specifically meets their needs.

WEKA builds cloud-based infrastructure to support these innovations. From hardware configurations to our cloud-based WekaFS, we support the most advanced cloud workloads happening today.

Critical features of the WEKA system include the following:

  • Streamlined and fast cloud file systems to combine multiple sources into a single high-performance computing system
  • Industry-best GPUDirect performance (113 Gbps for a single DGX-2 and 162 Gbps for a single DGX A100)
  • In-flight and at-rest encryption for GRC requirements
  • Agile access and management for edge, core, and cloud development
  • Scalability up to exabytes of storage across billions of files

Contact our experts today to learn more about how WEKA can support your revolutionary research with high-performance cloud systems.
