Elasticity for Your Data: What It Costs You, and What You Can Do About It
“Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use.”
– Amazon, 2006
Elasticity is one of the most important properties in cloud computing. When we talk about the ability to pay as you go and never pay for resources you don’t use, we’re talking about elasticity. When we talk about the ability to access resources on-demand and to grow and shrink an environment with just a few clicks or an API call, we’re talking about elasticity. Even when we talk about promoting successful applications from test into production using CI/CD, under the covers it’s…elasticity!
A useful way to think about elasticity is through two central outcomes: build in minutes instead of months, and never pay for resources you’re not using. From those two outcomes flow most of the elasticity properties we’re familiar with in the cloud, including:
- Spin up new resources on-demand using a simple API call or a few clicks in the console (see the short sketch after this list).
- Add or remove resources on-demand – more commonly referred to as autoscaling.
- Shift resources across infrastructure or locations to support demand.
- Add resources in new locations to support new demand or growing businesses.
- Automatically manage underlying resources – more commonly known as infrastructure as code (IaC).
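To make the first property concrete, here is a minimal sketch of launching a compute instance on-demand with a single API call, using boto3 against Amazon EC2. It assumes AWS credentials are already configured; the region, AMI ID, and instance type are illustrative placeholders, not recommendations.

```python
# Minimal sketch: spin up a compute resource on-demand with one API call.
# The region, AMI ID, and instance type below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m5.xlarge",         # placeholder instance type
    MinCount=1,
    MaxCount=1,
)
print("Launched:", response["Instances"][0]["InstanceId"])
```

Tearing the instance back down when you’re done is just as simple (a `terminate_instances` call), which is exactly the pay-for-what-you-use behavior the rest of this post is about.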
But hold on a second: most of these properties apply to compute, networking, and some databases, but do they really apply to data storage in the cloud? Let’s have a look.
Cloud object storage has a great story around elasticity when it comes to on-demand capacity scaling. You can easily spin up new Azure Blob Storage containers or Amazon S3 buckets, and they meet whatever capacity you need. Object store capacity grows and shrinks based on application needs. However, as soon as your workloads have any kind of performance requirement, you’re looking at additional services – like Amazon S3 Express One Zone – to supplement the core object store. This proliferation of storage tiers within a cloud deployment adds to the data copy and data management challenges most customers already face.
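As a small illustration of that on-demand capacity story, here is a hedged sketch of creating an S3 bucket with a single boto3 call; the bucket name and region are placeholders. There is nothing to size or resize afterwards – capacity simply follows the data you write.

```python
# Minimal sketch: create an object storage bucket with one API call.
# Capacity is elastic by default; nothing is provisioned up front.
# The bucket name and region are placeholders.
import boto3

s3 = boto3.client("s3", region_name="us-west-2")
s3.create_bucket(
    Bucket="example-elastic-data-bucket",  # placeholder, must be globally unique
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)
```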
Block storage offerings – disks or flash SSD devices attached to your cloud VMs – provide much greater performance, but they come at a significant price premium and mainly serve applications built for block-level storage: databases, analytics, and enterprise apps like SAP, Oracle, and SAS. Beyond those, you’ll need to look elsewhere.
File storage offerings are the workhorse for most applications in your cloud environment – business apps, file sharing and collaboration, HPC, and AI applications. File storage services like Google Filestore, Azure NetApp Files, or Amazon FSx all scale up well; it’s easy to add capacity. However, most cloud file systems don’t offer the ability to scale back down, so in many scenarios you really are paying for capacity you don’t use. Perhaps even worse, scale limits in most cloud file systems force customers to break data sets apart into multiple silos, adding complexity.
So when you look closely, the elasticity that the cloud promises has not been fully realized when it comes to storage. You can scale up…but not down. To scale for performance, you usually have to add extra capacity that you don’t need, or add extra caching and acceleration services on top of the core offering. A mixed workload or workflow that requires a data pipeline often requires two or three different cloud storage solutions, with data copied across each silo.
Data portability is now emerging as an important aspect of elasticity. Starting in 2025, new rules such as the EU Data Act will open a new front in the cloud elasticity conversation. These regulations set a high bar for data protection, data sovereignty, and data portability. For example, the Data Act will require cloud providers to enable their customers “to switch seamlessly (and eventually free of charge) between different cloud providers”. However, other than waiving data egress fees for customers who want to exit their cloud entirely, the options for real data portability are virtually non-existent today.
Here’s how WEKA is bringing compute elasticity to data
When we launched WEKA at AWS re:Invent in 2017, the vision was to bring real elasticity to data storage in the cloud. Over the past seven years, the WEKA team has been innovating alongside our customers in the cloud to deliver on that vision. Here’s how we do it and how customers benefit from true cloud elasticity for their data.
Access resources on-demand: Getting started with WEKA in the cloud is just as easy as spinning up any other cluster of EC2 instances. You can spin up a cluster of the desired size (based on your capacity and performance needs – this easy-to-use guide can help) and deploy the WEKA software to it. It takes about 30 minutes using our Terraform deployment automation tools in AWS, Azure, GCP, or OCI.
Autoscaling…for data: By running on a cluster of instances in the cloud, WEKA enables you to take advantage of the full power and flexibility of compute autoscaling…for data. If you have a bursty workload or need to add more throughput or IO processing to your application, autoscaling groups add more instances to the WEKA cluster, driving greater application performance and lower latency. When the project is complete, autoscaling shrinks the cluster back down to its original size. You can even archive your data and scale the entire WEKA environment away if you want. In this way, you only ever pay for the storage resources you need and never overprovision or add capacity simply to meet a performance target.
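Here is a minimal sketch of what that burst-and-shrink cycle can look like from the infrastructure side, assuming the WEKA backend instances run in an EC2 Auto Scaling group; the group name and instance counts are hypothetical, and in practice WEKA’s own deployment automation manages this for you.

```python
# Minimal sketch, assuming the WEKA backends sit in a hypothetical EC2
# Auto Scaling group named "weka-backends". More instances mean more
# throughput and IOPS for the cluster; fewer instances mean a smaller bill.
import boto3

asg = boto3.client("autoscaling", region_name="us-east-1")

# Burst: grow the cluster beyond its baseline for a heavy workload.
asg.set_desired_capacity(
    AutoScalingGroupName="weka-backends",
    DesiredCapacity=20,      # hypothetical burst size
    HonorCooldown=False,
)

# Project complete: shrink back so you stop paying for instances you
# no longer need.
asg.set_desired_capacity(
    AutoScalingGroupName="weka-backends",
    DesiredCapacity=8,       # hypothetical baseline size
    HonorCooldown=False,
)
```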
Scaling performance and capacity…independently
WEKA combines low-cost object storage and high-performance flash storage in a single namespace. The WEKA software intelligently manages tiering between flash and object storage without the need for you to create rules or policies. This is a no-hassle way to get massive performance for your HPC and AI workloads without overpaying for storage or copying data across multiple storage silos to try to reduce costs. It also enables you to scale the capacity and performance tiers independently and bi-directionally. Need more performance? Scale the instances supporting the WEKA cluster. Need more capacity? Scale the object storage attached to the WEKA namespace. Did the peak performance requirement come and go? Scale the performance tier down (or even completely away) so you don’t pay for expensive flash resources when you’re not using them. Either way, it’s as simple as a few API calls.
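To show how independent the two knobs are, here is a hedged sketch under the same assumption as above (a hypothetical Auto Scaling group named "weka-backends" backing the flash tier). Resizing the performance tier is an explicit call; the capacity tier needs no call at all, because the object store grows and shrinks with the data WEKA tiers into it.

```python
# Minimal sketch: the performance tier is resized explicitly, while the
# capacity tier (the object store behind the namespace) is elastic on its
# own and needs no resize call. Group name and sizes are hypothetical.
import boto3

asg = boto3.client("autoscaling", region_name="us-east-1")

# Performance knob: shrink the flash tier once the peak has passed, while
# keeping headroom to burst again later.
asg.update_auto_scaling_group(
    AutoScalingGroupName="weka-backends",
    MinSize=6,               # hypothetical minimum cluster size
    MaxSize=40,              # hypothetical burst ceiling
    DesiredCapacity=6,       # scaled-down baseline
)

# Capacity knob: nothing to do here. The object tier expands as data is
# tiered down and contracts as data is deleted or archived.
```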
Manage storage infrastructure as code
You can manage your WEKA deployment using Terraform infrastructure as code to provision the cloud resources that support your WEKA environment, deploy a new WEKA environment, or scale an existing cluster, so you can integrate it into your overall data operations processes.
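As a sketch of how that fits into a pipeline, the snippet below drives the standard Terraform CLI from a deployment script. The ./weka-cluster directory and prod.tfvars file are hypothetical placeholders for a Terraform configuration built with WEKA’s deployment automation; the specific modules and variables would come from that configuration, not from this example.

```python
# Minimal sketch: run Terraform from a deployment script or CI job.
# The working directory and var file are hypothetical placeholders.
import subprocess

def terraform(*args: str) -> None:
    """Run a Terraform command inside the cluster configuration directory."""
    subprocess.run(["terraform", *args], cwd="./weka-cluster", check=True)

terraform("init")                                          # fetch providers and modules
terraform("plan", "-var-file=prod.tfvars", "-out=tfplan")  # preview the change
terraform("apply", "tfplan")                               # apply the saved plan
```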
Data portability for hybrid, edge, and multi-cloud deployments
A popular WEKA feature is snap-to-object, which creates a full, self-describing snapshot of the WEKA environment. The snapshot includes all data and metadata, making it very simple to move your WEKA environment whenever you need to. Using snap-to-object, you can move data from an on-prem data center into your cloud provider, or from one cloud provider to another.
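At a high level, the workflow is: snapshot the filesystem, upload the snapshot (data plus metadata) to the attached object store, then rehydrate a filesystem from it at the destination. The sketch below is illustrative only – the exact WEKA CLI command names and arguments vary by version, so treat them as assumptions and check the WEKA documentation for your release.

```python
# Illustrative sketch only: the weka CLI invocations below approximate the
# snap-to-object workflow; exact command syntax may differ by version.
import subprocess

def weka(*args: str) -> None:
    subprocess.run(["weka", *args], check=True)

# On the source cluster: snapshot the filesystem, then push the snapshot
# (data and metadata) to the attached object store.
weka("fs", "snapshot", "create", "projects", "migration-snap")  # hypothetical names
weka("fs", "snapshot", "upload", "projects", "migration-snap")

# On the destination cluster (on-prem, another region, or another cloud),
# a new filesystem is then created from the uploaded snapshot; that step's
# exact command depends on your WEKA version, so it is left as a comment.
```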
WEKA Brings Elasticity to Your Data
When you bring each of these capabilities together into a cohesive whole, you get a solution that delivers on the original promise of elasticity…now for your data. You can build in minutes rather than months – even for high-performance workloads like LLM training, drug discovery, or autonomous vehicle development. You never pay for resources you don’t need or aren’t using. With WEKA, resource overprovisioning is a thing of the past, as is scaling up only to be locked into a larger configuration than you really need.