WEKA

Cloud Storage vs Local Storage: Pros and Cons of Each

What is cloud storage?

Cloud storage is a method for storing data in a remote or offsite location that is hosted by a third party provider. Typically data stored in the cloud is accessed over the internet or across the network operated by the cloud provider.

What are the benefits of cloud storage?

The primary benefit that comes with cloud storage is extremely low cost made possible by massive scale. Cloud providers typically build extremely large data centers with many exabytes of capacity or more, and deliver the physical storage as a shared resource to organizations. As a result, cloud providers are able to build physical data storage at costs far lower than individual enterprises or data centers could typically deliver and pass some of these savings on to their customers.

Cloud storage is elastic, meaning that organizations can add or remove capacity whenever they need to and usually pay only for the data storage capacity they use. Cloud data storage is typically simple to use, is highly flexible, and can be integrated into many applications and workflows using the CLIs and APIs developed by the cloud storage provider.
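To make the pay-for-what-you-use idea concrete, here is a minimal sketch in Python. It compares billing on the capacity actually used each month with paying for peak capacity every month. The usage figures, the price, and the function names are all hypothetical placeholders, and the same unit price is used on both sides purely to isolate the effect of elasticity; it is not a real cost comparison between cloud and local storage.

```python
def pay_as_you_go_cost(used_tb_per_month, price_per_tb_month):
    """Total cost when each month is billed on the capacity actually used."""
    return sum(tb * price_per_tb_month for tb in used_tb_per_month)


def fixed_capacity_cost(used_tb_per_month, price_per_tb_month):
    """Total cost when capacity for the peak month is paid for every month."""
    peak_tb = max(used_tb_per_month)
    return peak_tb * price_per_tb_month * len(used_tb_per_month)


# Hypothetical usage that grows and then shrinks over six months (in TB).
usage = [10, 20, 40, 80, 40, 20]
PRICE = 20  # placeholder price per TB per month, not a real quote

elastic_total = pay_as_you_go_cost(usage, PRICE)
fixed_total = fixed_capacity_cost(usage, PRICE)
print(elastic_total, fixed_total)
```

The gap between the two totals grows as usage becomes more variable, which is why elasticity matters most for workloads with uneven demand.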

What are some challenges of cloud storage?

The major challenge of cloud storage is the extra complexity involved in securing and controlling access to data in the cloud. Cloud providers typically follow a shared security model (often called a shared responsibility model), which means that the cloud provider ensures the security of the underlying physical infrastructure, while individual customers are responsible for the security of their own data. In this model, each customer must implement their own data security strategy in a way that meets their unique policies and requirements, using a combination of capabilities from the cloud provider and from third party data and security vendors. This shared model often leads to added complexity, particularly for enterprises, for organizations in highly regulated industries (like healthcare and finance), and for organizations based in locations with specific legal and regulatory requirements (such as GDPR in the European market).

Cloud storage is also generally considered lower performing than on-premises or local data storage because of the distributed architecture common among cloud providers, which adds latency and limits the bandwidth and data IO available to applications.

What types of cloud storage are available?

There are three primary types of cloud storage available to organizations today. For collaboration scenarios, organizations tend to rely on cloud file storage solutions that enable multiple users within an organization to access files and data from a common repository, typically over the internet.

For very large scale projects, organizations typically rely on cloud object storage solutions, which come with the lowest cost per unit of capacity and provide elastic scaling that makes it simple to add many individual data objects. Cloud object storage is the most commonly used method for storing data in the cloud to support typical cloud computing use cases.

For projects that require high performance storage in the cloud, organizations rely on data storage that is locally attached to the compute resources, usually in the form of memory or flash disks. This approach helps address the performance challenges common in other cloud storage solutions by essentially delivering local storage, that is, storage that is closely coupled (or local) with the compute resources. This minimizes the latency and network penalties that stem from distributed cloud computing.

What are the best cloud storage use cases?

Organizations use cloud storage in many different scenarios, most commonly for analytics and data lakes, backup and disaster recovery, long-term archive, collaboration, application data, IoT, and machine learning and AI.

Machine learning and AI

Machine learning and AI development projects (such as generative AI, natural language processing, autonomous vehicle development, neural network development, and speech recognition) often rely on large data sets to support models that now use millions of parameters or more. With cloud storage, organizations can build, train, and tune their AI and machine learning models entirely in the cloud, accelerating their overall time to market for new models and new ML and AI applications.

Data lakes and analytics

By taking advantage of the massive scale, elasticity, and low cost of object storage in the cloud, organizations can build petabyte-scale data repositories (data lakes) and run analytics and queries against their data for a wide range of uses.

Backup and archive

Cloud storage is an ideal strategy for backup and long-term archive requirements. It provides organizations with a low-cost approach to storing large quantities of data off-premises.

Disaster recovery

In the same way that cloud storage is very useful for backup and archive, it is also useful for disaster recovery. Most major cloud providers offer data resiliency capabilities (such as data replication) built into their services, and they enable organizations to architect multi-region data storage strategies so that data is protected in the event of a physical failure either on the customer premises or in the primary cloud datastore.

Collaboration

Cloud storage is an excellent option for organizations who work on projects, applications, or data that require collaboration. This could be teams collaborating on documents, artists developing new creative projects, legal or regulatory teams collaborating on a brief or a legal filing, healthcare experts collaborating on the best treatment strategy for a patient, and many other scenarios.

IoT and edge data

Organizations use the cloud to process, store, and analyze data generated by IoT and edge devices such as sensors, imagers, cameras, microscopes, videos and more. Data can be captured and processed locally and then stored in the cloud for further processing, analysis, and long-term retention.

What is local storage?

Local storage is a method for storing computer data on resources that are on the organization's premises (in its data center), rather than in a remote location (such as the cloud).

What are the benefits of local storage?

Local data storage is most useful for organizations who have applications that need very high storage performance, or have very strict security, regulatory, and access control requirements. Organizations who rely on local data storage are able to build architectures that locate the data very close to their compute resources, reducing latency, and enabling much greater bandwidth and IO. For this reason, many high-performance compute (HPC) applications for research labs, drug discovery operations, and other highly specialized workloads rely on local data storage.

Organizations who deploy local storage are also able to fully control the physical and logical data resources, enabling them to deliver a consistent security model that they fully control. Many organizations who must meet tight security or regulatory requirements, such as SOC 1, SOC 2, HIPAA, PCI DSS, or FIPS 140, continue to rely on local data storage.

What are some disadvantages of local storage?

The primary challenges with local storage stem from the fact that organizations must procure, deploy, and manage their own data storage. Procurement cycles for physical data storage can take a year or more and can require a large upfront cost on the order of millions of dollars. Once the storage is deployed, these organizations must dedicate resources to monitoring and managing the infrastructure, which takes away from resources that could be used to develop new capabilities or offerings for customers.

What are the types of local storage?

Local data storage tends to come in three varieties – block storage, file storage, and object storage, with block and file being the most commonly deployed technologies in the data center.

Block storage

Block storage organizes data into units of a uniform, logical size called "blocks". Each block is tagged with a unique identifier and is stored wherever is most efficient for the storage system. Blocks are aggregated into volumes, which can be accessed by applications. Applications access individual blocks on the storage system directly, enabling block storage systems to deliver extremely low latency for large volumes of data. However, they lack the typical metadata structures required by most HPC and AI applications.
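The block model can be illustrated with a short Python sketch. This is a toy, in-memory volume written for illustration only, not how any particular storage system is implemented; the 4096-byte block size and the names `BlockVolume` and `locate` are assumptions made for the example.

```python
BLOCK_SIZE = 4096  # bytes per block; an assumed size for illustration


class BlockVolume:
    """A toy volume: a fixed number of equally sized blocks addressed by index."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block stays the same size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        return self.blocks[index]


def locate(byte_offset):
    """Translate a byte offset in the volume to (block index, offset in block)."""
    return divmod(byte_offset, BLOCK_SIZE)


volume = BlockVolume(num_blocks=8)
volume.write_block(2, b"hello")
block_index, offset_in_block = locate(10_000)
print(block_index, offset_in_block)
```

Note that the volume knows nothing about file names, owners, or types. It only maps an index to a fixed-size chunk of bytes, which is the sense in which block storage carries little metadata of its own.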

File storage

File storage (or a file system) organizes data in a hierarchical format of files and folders. Files vary in size and are tagged with extensive metadata for easy identification and retrieval. Applications access data in a file system using a protocol or interface such as NFS, SMB, or POSIX. This is the most recognizable form of data storage because it is the one used in personal computers, where files, folders, and directories are familiar.
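As a small illustration of the hierarchy and per-file metadata described above, the following Python sketch creates a nested folder structure in a temporary directory and reads back some of the file's metadata through the operating system's ordinary file interface. The folder and file names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Build a nested path: folders form the hierarchy, the file sits at a leaf.
    report = Path(root) / "projects" / "genomics" / "report.txt"
    report.parent.mkdir(parents=True)
    report.write_text("sample results")

    # The file system tracks metadata such as size and timestamps per file.
    info = report.stat()
    relative_path = report.relative_to(root).as_posix()
    size_bytes = info.st_size
    suffix = report.suffix
    print(relative_path, size_bytes, suffix)
```

The path itself carries meaning here: the file is found by walking the folder hierarchy, in contrast with the index-based access of block storage.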

Object storage

Object storage is a storage architecture used to contain extremely large unstructured datasets. These systems rely on a distinct unit or “object” that contains the data, metadata, and a unique identifier for easy access and retrieval. Because objects are stored in a flat address space (as opposed to a hierarchy), they can elastically scale with the data set. This approach enables very low cost storage that scales to very large capacities, making it a better candidate for use in cloud storage rather than local storage.
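The object model (data, metadata, and a unique identifier in a flat address space) can be sketched in a few lines of Python. This is a toy, in-memory illustration and not the API of any real object storage service; the class name `ObjectStore`, its methods, and the sample metadata are hypothetical.

```python
import uuid


class ObjectStore:
    """A toy object store: one flat mapping from unique ID to object."""

    def __init__(self):
        self._objects = {}

    def put(self, data, metadata):
        # Each object bundles its data and metadata under a generated ID.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = {"data": data, "metadata": dict(metadata)}
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj["metadata"].get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
scan_id = store.put(b"...image bytes...", {"type": "scan", "patient": "A17"})
log_id = store.put(b"...log bytes...", {"type": "log"})
scans = store.find(type="scan")
print(len(scans))
```

Because there is no folder hierarchy to maintain, adding another object is just another entry in the flat mapping, which hints at why this design scales to very large data sets.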

When to use a combination of cloud and local storage

Most organizations today use a combination of cloud and local storage to meet their business needs; this is most commonly referred to as hybrid cloud storage. Hybrid cloud storage is an architecture that uses both cloud storage and data center storage (local storage) together to meet an overall objective such as disaster recovery or cloud bursting.