WEKA CHI eBook Data Moving and Storage

Recent years have seen a boom in data generated by a variety of new sources: connected devices, IoT, analytics, healthcare, smartphones, and much more. In fact, as of 2020, 90% of all data ever created had been created in the previous four years. Gaining insights from this data presents a tremendous opportunity for organizations to grow their businesses, expand more quickly into new markets, and advance research in fields such as healthcare and climate, to name just a few. However, the challenge of managing the sheer volume of data being generated, coupled with the need to glean insights from it more quickly, has created an infrastructure nightmare. Organizations have been reporting unstructured data growth of over 50% year over year (Gartner, 2018), while 79% of enterprise executives agree that failing to extract value and insight from this data will lead to extinction for their businesses (Accenture, 2018). This data management problem is particularly acute in artificial intelligence/machine learning (AI/ML), life sciences (including genomics and microscopy), financial analytics, high-performance computing (HPC), and anywhere else that combines extreme compute requirements with the need to store and analyze massive amounts of data.