Flow Big or Go Home: Move Over Data Lakes, Hello Data Oceans
2023 will see the growing adoption of AI in the enterprise, paving the way for the emergence of the Data Ocean
“The future will soon be a thing of the past.” – George Carlin
As any month, quarter, or year comes to a close, one must always plan for what’s next. Traditionally in our industry, many luminaries, pundits, and analysts mark the end of a year by sharing their predictions for the coming one. In the words of the great Yogi Berra, “it’s tough to make predictions, especially about the future.” Some technology predictions sound so obvious or cliche that they lose credibility, and that is the risk anyone takes when sharing opinions about the year to come. In fact, this next prediction may make you feel as though you just read the ever-popular statement, “data will grow in 2023,” but there is a great deal more to it, because it lays the foundation for another important enterprise trend we’ll see in the coming year.
The Growing Rise of Enterprise AI
In 2023 we believe we will continue to see the rise of AI in the enterprise. Too cliche for you? Before you skip ahead to the next prediction, consider the long and relatively stable growth AI has had over the past 65 years. Yes, AI has existed for roughly 65 years, which may surprise many of you. The first uses of AI date to the mid-1950s and built on the framework laid out in “Computing Machinery and Intelligence,” the 1950 paper by mathematician and computer scientist Alan Turing. Many believe the start of AI can be traced back to 1955, when the Logic Theorist project, designed to mimic the problem-solving skills of the human mind, launched with funding from the RAND Corporation. For the next several decades, cost kept AI mostly confined to specialty labs funded by government or private research firms. That changed in the mid-2000s, when GPUs started being used for applications requiring complex, simultaneous processing in an enterprise IT environment. Then, in 2009, researchers at Stanford University published a paper about the gains GPUs brought to machine learning applications. Over time, GPUs have been adopted for massively parallel processing of complex computations across demanding artificial intelligence, machine learning, and technical computing workloads.
When looking at Gartner’s Top Strategic Technology Trends for 2023, AI and Adaptive AI are two of the main highlights. With regard to AI, we will certainly see use cases expand across AI trust, risk, detection, security, and beyond. But it is the emergence of Adaptive AI that could have market-changing implications for many of the organizations adopting it. What is Adaptive AI? According to Gartner, “Adaptive AI systems support a decision-making framework centered around making faster decisions while remaining flexible to adjust as issues arise. These systems aim to continuously learn based on new data at runtime to adapt more quickly to changes in real-world circumstances. The AI engineering framework can help orchestrate and optimize applications to adapt to, resist or absorb disruptions, facilitating the management of adaptive systems.” Whereas traditional AI models were honed for accuracy by the AI team during training, adaptive AI continuously retrains its models on real-time feedback, learning from new data as it arrives.
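To make the contrast between a model frozen after training and one that keeps learning at runtime concrete, here is a minimal sketch in Python. It is a toy illustration only, not Gartner’s framework or any vendor’s implementation: the names OnlineLinearModel, make_stream, and run_demo, the learning rate, and the synthetic data are all assumptions made for the example. Two identical models learn the same relationship; the relationship then shifts, and only the “adaptive” model keeps updating on each new observation.

```python
import random


class OnlineLinearModel:
    """Toy one-feature linear model trained one sample at a time (online SGD)."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.b = 0.0
        self.lr = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        # One gradient step on the squared error for a single observation.
        error = self.predict(x) - y
        self.w -= self.lr * error * x
        self.b -= self.lr * error


def make_stream(n, slope, rng, noise=0.05):
    """Yield n synthetic (x, y) pairs following y = slope * x plus small noise."""
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, slope * x + rng.gauss(0.0, noise)


def run_demo(seed=0, n=2000):
    rng = random.Random(seed)
    static_model = OnlineLinearModel()
    adaptive_model = OnlineLinearModel()

    # Phase 1: both models learn the same relationship (slope 2.0).
    for x, y in make_stream(n, 2.0, rng):
        static_model.update(x, y)
        adaptive_model.update(x, y)

    # Phase 2: real-world conditions change (slope becomes -1.0).
    # The static model is frozen; the adaptive model keeps learning.
    static_sq_err = 0.0
    adaptive_sq_err = 0.0
    for x, y in make_stream(n, -1.0, rng):
        static_sq_err += (static_model.predict(x) - y) ** 2
        adaptive_sq_err += (adaptive_model.predict(x) - y) ** 2
        adaptive_model.update(x, y)  # learn only after predicting

    return static_sq_err / n, adaptive_sq_err / n


if __name__ == "__main__":
    static_mse, adaptive_mse = run_demo()
    print(f"static model MSE after shift:   {static_mse:.3f}")
    print(f"adaptive model MSE after shift: {adaptive_mse:.3f}")
```

In this sketch the adaptive model recovers within a few hundred observations, while the frozen model stays wrong for the rest of the stream. The same loop also hints at why adaptive systems lean so heavily on the data pipeline: every prediction is followed by another round of learning, so fresh data has to keep arriving.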
Moreover, with the greater availability and accessibility of GPU technology, we believe AI adoption will rise across the enterprise in 2023. One data point signaling growth in the GPU space is Oracle’s 2022 announcement of its plans to add tens of thousands more Nvidia GPUs, including the A100 and the upcoming H100, to OCI.
Unleashing the power of GPUs in the enterprise is like unlocking supercomputing capabilities once found only in national labs and high-end academic research facilities. The challenge many face with this kind of unbridled power is having not only enough data to feed the GPUs, but also a data pipeline that can deliver it. For about a decade we have used the term “data lake” to describe a central location where a variety of applications can access data efficiently, without creating multiple copies of the same data in order to extract value from it. While data lakes have proven quite useful for general file workloads and other workloads not sensitive to latency, the GPU appetite for data may well exceed what a data lake and the typical protocol stack can deliver for the AI workloads of the future. Studies have shown GPUs may be left idle for up to 70% of the time because data cannot be delivered to the cores fast enough. To take full advantage of GPU power, the volume and velocity of data the AI engines demand will require a data delivery platform that keeps these GPUs busy, which in turn could provide continuous analytics for organizations.
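The arithmetic behind that idle-time figure can be sketched with a simple back-of-the-envelope model in Python. The model assumes data delivery is the only bottleneck, and the function names and throughput numbers are hypothetical, chosen purely for illustration rather than taken from any benchmark or study.

```python
def gpu_utilization(supply_gb_per_s, demand_gb_per_s):
    """Fraction of time a GPU stays busy when data delivery is the only bottleneck.

    supply_gb_per_s: rate at which the pipeline can deliver data to the GPU.
    demand_gb_per_s: rate at which the GPU could consume data if never starved.
    """
    if supply_gb_per_s < 0 or demand_gb_per_s <= 0:
        raise ValueError("supply must be >= 0 and demand must be > 0")
    return min(1.0, supply_gb_per_s / demand_gb_per_s)


def idle_fraction(supply_gb_per_s, demand_gb_per_s):
    """Fraction of time the GPU sits waiting for data."""
    return 1.0 - gpu_utilization(supply_gb_per_s, demand_gb_per_s)


def required_supply(num_gpus, demand_per_gpu_gb_per_s, target_utilization=1.0):
    """Aggregate delivery rate needed to hold num_gpus at the target utilization."""
    return num_gpus * demand_per_gpu_gb_per_s * target_utilization


if __name__ == "__main__":
    # Hypothetical numbers: a GPU able to consume 10 GB/s, fed at only 3 GB/s.
    print(f"idle fraction: {idle_fraction(3.0, 10.0):.0%}")
    # Hypothetical 8-GPU server where each GPU can consume 10 GB/s.
    print(f"supply needed: {required_supply(8, 10.0):.0f} GB/s")
```

Under these assumptions, a GPU fed at 30% of the rate it can consume spends about 70% of its time waiting, and the delivery rate needed to avoid that grows linearly with the number of GPUs.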
The Emergence of the Data Ocean
This leads to the next prediction for 2023: the expansion of the data lake into the data ocean. More than simply a larger expanse in which to retain data, a data ocean also opens the “floodgates,” delivering data through a pipeline to the ever-hungry cores of the GPUs. While this may be a slower growth area than the rise of enterprise AI, the evolution to the enterprise data ocean will inevitably follow given the emergence of Adaptive AI.
In closing, while we used “data is growing in 2023” in a tongue-in-cheek way to set up these predictions, the truth behind that statement is far more complex. The game changer for 2023 will not be the sheer quantity of enterprise data available, but the quality of insights that can be mined from all that data. Adaptive AI and the business outcomes enterprises seek will help shape the future for many years to come. However, in order for AI to fuel progress, we must first ensure the AI engines of the world are fed by data pipelines running at an optimal velocity to keep the GPU cores busy without wasting valuable compute cycles in the process. Solving that challenge is where the data ocean will prove to be valuable.