The Evolving Landscape of AI Data Storage: Beyond the File System
AI training, a cornerstone of modern technological advancement, requires massive datasets and the ability to process them with unprecedented speed and efficiency. Traditionally, this has relied heavily on file systems, the backbone of data storage for decades. However, the rapid evolution of AI models and frameworks is calling the long-term suitability of this approach into question.
Jeff Denworth, co-founder of VAST Data, recently sparked a debate by arguing that “no one needs a file system for AI training.” His assertion, posted on X, highlights the growing adoption of object storage solutions within the AI ecosystem.
The Case for Multi-Protocol Storage
While it’s true that established players like DDN, NetApp, Pure, and WEKA are still prevalent in AI deployments, Denworth emphasizes the need for a more versatile approach. He points out that the reliance on file systems alone can become a major barrier to future-proofing investments, as AI frameworks evolve at a rapid pace.
“It’s not binary, it’s evolutionary. Historically, all of the AI training frameworks required a POSIX/file interface. Only companies developing their own frameworks would consider using object storage, and this is limited to the best of the best.”
– Jeff Denworth
Denworth advocates for “multi-protocol” storage solutions that seamlessly integrate both file system and object storage paradigms. This allows organizations to leverage the familiarity and functionality of file systems while simultaneously benefiting from the scalability, durability, and cost-effectiveness of object storage.
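To make the multi-protocol idea concrete, here is a minimal sketch, assuming a storage platform that exposes the same dataset over both a file-system mount and an S3-compatible endpoint. The mount path, endpoint URL, bucket, and key are hypothetical placeholders, not references to any specific product.

```python
# Illustrative only: one dataset reached two ways, as a POSIX file and as an
# S3 object. Paths, endpoint, bucket, and key are hypothetical placeholders.
import boto3

# File-system view: the dataset as an AI framework expecting POSIX would see it.
with open("/mnt/training/shards/shard-0000.tar", "rb") as f:
    file_bytes = f.read()

# Object view: the same bytes addressed through the S3 API on the same platform.
s3 = boto3.client("s3", endpoint_url="https://storage.example.internal")
obj = s3.get_object(Bucket="training", Key="shards/shard-0000.tar")
object_bytes = obj["Body"].read()

# On a genuinely multi-protocol system these are one copy of the data, not two.
assert file_bytes == object_bytes
```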
Object Storage Takes Center Stage
Recent advancements in object storage technology, including GPUDirect-like access facilities from Cloudian, MinIO, Nvidia, and Scality, are enabling direct data access from GPUs without the need for traditional data movement. This paradigm shift paves the way for even more efficient AI training.
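As a rough sketch of what training directly from object storage can look like at the framework level, the loop below streams shards from an S3-compatible endpoint into a PyTorch IterableDataset rather than staging them on a local file system first. The endpoint, bucket, prefix, and shard format are assumptions; GPUDirect-style access would additionally move data into GPU memory without the CPU copy that plain boto3 performs.

```python
# Minimal sketch: stream training shards straight from S3-compatible object
# storage. Endpoint, bucket, and shard layout are hypothetical; this shows the
# data path only, not a GPUDirect-style zero-copy transfer.
import io

import boto3
import torch
from torch.utils.data import DataLoader, IterableDataset


class S3ShardDataset(IterableDataset):
    def __init__(self, endpoint, bucket, prefix):
        self.s3 = boto3.client("s3", endpoint_url=endpoint)
        self.bucket, self.prefix = bucket, prefix

    def __iter__(self):
        # List shards under the prefix and yield tensors parsed from each object.
        resp = self.s3.list_objects_v2(Bucket=self.bucket, Prefix=self.prefix)
        for entry in resp.get("Contents", []):
            body = self.s3.get_object(Bucket=self.bucket, Key=entry["Key"])["Body"]
            yield torch.load(io.BytesIO(body.read()))  # assumes tensor-serialized shards


loader = DataLoader(
    S3ShardDataset("https://s3.example.internal", "training-data", "shards/"),
    batch_size=None,
)
for batch in loader:
    pass  # forward/backward pass would go here
```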
Denworth’s insight confirms the growing trend: top-tier AI models are increasingly being trained directly from object storage.
“Of the top-tier (top ten worldwide) models I know of: VAST is being used for a very prominent model exclusively on VAST S3 at CoreWeave. We have a few other top-tier names starting to experiment. Azure Blob is being used for a very prominent model. Nvidia is training a very prominent model on S3-compatible storage.”
– Jeff Denworth
The Future of AI Data Storage
The shift towards object storage in AI training signifies a fundamental change in the data landscape. This shift empowers developers to leverage the latest advancements in hardware and software, ultimately accelerating the pace of AI innovation. As AI models become increasingly complex and data-intensive, the need for reliable, scalable, and cost-effective storage solutions will only grow.
Organizations that embrace multi-protocol storage architectures, such as those pioneered by VAST Data and others, will be well-positioned to navigate the evolving AI ecosystem and unlock the full potential of their data.
AI’s Next Frontier: Beyond Chatbots
ChatGPT’s remarkable capabilities have ushered in a new era of generative AI, sparking a wave of excitement and speculation about its transformative potential. While chatbot applications undoubtedly capture the imagination, industry experts argue that the true impact of AI will extend far beyond conversational interfaces, necessitating a fundamental shift in how businesses manage and leverage data.
The Limitations of Chatbots as the AI End Game
Jeff Denworth, co-founder of VAST Data, a company specializing in AI-driven data storage and processing, posits that while chatbots represent a meaningful advancement, they are merely a glimpse into the broader landscape of AI applications. “It’s always possible to integrate a solution; that never means it’s practical or efficient,” he asserts.
Denworth argues that the ability to process and analyze vast volumes of data in real time will be crucial for businesses seeking to harness the full potential of AI. He envisions a future where “AI embedding models can understand recency and relevance of all data as it’s being chunked and vectorized … where all data will be vectorized [with] trillions of vectors that need to be searchable in constant time regardless of vector space size.”
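As a simplified illustration of the chunk-and-vectorize pipeline described above, the sketch below splits documents into chunks, embeds them with a placeholder embedding function, and answers a query with a brute-force cosine-similarity scan over NumPy arrays. A real deployment would swap in an actual embedding model and a scalable vector index; every name here is a stand-in.

```python
# Toy chunk -> embed -> search pipeline. embed() is a stand-in for a real
# embedding model, and the flat scan stands in for a scalable vector index.
import numpy as np


def chunk(text, size=200):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(chunks, dim=384):
    """Placeholder embedding: pseudo-random unit vectors keyed by chunk hash."""
    rng = np.random.default_rng([abs(hash(c)) % (2**32) for c in chunks])
    vectors = rng.normal(size=(len(chunks), dim))
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)


documents = ["... enterprise documents, logs, tickets, transcripts ..."]
chunks = [c for doc in documents for c in chunk(doc)]
index = embed(chunks)                       # shape: (num_chunks, dim)

query_vector = embed(["quarterly revenue by region"])[0]
scores = index @ query_vector               # cosine similarity (unit vectors)
top = np.argsort(scores)[::-1][:5]          # best-matching chunks
print([chunks[i] for i in top])
```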
Scaling for the AI-Powered Enterprise
Denworth highlights VAST’s Disaggregated, Shared-Everything (DASE) architecture as a key component in enabling this future. DASE allows for the independent scaling of compute and storage resources, providing the adaptability and performance necessary to handle the demands of AI workloads. “A system that can manage ingestion of hundreds of thousands to millions of files per second, process them and index them in real time … and also instantaneously propagate all data updates to the index so enterprises never see stale data. A system that doesn’t need expensive memory-based indices because legacy partitioning approaches are not efficient,”
he explains.
He further emphasizes the need for enterprise-grade data sources that can handle massive data volumes and ensure data integrity. “The underlying data sources need to be scalable AND enterprise grade … not sure where else you get this other than VAST,” he states.
The Rise of AI Agents and the Need for Agile Data Management
The emergence of “AI agents” – autonomous software entities that can perform complex tasks – is poised to revolutionize business operations. Nvidia, for instance, plans to deploy over 100 million AI agents to augment its workforce over the next few years.
“You don’t think this will push boundaries of legacy storage and database systems?” Denworth asks.
Denworth points to the increasing demand for computational resources, exemplified by Microsoft and BlackRock’s recent announcement of a joint fund dedicated to scaling AI infrastructure. He argues that this trend, coupled with the rise of “System Two” computing, which emphasizes long-term reasoning and complex problem-solving, will necessitate a paradigm shift in data management practices.
“The Stargate announcement will be the first of many…. This is not exclusively for training. System Two/Long-Thinking is going to change the world’s relationship with data and compel the need for even larger volumes of data,” he predicts.
Looking Ahead: A Future of Innovation
Despite the significant progress already made, Denworth remains convinced that the journey of AI innovation is far from over. “I can confidently say that we have the most inventive and most aspiring team in the business. Each customer interaction gives us more inspiration for the next ten years,” he states.
While he is hesitant to reveal future plans in detail, Denworth emphasizes that VAST Data will continue to push the boundaries of what’s possible in data management, driven by its commitment to empowering businesses with the tools they need to unlock the full potential of AI.
The rise of generative AI presents both exciting possibilities and critical challenges. As businesses navigate this dynamic landscape, a strategic approach to data management that prioritizes scalability, agility, and security will be paramount.
Computational Storage: Moving Beyond Traditional Limits
Computational storage is revolutionizing how we approach data processing by integrating compute resources directly into storage systems.
This approach challenges the traditional paradigm in which data must be moved out of storage to the servers that process it. Instead, computational storage allows applications to run directly within the storage array, unlocking a new level of performance and efficiency.
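To make that contrast concrete, here is a hedged sketch: the conventional path copies an entire object to the client before filtering it, while a computational-storage path pushes the filter down so only matching records leave the array. The run_filter() call is a hypothetical interface used for illustration, not any vendor’s actual API.

```python
# Conceptual contrast between host-side filtering and storage-side (pushdown)
# filtering. storage.run_filter() is a hypothetical computational-storage call;
# the point is where the work happens, not the exact syntax.
import json


def filter_on_host(storage, key, predicate):
    """Conventional path: move the whole object, then filter on the client."""
    raw = storage.get_object(key)                 # entire object crosses the network
    records = [json.loads(line) for line in raw.splitlines()]
    return [r for r in records if predicate(r)]


def filter_in_storage(storage, key, predicate_expr):
    """Computational-storage path: the array evaluates the filter next to the data."""
    return storage.run_filter(key, predicate_expr)  # only matches cross the network
```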
The Shift From DAS to Shared Access
The core benefit of computational storage lies in its ability to enable shared data access across multiple machines.
“Shared data access across machines is tantamount to what we do. Modern machinery needs real-time access to petabytes to exabytes of data to get a global data understanding. You can’t pin that data to any one host. Where and how those functions run is just a packaging exercise … we like efficiency so the more we can collapse, the better … but DAS is the opposite of how we think. Disaggregation is not just possible, we’ve shown the world that it’s very practical to getting to radical levels of data access and data processing parallelism.”
This quote highlights a key difference between computational storage and direct-attached storage (DAS). While DAS restricts data access to a single server, computational storage promotes shared access, crucial for modern applications requiring global data understanding.
Sizing Compute Resources in Computational Storage Arrays
Determining the optimal compute resource size for a computational storage array is a complex task.
Several factors play a role, including the following (a back-of-the-envelope sizing sketch follows the list):
- I/O load
- Query load
- Function velocity
- Event notification activity
- QoS management
- RAS (reliability, availability, serviceability)
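As an illustration of how such factors might be weighed, here is a deliberately crude sizing sketch. Every coefficient is an invented placeholder rather than vendor guidance; real sizing would rely on measured workload profiles.

```python
# Back-of-the-envelope compute sizing for a computational storage array.
# All coefficients below are invented placeholders, for illustration only.

def estimate_cores(io_gbps, queries_per_sec, functions_per_sec,
                   events_per_sec, qos_overhead=0.10, ras_overhead=0.15):
    """Very rough core-count estimate from the workload factors listed above."""
    cores = (
        io_gbps * 0.5                  # I/O servicing cost per GB/s
        + queries_per_sec / 1_000      # query processing
        + functions_per_sec / 500      # function execution
        + events_per_sec / 10_000      # event notification handling
    )
    # QoS management and RAS (reliability, availability, serviceability) are
    # modeled as fractional overheads on top of the working cores.
    return cores * (1 + qos_overhead + ras_overhead)


print(round(estimate_cores(io_gbps=200, queries_per_sec=50_000,
                           functions_per_sec=5_000, events_per_sec=100_000)))
```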
“We’re learning more about sizing every day,” says a leading expert in computational storage. “I’m not sure we’ve got it all figured out since each new release is adding substantially new capability. This keeps the performance team on its toes … but we’re trying.”
The Evolution of Computational Storage
Computational storage is a rapidly evolving field with continuous advancements.
As technology progresses, the line between storage and compute will continue to blur, leading to even more elegant and efficient data management solutions.
The future of data processing lies in embracing the possibilities of computational storage. By integrating compute resources directly into storage systems, we can unlock unprecedented levels of performance, efficiency, and scalability, empowering organizations to harness the full potential of their data.