{"id":61903,"date":"2025-08-06T05:42:07","date_gmt":"2025-08-06T05:42:07","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/61903\/"},"modified":"2025-08-06T05:42:07","modified_gmt":"2025-08-06T05:42:07","slug":"helping-data-storage-keep-up-with-the-ai-revolution-mit-news","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/61903\/","title":{"rendered":"Helping data storage keep up with the AI revolution | MIT News"},"content":{"rendered":"<p>Artificial intelligence is changing the way businesses store and access their data. That\u2019s because traditional data storage systems were designed to handle simple commands from a handful of users at once, whereas today, AI systems with millions of agents need to continuously access and process large amounts of data in parallel. Traditional data storage systems now have layers of complexity, which slows AI systems down because data must pass through multiple tiers before reaching the graphics processing units (GPUs) that are the brain cells of AI.<\/p>\n<p>Cloudian, co-founded by Michael Tso \u201993, SM \u201993 and Hiroshi Ohta, is helping storage keep up with the AI revolution. The company has developed a scalable storage system for businesses that helps data flow seamlessly between storage and AI models. The system reduces complexity by applying parallel computing to data storage, consolidating AI functions and data onto a single parallel-processing platform that stores, retrieves, and processes scalable datasets, with direct, high-speed transfers between storage and GPUs and CPUs.<\/p>\n<p>Cloudian\u2019s integrated storage-computing platform simplifies the process of building commercial-scale AI tools and gives businesses a storage foundation that can keep up with the rise of AI.<\/p>\n<p>\u201cOne of the things people miss about AI is that it\u2019s all about the data,\u201d Tso says. 
\u201cYou can\u2019t get a 10 percent improvement in AI performance with 10 percent more data or even 10 times more data \u2014 you need 1,000 times more data. Being able to store that data in a way that\u2019s easy to manage, and in such a way that you can embed computations into it so you can run operations while the data is coming in without moving the data \u2014 that\u2019s where this industry is going.\u201d<\/p>\n<p>From MIT to industry<\/p>\n<p>As an undergraduate at MIT in the 1990s, Tso was introduced by Professor William Dally to parallel computing \u2014 a type of computation in which many calculations occur simultaneously. Tso also worked on parallel computing with Associate Professor Greg Papadopoulos.<\/p>\n<p>\u201cIt was an incredible time because most schools had one supercomputing project going on \u2014 MIT had four,\u201d Tso recalls.<\/p>\n<p>As a graduate student, Tso worked with MIT senior research scientist David Clark, a computing pioneer who contributed to the internet\u2019s early architecture, particularly the transmission control protocol (TCP) that delivers data between systems.<\/p>\n<p>\u201cAs a graduate student at MIT, I worked on disconnected and intermittent networking operations for large-scale distributed systems,\u201d Tso says. \u201cIt\u2019s funny \u2014 30 years on, that\u2019s what I\u2019m still doing today.\u201d<\/p>\n<p>Following his graduation, Tso worked at Intel\u2019s Architecture Lab, where he invented data synchronization algorithms used by BlackBerry. He also created specifications for Nokia that ignited the ringtone download industry. He then joined Inktomi, a startup co-founded by Eric Brewer SM \u201992, PhD \u201994 that pioneered search and web content distribution technologies.<\/p>\n<p>In 2001, Tso started Gemini Mobile Technologies with Joseph Norton \u201993, SM \u201993 and others. 
The company went on to build the world\u2019s largest mobile messaging systems to handle the massive data growth from camera phones. Then, in the late 2000s, cloud computing became a powerful way for businesses to rent virtual servers as they grew their operations. Tso noticed the amount of data being collected was growing far faster than the speed of networking, so he decided to pivot the company.<\/p>\n<p>\u201cData is being created in a lot of different places, and that data has its own gravity: It\u2019s going to cost you money and time to move it,\u201d Tso explains. \u201cThat means the end state is a distributed cloud that reaches out to edge devices and servers. You have to bring the cloud to the data, not the data to the cloud.\u201d<\/p>\n<p>Tso officially launched Cloudian out of Gemini Mobile Technologies in 2012, with a new emphasis on helping customers with scalable, distributed, cloud-compatible data storage.<\/p>\n<p>\u201cWhat we didn\u2019t see when we first started the company was that AI was going to be the ultimate use case for data on the edge,\u201d Tso says.<\/p>\n<p>Although Tso\u2019s research at MIT began more than two decades ago, he sees strong connections between what he worked on and the industry today.<\/p>\n<p>\u201cIt\u2019s like my whole life is playing back because David Clark and I were dealing with disconnected and intermittently connected networks, which are part of every edge use case today, and Professor Dally was working on very fast, scalable interconnects,\u201d Tso says, noting that Dally is now the senior vice president and chief scientist at the leading AI company NVIDIA. \u201cNow, when you look at the modern NVIDIA chip architecture and the way they do interchip communication, it\u2019s got Dally\u2019s work all over it. 
With Professor Papadopoulos, I worked on accelerating application software with parallel computing hardware without having to rewrite the applications, and that\u2019s exactly the problem we are trying to solve with NVIDIA. Coincidentally, all the stuff I was doing at MIT is playing out.\u201d<\/p>\n<p>Today Cloudian\u2019s platform uses an object storage architecture in which all kinds of data \u2014 documents, videos, sensor data \u2014 are stored as unique objects with metadata. Object storage can manage massive datasets in a flat file structure, making it ideal for unstructured data and AI systems, but it traditionally hasn\u2019t been able to send data directly to AI models without the data first being copied into a computer\u2019s memory system, creating latency and energy bottlenecks for businesses.<\/p>\n<p>In July, Cloudian announced that it has extended its object storage system with a vector database that stores data in a form that is immediately usable by AI models. As the data are ingested, Cloudian computes the vector form of that data in real time to power AI tools like recommender engines, search, and AI assistants. Cloudian also announced a partnership with NVIDIA that allows its storage system to work directly with the AI company\u2019s GPUs. Cloudian says the new system enables even faster AI operations and reduces computing costs.<\/p>\n<p>\u201cNVIDIA contacted us about a year and a half ago because GPUs are useful only with data that keeps them busy,\u201d Tso says. \u201cNow people are realizing it\u2019s easier to move the AI to the data than it is to move huge datasets. 
Our storage systems embed a lot of AI functions, so we\u2019re able to pre- and post-process data for AI near where we collect and store the data.\u201d<\/p>\n<p>AI-first storage<\/p>\n<p>Cloudian is helping about 1,000 companies around the world get more value out of their data, including large manufacturers, financial service providers, health care organizations, and government agencies.<\/p>\n<p>Cloudian\u2019s storage platform is helping one large automaker, for instance, use AI to determine when each of its manufacturing robots needs to be serviced. Cloudian is also working with the National Library of Medicine to store research articles and patents, and the National Cancer Database to store DNA sequences of tumors \u2014 rich datasets that AI models could process to help researchers develop new treatments or gain new insights.<\/p>\n<p>\u201cGPUs have been an incredible enabler,\u201d Tso says. \u201cMoore\u2019s Law doubles the amount of compute every two years, but GPUs are able to parallelize operations on chips, so you can network GPUs together and shatter Moore\u2019s Law. That scale is pushing AI to new levels of intelligence, but the only way to make GPUs work hard is to feed them data at the same speed that they compute \u2014 and the only way to do that is to get rid of all the layers between them and your data.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"Artificial intelligence is changing the way businesses store and access their data. 
That\u2019s because traditional data storage systems&hellip;\n","protected":false},"author":2,"featured_media":61904,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[46],"tags":[45921,45920,191,45919,74],"class_list":{"0":"post-61903","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-computing","8":"tag-ai-storage","9":"tag-cloudian","10":"tag-computing","11":"tag-michael-tso","12":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/61903","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=61903"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/61903\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/61904"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=61903"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=61903"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=61903"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}