{"id":114106,"date":"2025-09-02T21:38:07","date_gmt":"2025-09-02T21:38:07","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/114106\/"},"modified":"2025-09-02T21:38:07","modified_gmt":"2025-09-02T21:38:07","slug":"world-models-an-old-idea-in-ai-mount-a-comeback","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/114106\/","title":{"rendered":"\u2018World Models,\u2019 an Old Idea in AI, Mount a Comeback"},"content":{"rendered":"<p class=\"BodyA\">The latest ambition of artificial intelligence research \u2014 particularly within the labs seeking \u201cartificial general intelligence,\u201d or AGI \u2014 is something called a world model: a representation of the environment that an AI carries around inside itself like a computational snow globe. The AI system can use this simplified representation to evaluate predictions and decisions before applying them to its real-world tasks. The deep learning luminaries Yann LeCun (of Meta), Demis Hassabis (of Google DeepMind) and Yoshua Bengio (of Mila, the Quebec Artificial Intelligence Institute) all believe world models are essential for building AI systems that are truly <a href=\"https:\/\/www.quantamagazine.org\/new-theory-suggests-chatbots-can-understand-text-20240122\/\" rel=\"nofollow noopener\" target=\"_blank\">smart<\/a>, <a href=\"https:\/\/www.quantamagazine.org\/ai-changes-science-and-math-forever-20250430\/\" rel=\"nofollow noopener\" target=\"_blank\">scientific<\/a> and <a href=\"https:\/\/www.quantamagazine.org\/what-does-it-mean-to-align-ai-with-human-values-20221213\/\" rel=\"nofollow noopener\" target=\"_blank\">safe<\/a>. <\/p>\n<p class=\"BodyA\">The fields of psychology, robotics and machine learning have each been using some version of the concept for decades. You likely have a world model running inside your skull right now \u2014 it\u2019s how you know not to step in front of a moving train without needing to run the experiment first. <\/p>\n<p class=\"BodyA\">So does this mean that AI researchers have finally found a <a href=\"https:\/\/www.quantamagazine.org\/what-the-most-essential-terms-in-ai-really-mean-20250430\/\" rel=\"nofollow noopener\" target=\"_blank\">core concept<\/a> whose meaning everyone can agree upon? As a famous physicist <a href=\"https:\/\/www.google.com\/books\/edition\/Surely_You_re_Joking_Mr_Feynman\/_gA_DwAAQBAJ\" rel=\"nofollow noopener\" target=\"_blank\">once wrote<\/a>: Surely you\u2019re joking. A world model may sound straightforward \u2014 but as usual, <a href=\"https:\/\/lingo.csail.mit.edu\/blog\/world_models\" rel=\"nofollow noopener\" target=\"_blank\">no one can agree on the details<\/a>. What gets represented in the model, and to what level of fidelity? Is it innate or learned, or some combination of both? And how do you detect that it\u2019s even there at all?<\/p>\n<p class=\"BodyA\">It helps to know where the whole idea started. 
So does this mean that AI researchers have finally found a [core concept](https://www.quantamagazine.org/what-the-most-essential-terms-in-ai-really-mean-20250430/) whose meaning everyone can agree upon? As a famous physicist [once wrote](https://www.google.com/books/edition/Surely_You_re_Joking_Mr_Feynman/_gA_DwAAQBAJ): Surely you're joking. A world model may sound straightforward — but as usual, [no one can agree on the details](https://lingo.csail.mit.edu/blog/world_models). What gets represented in the model, and to what level of fidelity? Is it innate or learned, or some combination of both? And how do you detect that it's even there at all?

It helps to know where the whole idea started. In 1943, a dozen years before the term "artificial intelligence" was coined, a 29-year-old Scottish psychologist named Kenneth Craik published an [influential monograph](https://www.google.com/books/edition/The_Nature_of_Explanation/wT04AAAAIAAJ?hl=en) in which he mused that "if the organism carries a 'small-scale model' of external reality … within its head, it is able to try out various alternatives, conclude which is the best of them … and in every way to react in a much fuller, safer, and more competent manner." Craik's notion of a mental model or simulation presaged the "[cognitive revolution](https://www.cs.princeton.edu/~rit/geo/Miller.pdf)" that transformed psychology in the 1950s and still rules the cognitive sciences today. What's more, it directly linked cognition with computation: Craik considered the "power to parallel or model external events" to be "the fundamental feature" of both "neural machinery" and "calculating machines."

The nascent field of artificial intelligence eagerly adopted the world-modeling approach. In the late 1960s, an AI system called [SHRDLU](https://en.wikipedia.org/wiki/SHRDLU) wowed observers by using a rudimentary "block world" to answer commonsense questions about tabletop objects, like "Can a pyramid support a block?" But these handcrafted models couldn't scale up to handle the complexity of more realistic settings. By the late 1980s, the AI and robotics pioneer Rodney Brooks had given up on world models completely, famously asserting that "the world is its own best model" and "explicit representations … simply get in the way."
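To get a feel for what "handcrafted" meant in practice, consider a toy block world in the spirit of SHRDLU. The real system parsed full English sentences and planned the movements of a simulated arm; the shapes, sizes and rules below are invented purely for illustration.

```python
"""A toy block world in the spirit of SHRDLU. These shapes, sizes and
rules are invented for illustration, not SHRDLU's actual logic."""

# Handcrafted knowledge: which shapes expose a flat top, and how big
# each one is. Every fact here had to be typed in by a programmer.
FLAT_TOP = {"block": True, "box": True, "pyramid": False}
SIZE = {"block": 2, "box": 3, "pyramid": 1}

def can_support(lower, upper):
    # Two handwritten commonsense rules: the lower object needs a flat
    # top, and it must be at least as large as whatever it carries.
    return FLAT_TOP[lower] and SIZE[lower] >= SIZE[upper]

print(can_support("block", "pyramid"))  # -> True
print(can_support("pyramid", "block"))  # -> False: nothing balances on a point
```

Every fact and rule has to be typed in by hand, which is precisely the brittleness Brooks was reacting against: the model breaks the moment the world contains something its programmer never anticipated.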
It took the rise of machine learning, especially deep learning based on artificial neural networks, to breathe life back into Craik's brainchild. Instead of relying on brittle hand-coded rules, deep neural networks could build up internal approximations of their training environments [through trial and error](https://worldmodels.github.io/) and then use them to accomplish narrowly specified tasks, such as driving a virtual race car. In the past few years, as the large language models behind chatbots like ChatGPT began to demonstrate [emergent capabilities](https://www.quantamagazine.org/the-unpredictable-abilities-emerging-from-large-ai-models-20230316/) that they weren't explicitly trained for — like inferring movie titles from strings of emojis, or [playing the board game Othello](https://arxiv.org/abs/2210.13382) — world models provided a convenient explanation for the mystery. To prominent AI experts such as Geoffrey Hinton, Ilya Sutskever and Chris Olah, it was obvious: Buried somewhere deep within an LLM's thicket of virtual neurons must lie "a small-scale model of external reality," just as Craik imagined.

The truth, at least so far as we know, is less impressive. Instead of world models, today's generative AIs appear to learn "bags of heuristics": scores of disconnected rules of thumb that can approximate responses to specific scenarios, but don't cohere into a consistent whole. (Some may actually contradict each other.) It's a lot like the parable of the blind men and the elephant, where each man only touches one part of the animal at a time and fails to apprehend its full form. One man feels the trunk and assumes the entire elephant is snakelike; another touches a leg and guesses it's more like a tree; a third grasps the elephant's tail and says it's a rope. When researchers [attempt](https://arxiv.org/abs/2406.03689v3) to recover evidence of a world model from within an LLM — for example, a coherent computational representation of an Othello game board — they're looking for the whole elephant. What they find instead is a bit of snake here, a chunk of tree there, and some rope.

Of course, such heuristics are hardly worthless. LLMs can encode untold sackfuls of them within their trillions of parameters — and as the old saw goes, quantity has a quality all its own. That's what makes it possible to train a language model to generate nearly perfect directions between any two points in Manhattan without learning a coherent world model of the entire street network in the process, as researchers from Harvard University and the Massachusetts Institute of Technology [recently discovered](https://arxiv.org/abs/2406.03689).

So if bits of snake, tree and rope can do the job, why bother with the elephant? In a word, robustness: When the researchers threw their Manhattan-navigating LLM a mild curveball by randomly blocking 1% of the streets, its performance cratered. If the AI had simply encoded a street map whose details were consistent — instead of an immensely complicated, corner-by-corner patchwork of conflicting best guesses — it could have easily rerouted around the obstructions.
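The rerouting advantage is easy to see in miniature. The sketch below uses a made-up four-by-four grid as a stand-in for Manhattan; it reflects nothing of the actual Harvard and MIT experiments, only the underlying point that a single consistent map turns a street closure into a trivial re-search.

```python
"""A minimal sketch of why a coherent map makes rerouting easy. The 4x4
grid standing in for Manhattan, and the closed street, are made up."""
from collections import deque

N = 4  # a tiny 4x4 street grid

def neighbors(cell, blocked):
    x, y = cell
    for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        in_bounds = 0 <= nxt[0] < N and 0 <= nxt[1] < N
        if in_bounds and frozenset((cell, nxt)) not in blocked:
            yield nxt

def route(start, goal, blocked=frozenset()):
    # Breadth-first search over one consistent map: replanning around a
    # closed street is just the same search with an edge removed.
    frontier, came_from = deque([start]), {start: None}
    while frontier:
        cur = frontier.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for nxt in neighbors(cur, blocked):
            if nxt not in came_from:
                came_from[nxt] = cur
                frontier.append(nxt)
    return None  # no route survives the closures

print(route((0, 0), (3, 3)))  # the unobstructed route
closed = {frozenset(((1, 0), (2, 0)))}
print(route((0, 0), (3, 3), closed))  # same search, detours around the closure
```

A bag of heuristics offers no such recourse: thousands of memorized turn-by-turn fragments share no common graph underneath, so there is nothing to search again when a street closes.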
Given the benefits that even simple world models can confer, it's easy to understand why every large AI lab is desperate to develop them — and why academic researchers are increasingly interested in [scrutinizing them](https://www.worldmodelworkshop.org/), too. Robust and verifiable world models could uncover, if not the El Dorado of AGI, then at least a scientifically plausible tool for extinguishing AI hallucinations, enabling reliable reasoning, and increasing the interpretability of AI systems.

That's the "what" and "why" of world models. The "how," though, is still anyone's guess. Google DeepMind and OpenAI are betting that with enough "multimodal" training data — like video, 3D simulations, and other input beyond mere text — a world model will spontaneously congeal within a neural network's statistical soup. Meta's LeCun, meanwhile, thinks that an entirely new (and non-generative) AI architecture will provide the necessary scaffolding. In the quest to build these computational snow globes, no one has a crystal ball — but the prize, for once, may just be worth the hype.