{"id":115641,"date":"2025-08-28T06:41:08","date_gmt":"2025-08-28T06:41:08","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/115641\/"},"modified":"2025-08-28T06:41:08","modified_gmt":"2025-08-28T06:41:08","slug":"how-do-you-teach-an-ai-model-to-reason-with-humans","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/115641\/","title":{"rendered":"How Do You Teach an AI Model to Reason? With Humans"},"content":{"rendered":"<p>AI models are advancing at a rapid rate and scale.<\/p>\n<p>But what might they lack that (most) humans don\u2019t? Common sense: an understanding, developed through real-world experiences, that birds can\u2019t fly backwards, mirrors are reflective and ice melts into water.<\/p>\n<p>While such principles seem obvious to humans, they must be taught to AI models tasked with accurately answering complex questions and navigating unpredictable physical environments, such as industrial warehouses or roads.<\/p>\n<p>NVIDIA is tackling this challenge by developing a set of tests to coach AI models on the limitations of the physical world. 
In other words, to teach AI common sense.<\/p>\n<p>These tests are used to develop <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/ai-reasoning\/\" rel=\"nofollow noopener\">reasoning models<\/a> such as <a target=\"_blank\" href=\"https:\/\/build.nvidia.com\/nvidia\/cosmos-reason1-7b\" rel=\"nofollow noopener\">NVIDIA Cosmos Reason<\/a>, an open reasoning <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/vision-language-models\/\" rel=\"nofollow noopener\">vision language model<\/a> (VLM) used for <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/generative-physical-ai\/\" rel=\"nofollow noopener\">physical AI<\/a> applications that are proficient in generating temporally grounded responses. Cosmos Reason just topped the <a target=\"_blank\" href=\"https:\/\/huggingface.co\/spaces\/facebook\/physical_reasoning_leaderboard\" rel=\"nofollow noopener\">physical reasoning leaderboard<\/a> on Hugging Face.<\/p>\n<p>Cosmos Reason is unique compared with previous VLMs as it\u2019s designed to accelerate physical AI development for fields such as robotics, autonomous vehicles and smart spaces. The model can infer and reason through unprecedented scenarios using physical common-sense knowledge.<\/p>\n<p>For models to understand complex environments \u2014 including industrial spaces and laboratories \u2014 they must start small. 
For example, in the test depicted below, the Cosmos Reason model is tasked with answering a multiple-choice question about the relative motion in the video:<\/p>\n<p style=\"text-align: center;\">Example from Cosmos Reason evaluation dataset<\/p>\n<p>What Does Reasoning Look Like for an AI Model?\u00a0<\/p>\n<p>To develop their reasoning capabilities, NVIDIA models are being taught physical common sense about the real world via <a href=\"https:\/\/blogs.nvidia.com\/blog\/supervised-unsupervised-learning\/\" rel=\"nofollow noopener\" target=\"_blank\">reinforcement learning<\/a>.<\/p>\n<p>For example, robots don\u2019t intuitively know which way is left, right, up or down. They\u2019re taught these spatial-temporal limitations through training. AI-powered robots used in safety testing, such as vehicle crash testing, must be taught to be aware of how their physical forms interact with their surroundings.<\/p>\n<p>Without embedding common sense into the training of these robots, issues can arise in deployment.<\/p>\n<p>\u201cWithout basic knowledge about the physical world, a robot may fall down or accidentally break something, causing danger to the surrounding people and environment,\u201d said Yin Cui, a Cosmos Reason research scientist at NVIDIA.<\/p>\n<p>Distilling human common sense about the physical world into models is how NVIDIA is bringing about the next generation of AI.<\/p>\n<p>Enter the NVIDIA data factory team: a group of global analysts who come from various backgrounds \u2014 including bioengineering, business and linguistics. 
They\u2019re working to develop, analyze and compile hundreds of thousands of data units that will be used to train <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/generative-ai\/\" rel=\"nofollow noopener\">generative AI<\/a> models on how to reason.<\/p>\n<p>The Data Curation Process<\/p>\n<p>One of the NVIDIA data factory team\u2019s projects focuses on the development of world foundation models for physical AI applications. These world foundation models are deep learning neural networks that simulate virtual environments, offering a safer and more effective way to train reasoning models in simulated domains.<\/p>\n<p>It all starts with an NVIDIA annotation group that creates question-and-answer pairs based on video data. These videos are all from the real world and can include any type of footage, whether depicting chickens walking around in their coop or cars driving on a rural road.<\/p>\n<p>For example, an annotator might ask about the video below: \u201cThe person uses which hand to cut the spaghetti?\u201d<\/p>\n<p style=\"text-align: center;\">Example from Cosmos Reason evaluation dataset<\/p>\n<p>The annotators then come up with four multiple-choice answers, labeled A, B, C and D. The model is fed the data and has to reason and choose the correct answer.<\/p>\n<p>\u201cWe\u2019re basically coming up with a test for the model,\u201d said Cui. \u201cAll of our questions are multiple choice, like what students would see on a school exam.\u201d<\/p>\n<p>These question-and-answer pairs are then quality checked by NVIDIA analysts, such as Michelle Li.<\/p>\n<p>Li has a background in public health and data analytics, which allows her to look at the broader purpose of the data she analyzes.<\/p>\n<p>\u201cFor physical AI, we have a specific goal of wanting to train models on understanding the physical world, which helps me think about the bigger picture when I\u2019m looking at the Q&amp;A pairs and the types of questions that are being presented,\u201d Li said. 
\u201cI ask myself, do the Q&amp;A pairs that I\u2019m looking at align with our objectives for the guidelines that we have for the project?\u201d<\/p>\n<p>After this, the data is reviewed by the project\u2019s data factory leads, who make sure it\u2019s up to quality standards and ready to be sent to the Cosmos Reason research team. The scientists then feed the hundreds of thousands of data units \u2014 in this case, the Q&amp;A pairs \u2014 to the model, training it with reinforcement learning on the bounds and limitations of the physical world.<\/p>\n<p>What Are the Applications of Reasoning AI?\u00a0<\/p>\n<p>Reasoning models are exceptional because they can make sense of their spatial and temporal surroundings as well as predict outcomes. They can analyze a situation, come up with a web of probable outcomes and infer the most likely scenario.<\/p>\n<p>Simply put, reasoning AI demonstrates humanlike thinking. It shows its work, giving the user insight into the logic behind its responses.<\/p>\n<p>Users can ask these models to analyze a video, such as one of two cars driving on a road. 
When asked a question like, \u201cWhat would happen if the cars were driving toward each other in the same lane?\u201d the model can reason and determine the most probable outcome of the proposed scenario \u2014 for example, a car crash.<\/p>\n<p>\u201cWe\u2019re building a pioneering reasoning model focused on physical AI,\u201d said Tsung-Yi Lin, a principal research scientist on the Cosmos Reason team at NVIDIA.<\/p>\n<p>As NVIDIA\u2019s reasoning model innovation continues, the data factory team\u2019s ability to produce high-quality data will be imperative for developing intelligent autonomous agents and physical AI systems that can safely interact with the real world.<\/p>\n<p>Preview <a target=\"_blank\" href=\"https:\/\/build.nvidia.com\/nvidia\/cosmos-reason1-7b\" rel=\"nofollow noopener\">NVIDIA Cosmos-Reason1<\/a> or download the model on <a target=\"_blank\" href=\"https:\/\/huggingface.co\/nvidia\/Cosmos-Reason1-7B\" rel=\"nofollow noopener\">Hugging Face<\/a> and <a target=\"_blank\" href=\"https:\/\/github.com\/nvidia-cosmos\/cosmos-reason1\" rel=\"nofollow noopener\">GitHub<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"AI models are advancing at a rapid rate and scale. 
But what might they lack that (most) humans&hellip;\n","protected":false},"author":2,"featured_media":115642,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45],"tags":[182,181,507,74],"class_list":{"0":"post-115641","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/115641","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=115641"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/115641\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/115642"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=115641"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=115641"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=115641"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}