{"id":394749,"date":"2026-01-08T07:54:10","date_gmt":"2026-01-08T07:54:10","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/394749\/"},"modified":"2026-01-08T07:54:10","modified_gmt":"2026-01-08T07:54:10","slug":"ai-models-are-starting-to-learn-by-asking-themselves-questions","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/394749\/","title":{"rendered":"AI Models Are Starting to Learn by Asking Themselves Questions"},"content":{"rendered":"<p>Even the smartest <a href=\"https:\/\/www.wired.com\/tag\/artificial-intelligence\/\" rel=\"nofollow noopener\" target=\"_blank\">artificial intelligence<\/a> models are essentially copycats. They learn either by consuming examples of human work or by trying to solve problems that have been set for them by human instructors.<\/p>\n<p class=\"paywall\">But perhaps AI can, in fact, learn in a more human way\u2014by figuring out interesting questions to ask itself and attempting to find the right answer. A project from <a data-offer-url=\"https:\/\/www.tsinghua.edu.cn\/en\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/www.tsinghua.edu.cn\/en\/&quot;}\" href=\"https:\/\/www.tsinghua.edu.cn\/en\/\" rel=\"nofollow noopener\" target=\"_blank\">Tsinghua University<\/a>, the <a data-offer-url=\"https:\/\/eng.bigai.ai\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/eng.bigai.ai\/&quot;}\" href=\"https:\/\/eng.bigai.ai\/\" rel=\"nofollow noopener\" target=\"_blank\">Beijing Institute for General Artificial Intelligence<\/a> (BIGAI), and Pennsylvania State University shows that AI can learn to reason in this way by playing with computer code.<\/p>\n<p class=\"paywall\">The researchers devised a system called <a data-offer-url=\"https:\/\/andrewzh112.github.io\/absolute-zero-reasoner\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/andrewzh112.github.io\/absolute-zero-reasoner\/&quot;}\" href=\"https:\/\/andrewzh112.github.io\/absolute-zero-reasoner\/\" rel=\"nofollow noopener\" target=\"_blank\">Absolute Zero Reasoner<\/a> (AZR) that first uses a large language model to generate challenging but solvable Python coding problems. It then uses the same model to solve those problems before checking its work by trying to run the code. And finally, the AZR system uses successes and failures as a signal to refine the original model, augmenting its ability to both pose better problems and solve them.<\/p>\n<p class=\"paywall\">The team found that their approach significantly improved the coding and reasoning skills of both 7 billion and 14 billion parameter versions of the <a href=\"https:\/\/www.wired.com\/story\/expired-tired-wired-gpt-5\/\" rel=\"nofollow noopener\" target=\"_blank\">open source language model Qwen<\/a>. Impressively, the model even outperformed some models that had received human-curated data.<\/p>\n<p class=\"paywall\">I spoke to <a data-offer-url=\"https:\/\/andrewzh112.github.io\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/andrewzh112.github.io\/&quot;}\" href=\"https:\/\/andrewzh112.github.io\/\" rel=\"nofollow noopener\" target=\"_blank\">Andrew Zhao<\/a>, a PhD student at Tsinghua University who came up with the original idea for Absolute Zero, as well as <a data-offer-url=\"https:\/\/zilongzheng.github.io\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/zilongzheng.github.io\/&quot;}\" href=\"https:\/\/zilongzheng.github.io\/\" rel=\"nofollow noopener\" target=\"_blank\">Zilong Zheng<\/a>, a researcher at BIGAI who worked on the project with him, over Zoom.<\/p>\n<p class=\"paywall\">Zhao told me that the approach resembles the way human learning goes beyond rote memorization or imitation. \u201cIn the beginning you imitate your parents and do like your teachers, but then you basically have to ask your own questions,\u201d he said. \u201cAnd eventually you can surpass those who taught you back in school.\u201d<\/p>\n<p class=\"paywall\">Zhao and Zheng noted that the idea of AI learning in this way, sometimes dubbed \u201cself-play,\u201d dates back years and was previously explored by the likes of <a data-offer-url=\"https:\/\/en.wikipedia.org\/wiki\/J%C3%BCrgen_Schmidhuber\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/en.wikipedia.org\/wiki\/J%C3%BCrgen_Schmidhuber&quot;}\" href=\"https:\/\/en.wikipedia.org\/wiki\/J%C3%BCrgen_Schmidhuber\" rel=\"nofollow noopener\" target=\"_blank\">J\u00fcrgen Schmidhuber<\/a>, a well-known AI pioneer, and <a data-offer-url=\"http:\/\/www.pyoudeyer.com\/\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;http:\/\/www.pyoudeyer.com\/&quot;}\" href=\"http:\/\/www.pyoudeyer.com\/\" rel=\"nofollow noopener\" target=\"_blank\">Pierre-Yves Oudeyer<\/a>, a computer scientist at Inria in France.<\/p>\n<p class=\"paywall\">One of the most exciting elements of the project, according to Zheng, is the way that the model\u2019s problem-posing and problem-solving skills scale. \u201cThe difficulty level grows as the model becomes more powerful,\u201d he says.<\/p>\n<p class=\"paywall\">A key challenge is that for now the system only works on problems that can easily be checked, like those that involve math or coding. As the project progresses, it might be possible to use it on agentic AI tasks like browsing the web or doing office chores. This might involve having the AI model try to judge whether an agent\u2019s actions are correct.<\/p>\n<p class=\"paywall\">One fascinating possibility of an approach like Absolute Zero is that it could, in theory, allow models to go beyond human teaching. \u201cOnce we have that it\u2019s kind of a way to reach superintelligence,\u201d Zheng told me.<\/p>\n<p class=\"paywall\">There are early signs that the Absolute Zero approach is catching on at some big AI labs.<\/p>\n<p class=\"paywall\">A project called <a data-offer-url=\"https:\/\/github.com\/aiming-lab\/Agent0\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/github.com\/aiming-lab\/Agent0&quot;}\" href=\"https:\/\/github.com\/aiming-lab\/Agent0\" rel=\"nofollow noopener\" target=\"_blank\">Agent0<\/a>, from Salesforce, Stanford, and the University of North Carolina at Chapel Hill, involves a software-tool-using agent that improves itself through self-play. As with Absolute Zero, the model gets better at general reasoning through experimental problem-solving. A <a data-offer-url=\"https:\/\/arxiv.org\/pdf\/2512.18552\" class=\"external-link\" data-event-click=\"{&quot;element&quot;:&quot;ExternalLink&quot;,&quot;outgoingURL&quot;:&quot;https:\/\/arxiv.org\/pdf\/2512.18552&quot;}\" href=\"https:\/\/arxiv.org\/pdf\/2512.18552\" rel=\"nofollow noopener\" target=\"_blank\">recent paper<\/a> written by researchers from Meta, the University of Illinois, and Carnegie Mellon University presents a system that uses a similar kind of self-play for software engineering. The authors of this work suggest that it represents \u201ca first step toward training paradigms for superintelligent software agents.\u201d<\/p>\n<p class=\"paywall\">Finding new ways for AI to learn will likely be a big theme in the tech industry this year. With conventional sources of data becoming scarcer and more expensive, and as labs look for new ways to make models more capable, a project like Absolute Zero might lead to AI systems that are less like copycats and more like humans.<\/p>\n","protected":false},"excerpt":{"rendered":"Even the smartest artificial intelligence models are essentially copycats. They learn either by consuming examples of human work&hellip;\n","protected":false},"author":2,"featured_media":394750,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31],"tags":[72616,181,144,12347,1728,74],"class_list":{"0":"post-394749","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-technology","8":"tag-ai-lab","9":"tag-artificial-intelligence","10":"tag-china","11":"tag-models","12":"tag-research","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/394749","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=394749"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/394749\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/394750"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=394749"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=394749"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=394749"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}