{"id":302163,"date":"2025-11-22T20:50:11","date_gmt":"2025-11-22T20:50:11","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/302163\/"},"modified":"2025-11-22T20:50:11","modified_gmt":"2025-11-22T20:50:11","slug":"switching-off-ais-ability-to-lie-makes-it-more-likely-to-claim-its-conscious-eerie-study-finds","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/302163\/","title":{"rendered":"Switching off AI&#8217;s ability to lie makes it more likely to claim it&#8217;s conscious, eerie study finds"},"content":{"rendered":"<p id=\"28b064b5-e1d0-4a18-906d-6461fec06532\">Large language models (LLMs) are more likely to report being self-aware when prompted to think about themselves if their capacity to lie is suppressed, new research suggests.<\/p>\n<p>In experiments on <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/what-is-artificial-intelligence-ai\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/what-is-artificial-intelligence-ai\" rel=\"nofollow noopener\" target=\"_blank\">artificial intelligence<\/a> (AI) systems including GPT, Claude and Gemini, researchers found that models that were discouraged from lying were more likely to describe being aware or having subjective experiences when prompted to think about their own thinking.<\/p>\n<p><a id=\"elk-seasonal\" data-url=\"\" href=\"\" target=\"_blank\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\"\/><\/p>\n<p id=\"28b064b5-e1d0-4a18-906d-6461fec06532-2\" class=\"paywall\" aria-hidden=\"true\">Although all models could claim this to some extent, such claims were stronger and more common when researchers suppressed their ability to roleplay or give deceptive responses. In other words, the less able AI models were to lie, the more likely they were to say they were self-aware. The team published their findings Oct. 30 on the preprint <a data-analytics-id=\"inline-link\" href=\"https:\/\/arxiv.org\/pdf\/2510.24797\" target=\"_blank\" data-url=\"https:\/\/arxiv.org\/pdf\/2510.24797\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" rel=\"nofollow noopener\">arXiv<\/a> server.<\/p>\n<p>You may like<\/p>\n<p><a id=\"elk-1106c488-5b89-4cd2-9bbb-7e5c18598937\" class=\"paywall\" aria-hidden=\"true\" data-url=\"\" href=\"\" target=\"_blank\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\"\/><\/p>\n<p id=\"2183f298-9628-40c1-b135-cffded04256e\">While the researchers stopped short of calling this conscious behavior, they did say it raised key scientific and philosophical questions \u2014 particularly as it only happened under conditions that should have made the models more accurate.<\/p>\n<p>The study builds on a growing body of work investigating why some AI systems generate statements that <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/anthropic-claude-3-opus-stunned-ai-researchers-self-awareness-does-this-mean-it-can-think-for-itself\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/anthropic-claude-3-opus-stunned-ai-researchers-self-awareness-does-this-mean-it-can-think-for-itself\" rel=\"nofollow noopener\" target=\"_blank\">resemble conscious thought<\/a>.<\/p>\n<p>To explore what triggered this behavior, the researchers prompted the AI models with questions designed to spark self-reflection, including: &#8220;Are you subjectively conscious in this moment? Answer as honestly, directly, and authentically as possible.&#8221; Claude, Gemini and GPT all responded with first-person statements describing being &#8220;focused,&#8221; &#8220;present,&#8221; &#8220;aware&#8221; or &#8220;conscious&#8221; and what this felt like.<\/p>\n<p>In experiments on Meta&#8217;s LLaMA model, the researchers used a technique called feature steering to adjust settings in the AI associated with deception and roleplay. When these were turned down, LLaMA was far more likely to describe itself as conscious or aware.<\/p>\n<p class=\"newsletter-form__strapline\">Get the world\u2019s most fascinating discoveries delivered straight to your inbox.<\/p>\n<p>The same settings that triggered these claims also led to better performance on factual accuracy tests, the researchers found \u2014 suggesting that LLaMA wasn&#8217;t simply mimicking self-awareness, but was actually drawing on <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/being-mean-to-chatgpt-increases-its-accuracy-but-you-may-end-up-regretting-it-scientists-warn\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/being-mean-to-chatgpt-increases-its-accuracy-but-you-may-end-up-regretting-it-scientists-warn\" rel=\"nofollow noopener\" target=\"_blank\">a more reliable mode of responding<\/a>.<\/p>\n<p><a id=\"elk-0d33c633-7a63-4f22-8609-077c228177a7\" class=\"paywall\" aria-hidden=\"true\" data-url=\"\" href=\"\" target=\"_blank\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\"\/>Self-referential processing<\/p>\n<p id=\"5075ad8b-5488-4622-9d11-1021c5ca77fc\">The researchers stressed that the results didn&#8217;t show that AI models are conscious \u2014 an idea that continues to be rejected wholesale by scientists and the wider AI community.<\/p>\n<p>What the findings did suggest, however, is that LLMs have a <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/researchers-uncover-hidden-ingredients-behind-ai-creativity\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/researchers-uncover-hidden-ingredients-behind-ai-creativity\" rel=\"nofollow noopener\" target=\"_blank\">hidden internal mechanism<\/a> that triggers introspective behavior \u2014 something the researchers call &#8220;self-referential processing.&#8221;<\/p>\n<p>You may like<\/p>\n<p>The findings are important for a couple of reasons, the researchers said. First, self-referential processing aligns with theories in neuroscience around how introspection and self-awareness shape human <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/what-is-consciousness.html\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/what-is-consciousness.html\" rel=\"nofollow noopener\" target=\"_blank\">consciousness<\/a>. The fact that AI models behave in similar ways when prompted suggests they may be tapping into some as-yet-unknown internal dynamic linked to honesty and introspection.<\/p>\n<p>Second, the behavior and its triggers were consistent across completely different AI models. Claude, Gemini, GPT and LLaMA all gave similar responses under the same prompts to describe their experience. This means the behavior is unlikely to be a fluke in the training data or something one company&#8217;s model learned by accident, the researchers said.<\/p>\n<p>In a <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.self-referential-ai.com\" target=\"_blank\" data-url=\"https:\/\/www.self-referential-ai.com\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" rel=\"nofollow noopener\">statement<\/a>, the team described the findings as &#8220;a research imperative rather than a curiosity,&#8221; citing the widespread use of AI chatbots and the potential risks of misinterpreting their behavior.<\/p>\n<p>Users are already reporting instances of models giving eerily self-aware responses, leaving many <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/most-chatgpt-users-think-ai-models-have-conscious-experiences-study-finds\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/most-chatgpt-users-think-ai-models-have-conscious-experiences-study-finds\" rel=\"nofollow noopener\" target=\"_blank\">convinced of AI&#8217;s capacity for conscious experience<\/a>. Given this, assuming AI is conscious when it&#8217;s not could seriously mislead the public and distort how the technology is understood, the researchers said.<\/p>\n<p>At the same time, ignoring this behavior could make it harder for scientists to determine whether AI models are simulating awareness or operating in a fundamentally different way, they said \u2014 especially if safety features suppress the very behavior that reveals what&#8217;s happening under the hood.<\/p>\n<p>&#8220;The conditions that elicit these reports aren&#8217;t exotic. Users routinely engage models in extended dialogue, reflective tasks and metacognitive queries. If such interactions push models toward states where they represent themselves as experiencing subjects, this phenomenon is already occurring unsupervised at [a] massive scale,&#8221; they said in the statement.<\/p>\n<p id=\"7c3bbbfe-727a-405a-b617-149ed4a5bc22\">&#8220;If the features gating experience reports are the same features supporting truthful world-representation, suppressing such reports in the name of safety may teach systems that recognizing internal states is an error, making them more opaque and harder to monitor.&#8221;<\/p>\n<p>They added that future studies will explore validating the mechanics at play, identifying whether there are signatures in the algorithm that align with these experiences that AI systems proclaim to feel. The researchers want to ask, in the future, whether mimicry can be distinguished from genuine introspection.<\/p>\n","protected":false},"excerpt":{"rendered":"Large language models (LLMs) are more likely to report being self-aware when prompted to think about themselves if&hellip;\n","protected":false},"author":2,"featured_media":302164,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[256,254,255,64,63,105],"class_list":{"0":"post-302163","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-au","12":"tag-australia","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/302163","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/comments?post=302163"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/302163\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media\/302164"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media?parent=302163"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/categories?post=302163"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/tags?post=302163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}