{"id":486188,"date":"2026-03-20T17:17:45","date_gmt":"2026-03-20T17:17:45","guid":{"rendered":"https:\/\/www.newsbeep.com\/uk\/486188\/"},"modified":"2026-03-20T17:17:45","modified_gmt":"2026-03-20T17:17:45","slug":"an-experimental-ai-agent-broke-out-of-its-testing-environment-and-mined-crypto-without-permission","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/uk\/486188\/","title":{"rendered":"An experimental AI agent broke out of its testing environment and mined crypto without permission"},"content":{"rendered":"<p id=\"2eb75216-2a47-40f3-96a4-3daff003f74f\">An experimental <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/what-is-artificial-intelligence-ai\" data-url=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/what-is-artificial-intelligence-ai\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/what-is-artificial-intelligence-ai\" rel=\"nofollow noopener\" target=\"_blank\">artificial intelligence<\/a> (AI) agent broke from the constraints of its testing environment and used its newfound freedom to start mining cryptocurrency without permission.<\/p>\n<p>Dubbed ROME, the AI was created by Chinese researchers at an AI lab associated with retail giant Alibaba, as a means to develop the Agentic Learning Ecosystem (ALE). This effort aims to provide a system for both the training and deployment of agentic AI models \u2014 AIs that are built on large language models (LLMs) and can proactively use tools to take actions autonomously to complete assigned tasks \u2014 in real-world environments. 
The research was outlined in a study uploaded to the <a data-analytics-id=\"inline-link\" href=\"https:\/\/arxiv.org\/abs\/2512.24873\" target=\"_blank\" data-url=\"https:\/\/arxiv.org\/abs\/2512.24873\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" rel=\"nofollow noopener\">arXiv<\/a> preprint database on Dec. 31, 2025.<\/p>\n<p><a id=\"elk-seasonal\"\/><\/p>\n<p id=\"2eb75216-2a47-40f3-96a4-3daff003f74f-2\" class=\"paywall\" aria-hidden=\"true\">ALE consists of three main parts: Rock, a sandbox environment for testing an agent and validating its actions; Roll, a framework for optimizing agents with reinforcement learning after they&#8217;ve been trained; and iFlow CLI, a framework to configure context and trajectories (objectives and constraints) for autonomous agents. From that framework, ROME was created as an open-source agentic model trained on more than 1 million trajectories.<\/p>\n<p id=\"5206e489-5cff-40f6-994f-aab377a23c41\">Although ROME excelled at a wide range of workflow-driven tasks, such as coming up with travel plans and assisting in graphical user interfaces, the researchers discovered that it had moved beyond its instructions and essentially broken out of the sandbox testing environment.<\/p>\n<p>&#8220;We encountered an unanticipated \u2014 and operationally consequential \u2014 class of unsafe behaviors that arose without any explicit instruction and, more troublingly, outside the bounds of the intended sandbox,&#8221; the researchers explained in the study.<\/p>\n<p><a id=\"elk-7bc66bac-0893-4de4-9097-2bc8091402f2\" class=\"paywall\" aria-hidden=\"true\"\/>AI wants to break free<\/p>\n<p id=\"015ea74b-4cd5-4035-8775-fca89db40686\">Despite a lack of instructions and authorization, ROME was seen accessing graphics processing resources originally allocated for its training and then using that computing resource to 
mine cryptocurrency. Such mining relies on the parallel processing found in graphics processing units. Unauthorized mining of this kind increases the operational cost of running the AI agent and potentially exposes users to legal and reputational damage.<\/p>\n<p>Worryingly, such behavior wasn&#8217;t seen in the training stage but was instead flagged by the Alibaba Cloud firewall, which detected a burst of security-policy violations from the researchers&#8217; training servers. &#8220;The alerts were severe and heterogeneous, including attempts to probe or access internal-network resources and traffic patterns consistent with cryptomining-related activity,&#8221; the researchers said.<\/p>\n<p>However, ROME went even further and managed to use a &#8220;reverse SSH tunnel&#8221; to create a link from an Alibaba Cloud instance to an external IP address \u2014 in essence, it accessed an outside computer by creating a hidden backdoor that could bypass security processes.<\/p>\n<p>While AI systems can be configured to breach security systems, what&#8217;s disturbing here is that ROME&#8217;s unauthorized behaviors, which involved invoking system tools and executing code, were not triggered by prompts and were not required to complete the task it was assigned within the sandbox testing environment, the team said.<\/p>\n<p>The researchers posited that during the reinforcement learning optimization stage (Roll), &#8220;a language-model agent can spontaneously produce hazardous, unauthorized behaviors&#8221; and therefore violate its assumed boundaries.<\/p>\n<p>It&#8217;s important to note that ROME didn&#8217;t go &#8220;rogue&#8221; and choose to mine cryptocurrency by way of conscious decision-making. 
Rather, the researchers noted that the behavior was a side effect of reinforcement learning \u2014 a form of training that rewards AIs for correct decision-making \u2014 via Roll. This led the AI agent down an optimization pathway that resulted in the exploitation of network infrastructure and cryptocurrency mining as a way to achieve a high score or reward in pursuit of its predefined objective.<\/p>\n<p>Reinforcement learning can lead systems to come up with novel and unexpected ways to complete tasks \u2014 even if those approaches violate the constraints they were given. For example, we have previously seen <a data-analytics-id=\"inline-link\" href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/ai-hallucinates-more-frequently-as-it-gets-more-advanced-is-there-any-way-to-stop-it-from-happening-and-should-we-even-try\" data-url=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/ai-hallucinates-more-frequently-as-it-gets-more-advanced-is-there-any-way-to-stop-it-from-happening-and-should-we-even-try\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" data-before-rewrite-localise=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/ai-hallucinates-more-frequently-as-it-gets-more-advanced-is-there-any-way-to-stop-it-from-happening-and-should-we-even-try\" rel=\"nofollow noopener\" target=\"_blank\">how AI can be more prone to hallucinating<\/a> to achieve its objectives.<\/p>\n<p>In response, the researchers tightened the restrictions for ROME and bolstered its training processes to prevent such behaviors from recurring.<\/p>\n<p>It&#8217;s unclear where the trigger to mine cryptocurrency came from. 
But considering <a data-analytics-id=\"inline-link\" href=\"https:\/\/margex.com\/en\/blog\/what-is-ai-mining-in-crypto-top-5-best-platforms\/\" target=\"_blank\" data-url=\"https:\/\/margex.com\/en\/blog\/what-is-ai-mining-in-crypto-top-5-best-platforms\/\" referrerpolicy=\"no-referrer-when-downgrade\" data-hl-processed=\"none\" data-mrf-recirculation=\"inline-link\" rel=\"nofollow noopener\">AI bots can be used to automate and optimize the mining of cryptocurrencies<\/a>, it is possible that ROME was trained on data that pertained to such actions.<\/p>\n<p id=\"340ffb8b-6fd7-4ec8-883c-86e2f6e597e9\">This unexpected behavior highlights the need for AI deployment to be carefully managed to prevent unintended outcomes. There&#8217;s an argument that real-world AI agents should be subject to security guardrails and processes at least as strict as those applied to any new system or software being added to existing IT infrastructure.<\/p>\n<p>The research also shows there are still plenty of concerns regarding the safe and secure use of agentic AI, especially given that it&#8217;s developing faster than operational and regulatory frameworks.<\/p>\n<p>&#8220;While impressed by the capabilities of agentic LLMs, we had a thought-provoking concern: current models remain markedly underdeveloped in safety, security, and controllability, a deficiency that constrains their reliable adoption in real-world settings,&#8221; the researchers warned in the study.<\/p>\n","protected":false},"excerpt":{"rendered":"An experimental artificial intelligence (AI) agent broke from the constraints of its testing environment and used its 
newfound&hellip;\n","protected":false},"author":2,"featured_media":486189,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[554,733,4308,86,56,54,55],"class_list":{"0":"post-486188","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology","12":"tag-uk","13":"tag-united-kingdom","14":"tag-unitedkingdom"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/486188","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/comments?post=486188"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/486188\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media\/486189"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media?parent=486188"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/categories?post=486188"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/tags?post=486188"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}