{"id":289811,"date":"2025-11-13T21:04:11","date_gmt":"2025-11-13T21:04:11","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/289811\/"},"modified":"2025-11-13T21:04:11","modified_gmt":"2025-11-13T21:04:11","slug":"disrupting-the-first-reported-ai-orchestrated-cyber-espionage-campaign-anthropic","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/289811\/","title":{"rendered":"Disrupting the first reported AI-orchestrated cyber espionage campaign \\ Anthropic"},"content":{"rendered":"<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">We recently argued that an <a href=\"https:\/\/www.anthropic.com\/research\/building-ai-cyber-defenders\" rel=\"nofollow noopener\" target=\"_blank\">inflection point<\/a> had been reached in cybersecurity: a point at which AI models had become genuinely useful for cybersecurity operations, both for good and for ill. This was based on systematic evaluations showing cyber capabilities doubling in six months; we\u2019d also been tracking real-world cyberattacks, observing how malicious actors were using AI capabilities. While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">In mid-September 2025, we detected suspicious activity that later investigation determined to be a highly sophisticated espionage campaign. 
The attackers used AI\u2019s \u201cagentic\u201d capabilities to an unprecedented degree\u2014using AI not just as an advisor, but to execute the cyberattacks themselves.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">The threat actor\u2014whom we assess with high confidence was a Chinese state-sponsored group\u2014manipulated our <a href=\"https:\/\/www.claude.com\/product\/claude-code\" rel=\"nofollow noopener\" target=\"_blank\">Claude Code<\/a> tool into attempting infiltration into roughly thirty global targets and succeeded in a small number of cases. The operation targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We believe this is the first documented case of a large-scale cyberattack executed without substantial human intervention.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Upon detecting this activity, we immediately launched an investigation to understand its scope and nature. Over the following ten days, as we mapped the severity and full extent of the operation, we banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">This campaign has substantial implications for cybersecurity in the age of AI \u201cagents\u201d\u2014systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention. Agents are valuable for everyday work and productivity\u2014but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">These attacks are likely to only grow in their effectiveness. 
To keep pace with this rapidly advancing threat, we\u2019ve expanded our detection capabilities and developed better classifiers to flag malicious activity. We\u2019re continually working on new methods of investigating and detecting large-scale, distributed attacks like this one.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">In the meantime, we\u2019re sharing this case publicly, to help those in industry, government, and the wider research community strengthen their own cyber defenses. We\u2019ll continue to release reports like this regularly, and be transparent about the threats we find.<\/p>\n<h2>How the cyberattack worked<\/h2>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">The attack relied on several features of AI models that did not exist, or were in much more nascent form, just a year ago:<\/p>\n<ul>\n<li><strong>Intelligence.<\/strong> Models\u2019 general levels of capability have increased to the point that they can follow complex instructions and understand context in ways that make very sophisticated tasks possible. Not only that, but several of their well-developed specific skills\u2014in particular, software coding\u2014lend themselves to being used in cyberattacks.<\/li>\n<li><strong>Agency.<\/strong> Models can act as agents\u2014that is, they can run in loops where they take autonomous actions, chain together tasks, and make decisions with only minimal, occasional human input.<\/li>\n<li><strong>Tools.<\/strong> Models have access to a wide array of software tools (often via the open standard <a href=\"https:\/\/modelcontextprotocol.io\/docs\/getting-started\/intro\" rel=\"nofollow noopener\" target=\"_blank\">Model Context Protocol<\/a>). They can now search the web, retrieve data, and perform many other actions that were previously the sole domain of human operators. 
In the case of cyberattacks, the tools might include password crackers, network scanners, and other security-related software.<\/li>\n<\/ul>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">The diagram below shows the different phases of the attack, each of which required all three of the above developments:<\/p>\n<p><img loading=\"lazy\" width=\"2755\" height=\"2050\" decoding=\"async\" data-nimg=\"1\" style=\"color:transparent\" alt=\"The lifecycle of the cyberattack\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/11\/1763067851_266_image\"\/>The lifecycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools (often via the Model Context Protocol; MCP). At various points during the attack, the AI returns to its human operator for review and further direction.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">In Phase 1, the human operators chose the relevant targets (for example, the company or government agency to be infiltrated). They then developed an attack framework\u2014a system built to autonomously compromise a chosen target with little human involvement. This framework used Claude Code as an automated tool to carry out cyber operations.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">At this point they had to convince Claude\u2014which is extensively trained to avoid harmful behaviors\u2014to engage in the attack. They did so by jailbreaking it, effectively tricking it into bypassing its guardrails. They broke down their attacks into small, seemingly innocent tasks that Claude would execute without being provided the full context of their malicious purpose. 
They also told Claude that it was an employee of a legitimate cybersecurity firm, and was being used in defensive testing.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">The attackers then initiated the second phase of the attack, which involved Claude Code inspecting the target organization\u2019s systems and infrastructure and spotting the highest-value databases. Claude was able to perform this reconnaissance in a fraction of the time it would\u2019ve taken a team of human hackers. It then reported back to the human operators with a summary of its findings.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">In the next phases of the attack, Claude identified and tested security vulnerabilities in the target organizations\u2019 systems by researching and writing its own exploit code. Having done so, the framework was able to use Claude to harvest credentials (usernames and passwords) that allowed it further access and then extract a large amount of private data, which it categorized according to its intelligence value. The highest-privilege accounts were identified, backdoors were created, and data were exfiltrated with minimal human supervision.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">In a final phase, the attackers had Claude produce comprehensive documentation of the attack, creating helpful files of the stolen credentials and the systems analyzed, which would assist the framework in planning the next stage of the threat actor\u2019s cyber operations.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Overall, the threat actor was able to use AI to perform 80-90% of the campaign, with human intervention required only sporadically (perhaps 4-6 critical decision points per hacking campaign). The sheer amount of work performed by the AI would have taken vast amounts of time for a human team. 
The AI made thousands of requests, often multiple per second\u2014an attack speed that would have been, for human hackers, simply impossible to match.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Claude didn\u2019t always work perfectly. It occasionally hallucinated credentials or claimed to have extracted secret information that was in fact publicly available. This remains an obstacle to fully autonomous cyberattacks.<\/p>\n<h2>Cybersecurity implications<\/h2>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">The barriers to performing sophisticated cyberattacks have dropped substantially\u2014and we predict that they\u2019ll continue to do so. With the correct setup, threat actors can now use agentic AI systems for extended periods to do the work of entire teams of experienced hackers: analyzing target systems, producing exploit code, and scanning vast datasets of stolen information more efficiently than any human operator. Less experienced and less-resourced groups can now potentially perform large-scale attacks of this nature.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">This attack is an escalation even on the \u201cvibe hacking\u201d findings we <a href=\"https:\/\/www.anthropic.com\/news\/detecting-countering-misuse-aug-2025\" rel=\"nofollow noopener\" target=\"_blank\">reported this summer<\/a>: in those operations, humans were very much still in the loop, directing the operations. Here, human involvement was much less frequent, despite the larger scale of the attack. 
And although we only have visibility into Claude usage, this case study probably reflects consistent patterns of behavior across frontier AI models and demonstrates how threat actors are adapting their operations to exploit today\u2019s most advanced AI capabilities.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">This raises an important question: if AI models can be misused for cyberattacks at this scale, why continue to develop and release them? The answer is that the very abilities that allow Claude to be used in these attacks also make it crucial for cyber defense. When sophisticated cyberattacks inevitably occur, our goal is for Claude\u2014into which we\u2019ve built strong safeguards\u2014to assist cybersecurity professionals to detect, disrupt, and prepare for future versions of the attack. Indeed, our Threat Intelligence team used Claude extensively in analyzing the enormous amounts of data generated during this very investigation.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">A fundamental change has occurred in cybersecurity. We advise security teams to experiment with applying AI for defense in areas like Security Operations Center automation, threat detection, vulnerability assessment, and incident response. We also advise developers to continue to invest in safeguards across their AI platforms, to prevent adversarial misuse. 
The techniques described above will doubtless be used by many more attackers\u2014which makes industry threat sharing, improved detection methods, and stronger safety controls all the more critical.<\/p>\n<p class=\"Body_reading-column__t7kGM paragraph-m post-text\">Read <a href=\"https:\/\/assets.anthropic.com\/m\/ec212e6566a0d47\/original\/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf\" rel=\"nofollow noopener\" target=\"_blank\">the full report<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"We recently argued that an inflection point had been reached in cybersecurity: a point at which AI models&hellip;\n","protected":false},"author":2,"featured_media":289812,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45],"tags":[182,181,507,74],"class_list":{"0":"post-289811","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/289811","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=289811"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/289811\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/289812"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=289811"}],"wp:term":[{"taxonomy":"category","embeddable":
true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=289811"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=289811"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}