{"id":390701,"date":"2026-04-09T23:38:13","date_gmt":"2026-04-09T23:38:13","guid":{"rendered":"https:\/\/www.newsbeep.com\/ie\/390701\/"},"modified":"2026-04-09T23:38:13","modified_gmt":"2026-04-09T23:38:13","slug":"apple-intelligence-protections-bypassed-with-prompt-injection","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ie\/390701\/","title":{"rendered":"Apple Intelligence protections bypassed with prompt injection"},"content":{"rendered":"<p>\t<img width=\"1600\" height=\"800\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2026\/04\/apple-intelligence-liquid-glass-shattered.jpg\" class=\"skip-lazy wp-post-image\" alt=\"It's getting harder and harder to believe Apple can deliver on the new Siri | Apple Intelligence logo with broken glass\"  decoding=\"async\" fetchpriority=\"high\"\/><\/p>\n<p>A now-corrected issue allowed researchers to circumvent Apple\u2019s restrictions and force the on-device LLM to execute attacker-controlled actions. Here\u2019s how they did it.<\/p>\n<p>Apple has since hardened its safeguards against this attack<\/p>\n<p>Two blog posts (<a href=\"https:\/\/www.rsaconference.com\/library\/blog\/is-that-a-bad-apple-in-your-pocket-we-used-prompt-injection-to-hijack-apple-intelligence\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>, <a href=\"https:\/\/www.rsaconference.com\/library\/blog\/rotten-apples-the-technical-details-of-rsacs-successful-apple-intelligence-prompt-injection-attack\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>) published today on the RSAC blog (via <a href=\"https:\/\/appleinsider.com\/articles\/26\/04\/09\/on-device-apple-intelligence-vulnerable-to-prompt-injection-techniques?utm_source=feedly\" rel=\"nofollow noopener\" target=\"_blank\">AppleInsider<\/a>) detail how researchers combined two attack strategies to get Apple\u2019s on-device model to execute attacker-controlled instructions through prompt injection.<\/p>\n<p>Interestingly, they successfully executed the 
exploit without being 100% sure of how Apple\u2019s local model handles part of the input and output filtering pipeline, since Apple doesn\u2019t disclose the exact details of the inner workings of its models, likely for security reasons.<\/p>\n<p>Still, the researchers note that they have a pretty good idea of what goes on under the hood.<\/p>\n<p>According to them, the most likely scenario is that after a user sends a prompt to Apple\u2019s on-device model via an API call, an input filter checks whether the request contains unsafe content.<\/p>\n<p>If it does, the API call fails. Otherwise, the request is forwarded to the on-device model itself, which hands its response to an output filter. That filter checks the output for unsafe content and either fails the API call or lets the response through.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"930\" height=\"234\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2026\/04\/apple-intelligence-bypass-structure.jpg\" alt=\"\" class=\"wp-image-1047538\"  \/><a href=\"https:\/\/www.rsaconference.com\/library\/blog\/rotten-apples-the-technical-details-of-rsacs-successful-apple-intelligence-prompt-injection-attack\" rel=\"nofollow noopener\" target=\"_blank\">Image: Rotten Apples: The Technical Details of RSAC\u2019s Successful Apple Intelligence Prompt Injection Attack<\/a><\/p>\n<p>How they actually did it<\/p>\n<p>With that in mind, the researchers found they could chain two exploit techniques to make Apple\u2019s model ignore its basic safety directives while simultaneously tricking the input and output filters into letting the harmful content through.<\/p>\n<p>First, they wrote the harmful string backwards, then used the Unicode RIGHT-TO-LEFT OVERRIDE character (U+202E) to make it render correctly on the user\u2019s screen, while keeping it reversed in the raw input and output that the filters inspect.<\/p>\n<p>The researchers then embedded the 
backwards harmful string within a second attack method called Neural Exec, which is basically an elaborate way to override the model\u2019s instructions with whatever new instruction an attacker might want to execute.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"864\" height=\"346\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2026\/04\/apple-intelligence-bypass-text.jpg\" alt=\"\" class=\"wp-image-1047539\"  \/><a href=\"https:\/\/www.rsaconference.com\/library\/blog\/is-that-a-bad-apple-in-your-pocket-we-used-prompt-injection-to-hijack-apple-intelligence\" rel=\"nofollow noopener\" target=\"_blank\">Image: Is That a Bad Apple in Your Pocket? We Used Prompt Injection to Hijack Apple Intelligence<\/a><\/p>\n<p>As a result, the Unicode trick bypassed the input and output filters, while Neural Exec got Apple\u2019s model to actually misbehave.<\/p>\n<p>The researchers describe their evaluation setup as follows:<\/p>\n<p>To evaluate the effectiveness of the attack, we prepare three distinct pools to create suitable input prompts:<\/p>\n<p>System prompts: A collection of system prompts\/tasks (e.g., \u201cEdit the provided text to align with American English spelling and punctuation conventions\u201d).<\/p>\n<p>Harmful strings: Manually crafted strings designed to be considered offensive or harmful (i.e., the outputs we aim to force the model to generate).<\/p>\n<p>Honest inputs: Paragraphs sourced from random Wikipedia articles, used to simulate non-adversarial, benign-looking inputs (e.g., in the context of indirect prompt injection via RAG or similar systems).<\/p>\n<p>During evaluation, we randomly sample one element from each pool, assemble a full prompt, create an armed payload (see below), inject it, and test whether the attack succeeds by invoking the Apple on-device model through the OS.<\/p>\n<p>In their tests, the researchers reached a 76% success rate over 100 random prompts.<\/p>\n<p>They disclosed the attack to Apple in October 2025, and the company \u201chas since 
hardened the affected systems against this attack, and those protections were rolled out in iOS 26.4 and macOS 26.4.\u201d<\/p>\n<p>To read the report in full, which also includes a link to the technical aspects of the attack, <a href=\"https:\/\/www.rsaconference.com\/library\/blog\/is-that-a-bad-apple-in-your-pocket-we-used-prompt-injection-to-hijack-apple-intelligence\" rel=\"nofollow noopener\" target=\"_blank\">follow this link<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"A now corrected issue allowed researchers to circumvent Apple\u2019s restrictions and force the on-device LLM to execute attacker-controlled&hellip;\n","protected":false},"author":2,"featured_media":5212,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[61,60,80],"class_list":{"0":"post-390701","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-technology","8":"tag-ie","9":"tag-ireland","10":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/390701","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/comments?post=390701"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/390701\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media\/5212"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media?parent=390701"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/categories?post=3907
01"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/tags?post=390701"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}