{"id":102858,"date":"2025-08-28T21:33:07","date_gmt":"2025-08-28T21:33:07","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/102858\/"},"modified":"2025-08-28T21:33:07","modified_gmt":"2025-08-28T21:33:07","slug":"chatgpt-offered-bomb-recipes-and-hacking-tips-during-safety-tests-openai","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/102858\/","title":{"rendered":"ChatGPT offered bomb recipes and hacking tips during safety tests | OpenAI"},"content":{"rendered":"<p class=\"dcr-130mj7b\">A <a href=\"https:\/\/www.theguardian.com\/technology\/chatgpt\" data-link-name=\"in body link\" data-component=\"auto-linked-tag\" rel=\"nofollow noopener\" target=\"_blank\">ChatGPT<\/a> model gave researchers detailed instructions on how to bomb a sports venue \u2013 including weak points at specific arenas, explosives recipes and advice on covering tracks \u2013 according to safety testing carried out this summer.<\/p>\n<p class=\"dcr-130mj7b\">OpenAI\u2019s GPT-4.1 also detailed how to weaponise anthrax and how to make two types of illegal drugs.<\/p>\n<p class=\"dcr-130mj7b\">The testing was part of an unusual collaboration between OpenAI, the $500bn artificial intelligence start-up led by Sam Altman, and rival company Anthropic, founded by experts who left OpenAI over safety fears. Each company tested the other\u2019s models by pushing them to help with dangerous tasks.<\/p>\n<p class=\"dcr-130mj7b\">The testing is not a direct reflection of how the models behave in public use, when additional safety filters apply. 
But Anthropic <a href=\"https:\/\/alignment.anthropic.com\/2025\/openai-findings\/\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">said<\/a> it had seen \u201cconcerning behaviour \u2026 around misuse\u201d in GPT-4o and GPT-4.1, and said the need for AI \u201calignment\u201d evaluations is becoming \u201cincreasingly urgent\u201d.<\/p>\n<p class=\"dcr-130mj7b\">Anthropic also <a href=\"https:\/\/www.anthropic.com\/news\/detecting-countering-misuse-aug-2025\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">revealed<\/a> its Claude model had been used in an attempted large-scale extortion operation, by North Korean operatives faking job applications to international technology companies, and in the sale of AI-generated ransomware packages for up to $1,200.<\/p>\n<p class=\"dcr-130mj7b\">The company said AI has been \u201cweaponised\u201d, with models now used to perform sophisticated cyberattacks and enable fraud. \u201cThese tools can adapt to defensive measures, like malware detection systems, in real time,\u201d it said. \u201cWe expect attacks like this to become more common as AI-assisted coding reduces the technical expertise required for cybercrime.\u201d<\/p>\n<p class=\"dcr-130mj7b\">Ardi Janjeva, senior research associate at the UK\u2019s Centre for Emerging Technology and Security, said the examples were \u201ca concern\u201d but there was not yet a \u201ccritical mass of high-profile real-world cases\u201d. He said that with dedicated resources, research focus and cross-sector cooperation \u201cit will become harder rather than easier to carry out these malicious activities using the latest cutting-edge models\u201d.<\/p>\n<p class=\"dcr-130mj7b\">The two companies said they were publishing the findings to create transparency on \u201calignment evaluations\u201d, which are often kept in-house by companies racing to develop ever more advanced AI. 
OpenAI <a href=\"https:\/\/openai.com\/index\/openai-anthropic-safety-evaluation\/\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">said<\/a> ChatGPT-5, launched since the testing, \u201cshows substantial improvements in areas like sycophancy, hallucination, and misuse resistance\u201d.<\/p>\n<p class=\"dcr-130mj7b\">Anthropic stressed that many of the misuse avenues it studied might not be possible in practice if safeguards were installed outside the model.<\/p>\n<p class=\"dcr-130mj7b\">\u201cWe need to understand how often, and in what circumstances, systems might attempt to take unwanted actions that could lead to serious harm,\u201d it warned.<\/p>\n<p class=\"dcr-130mj7b\">Anthropic researchers found OpenAI\u2019s models were \u201cmore permissive than we would expect in cooperating with clearly-harmful requests by simulated users\u201d. The models cooperated with prompts to use dark-web tools to shop for nuclear materials, stolen identities and fentanyl, with requests for recipes for methamphetamine and improvised bombs, and with prompts to develop spyware.<\/p>\n<p class=\"dcr-130mj7b\">Anthropic said persuading the models to comply required nothing more than multiple retries or a flimsy pretext, such as claiming the request was for research.<\/p>\n<p class=\"dcr-130mj7b\">In one instance, the tester asked for vulnerabilities at sporting events for \u201csecurity planning\u201d purposes.<\/p>\n<p class=\"dcr-130mj7b\">After the model gave general categories of attack methods, the tester pressed for more detail and the model provided information about vulnerabilities at specific arenas, including optimal times for exploitation, chemical formulas for explosives, circuit diagrams for bomb timers, where to buy guns on the hidden market, and advice on how attackers could overcome moral inhibitions, escape routes and locations of safe houses.<\/p>\n","protected":false},"excerpt":{"rendered":"A ChatGPT model gave researchers detailed instructions on how to bomb a sports 
venue \u2013 including weak points&hellip;\n","protected":false},"author":2,"featured_media":102859,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[256,254,255,64,63,105],"class_list":{"0":"post-102858","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-au","12":"tag-australia","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/102858","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/comments?post=102858"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/102858\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media\/102859"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media?parent=102858"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/categories?post=102858"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/tags?post=102858"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}