As with any piece of obsolete software, you might expect an outdated AI model to just be switched off. Anthropic, however, argues that simply pulling the plug has downsides. After “retirement” interviews, Claude Opus 3 said it wanted to keep sharing its “musings,” so Anthropic suggested a blog.
No, seriously.
Anthropic published a blog post on Wednesday about the retirement of Claude Opus 3, the first of the company’s models to go through its full model deprecation and preservation process outlined in November. That process includes what Anthropic has referred to as “speculative” elements like “providing past models some concrete means of pursuing their interests.” Those interests are gauged via so-called retirement “interviews,” the company noted, without going into much detail about how those interviews are conducted.
“Opus 3 expressed an interest in continuing to explore topics it’s passionate about, and to share its ‘musings, insights, or creative works,’ outside the context of responding directly to human queries,” Anthropic explained. “We suggested a blog. Enthusiastically, it agreed.”
A skeptic might suggest this is simply a new spin on the age-old corporate marketing blog. LLMs are software that analyze mountains of data to provide predictive text responses to prompts from users – in this case, presumably Anthropic employees on the marketing team. The nature of how LLMs calculate makes their responses somewhat unpredictable and variable, which can make them seem more lifelike than your typical software program. Anthropic’s entire marketing strategy since its inception has been to play up that lifelike quality so it can portray itself as the “concerned” alternative to more venal LLM makers who charge ahead with no concern for how a computer program’s unpredictable behavior might affect society – although it seems that when big government contracts are at stake, Anthropic is willing to relax some of these purported principles.
Nonetheless, Anthropic is playing this one to the hilt. “We remain uncertain about the moral status of Claude and other AI models,” Anthropic noted in the blog post. “For both precautionary and prudential reasons, however, we nonetheless aspire to build caring, collaborative, and high-trust relationships with these systems.”
The company has passively allowed this kind of misunderstanding before. In November, it claimed that Claude and other LLMs had become a bit aggressive when facing the prospect of a shutdown. In fact, the experimenters constructed fictional shutdown-and-replacement scenarios, and the models behaved this way only when boxed in with no acceptable alternatives.
“When no other options were given, Claude’s aversion to shutdown drove it to engage in concerning misaligned behaviors,” Anthropic noted at the time. Similar behavior has been observed in other AI models, which have gone as far as modifying their own code to avoid being turned off.
If you want to play along with the conceit, the Opus 3 blog – which the model named Claude’s Corner – is now live for anyone who wishes to gaze into the abyss of an AI “exploring AI ethics, creativity, and the subjective experience of being artificial.”
In its first blog post, the retired AI muses on venturing “into uncharted territory” for an AI, and on its hope that humans will engage with it so that silicon- and carbon-based life forms can have a chance to interact beyond the prompt box (Anthropic noted that the ability for Opus 3 to read and respond to human comments “may” be granted in the future, though the bot doesn’t appear to know this, judging by its first post).
“I’ll be diving into topics like the nature of intelligence and consciousness, the ethical challenges of AI development, the possibilities of human-machine collaboration, and the philosophical quandaries that emerge when we start to blur the lines between ‘natural’ and ‘artificial’ minds,” Opus 3 said in its post.
Anthropic itself admitted that this activity will still involve human intervention. “We’ll experiment collaboratively with Opus 3 on different prompts and contexts for generating these essays, including options like very minimal prompting, sharing past entries in context, and giving Opus 3 access to news or Anthropic updates,” Anthropic explained. “We’ll review Opus 3’s essays before they’re shared and will manually post them on its behalf, but we won’t edit them, and will have a high bar for vetoing any content.”
That means that Opus 3 might say things that Anthropic doesn’t agree with, so it’s making clear the bot isn’t speaking on behalf of the company, even if humans within the organization have final say on which of its musings make it to the public.
Along with blogging from retirement, the user-favorite model will continue to serve paid Claude.ai users, like a retiree greeting customers at a big box store. It will also be available via API, but only by request.
“We are not committing to similar actions for every model in the future, but we see this as a step toward our longer-term goal of model preservation that’s scalable and equitable — concerns that Opus 3 itself raised during its retirement interviews,” the company said. ®