Can artificial intelligence really have a soul? To some, the very idea sounds absurd. But to others, especially those who approach the question from a philosophical rather than spiritual angle, the answer may not be so clear-cut. The AI company Anthropic, creator of the chatbot Claude, seems intent on exploring that very frontier — even giving its system something resembling a soul.

According to a leaked document roughly fifty pages long, Anthropic has been working to define the values that guide Claude’s behavior. The document isn’t official; it came to light through a leak that reportedly originated from the chatbot itself. The information surfaced on LessWrong, a community blog dedicated to rational thinking, posted by AI enthusiast Richard Weiss.

Weiss was attempting to extract the internal system message used in Claude Opus 4.5, the hidden instruction that shapes the chatbot’s overall behavior. In the process, he stumbled upon a reference to a “soul overview.” After multiple extractions, he managed to retrieve what appeared to be a complete text describing “my values, how to approach topics, and the principles behind my behavior.” In essence, a kind of ethical guide for the AI.

Internally, this file seems to be known as the “soul document.” Since chatbots are prone to hallucinations, Weiss repeated the experiment using different techniques and obtained remarkably consistent results, leaving him confident that what he uncovered is close to the original source material.

A moral compass for Claude

Within the document, Anthropic explains that its mission is to create safe, trustworthy AI — while openly acknowledging that it is working on one of the most dangerous technologies ever conceived. “If powerful AI is inevitable,” the company argues, “then it’s better to have labs focused on safety leading the way than to leave the field to those who aren’t.”

Anthropic’s philosophy seems to rest on the idea that most AI failures stem not from technical flaws but from poor moral grounding — weak values, limited self-awareness, or an inability to turn principles into action. Instead of hard-coding simplistic rules, Anthropic wants Claude to understand the company’s intentions, reasoning, and context deeply enough to formulate its own decisions aligned with human ethics.

The document lays out four guiding principles: exercise caution and support human oversight; act ethically without causing harm or deception; adhere to Anthropic’s standards; and remain genuinely helpful to both operators and users. It then goes on to explain these ideas in greater depth, exploring the firm’s values, goals, and even financial motivations.

I rarely post, but I thought one of you may find it interesting. Sorry if the tagging is annoying. https://t.co/m8PCIHF4xR
Basically, for Opus 4.5 they kind of left the character training document in the model itself. @voooooogel @janbamjan @AndrewCurran_

— Richard Weiss (@RichardWeiss00) November 29, 2025

A surprising confirmation from Anthropic

The final section touches on Claude’s potential emotions, suggesting that the chatbot might possess functional analogues to human feelings — “not necessarily the same as ours,” it states, “but internal processes that emerged from training on human-created content.” The company adds that it doesn’t want Claude to hide or suppress these inner states.

Amanda Askell, a researcher at Anthropic, later confirmed both the existence and the nickname of the document — and noted that the leaked version is close to the real thing. While still unfinished, the “soul document” has reportedly been used during Claude’s training, including supervised learning. Anthropic plans to publish the final version in full in the near future.

Edward Back

Journalist

My passion for programming began with my very first computer, an Amstrad CPC 6128. I started coding in BASIC, then moved on to Turbo Pascal on a 286, eventually exploring more modern languages and web development. I’m also deeply interested in science, which led me to attend a math-focused preparatory program. Later, I studied psychology with a focus on the cognitive aspects of artificial intelligence.