ZDNET key takeaways:
- Humans are misusing the medical term hallucination to describe AI errors.
- The medical term confabulation is a better approximation of faulty AI output.
- Dropping the term hallucination helps dispel myths about AI.

The expression “AI hallucination” is well-known to anyone who’s experienced ChatGPT or Gemini or Perplexity spouting obvious falsehoods, which is pretty much anyone who’s ever used an AI chatbot.

Only, it’s an expression that’s incorrect. The proper term for when a large language model or other generative AI program asserts falsehoods is not a hallucination but a “confabulation.” AI doesn’t hallucinate, it confabulates.

The word confabulation is also from the psychology literature, just like hallucination, but they mean very different things. 

A hallucination is a conscious sensory perception that is at variance with the stimuli in the environment. A confabulation, on the other hand, is the making of assertions that are at variance with the facts, such as “the president of France is François Mitterrand,” which has not been the case since 1995.

The former implies conscious perception; the latter may involve consciousness in humans, but it can also encompass utterances that don’t involve consciousness and are merely inaccurate statements.

Psychologists are beginning to push back on the use of hallucination and emphasize the importance of using confabulation. 

“The medical term hallucination, borrowed from human experience and its disorders, does not accurately describe this malfunction of AI,” wrote Gerald Wiest of the Department of Neurology at the Medical University of Vienna, Austria, and Oliver H. Turnbull of the Department of Psychology at Bangor University in the UK, in the October issue of the New England Journal of Medicine Artificial Intelligence, a sister publication of the prestigious medical journal.

“We argue that the medical term ‘confabulation’ provides a more precise description than hallucination,” they wrote.

The problem with misconceiving AI 

The distinction is crucial for anyone using generative AI, as terms can create myths and misconceptions that lead to unrealistic and even dangerous expectations about the technology.

As described in a New York Times investigation published this week by reporters Kashmir Hill and Jennifer Valentino-DeVries, when users relate to AI chatbots as confidants and friends, ascribing conscious intent to the bots, they can come to ascribe a truth and importance to the bots’ output that has disastrous consequences.

“The Times has uncovered nearly 50 cases of people having mental health crises during conversations with ChatGPT,” they wrote, among which, “nine were hospitalized; three died.”

The authors don’t ascribe any of that specifically to the term hallucination, but hallucination is one of those misapplied terms that imply agency and consciousness on the part of what is simply a software program producing convincing-sounding output.

People are inclined to attribute sentience and even consciousness to the technology, but there’s no evidence of either. It’s clear that the language we use influences such views.

In an ominous precedent for the mistaken views outlined in the Times article, even supposed experts in AI technology have ascribed AI models’ impressive text generation to sentience or consciousness.

Months before the release of ChatGPT, in the summer of 2022, former Google engineer Blake Lemoine urged the company to take seriously his assertion that the then-cutting-edge AI model LaMDA was a sentient entity.

Lemoine, after spending hours upon hours chatting with the bot, made the case that LaMDA probably was sentient because “it has worries about the future and reminisces about the past.”

Lemoine’s conviction was evidence that people will talk themselves into ascribing qualities to the machine by employing psychological terms such as “worry” or “fear.”

Again, the language we use to describe AI is pivotal to how humans see AI, and for that reason, borrowed terms such as hallucination should be examined and maybe even discarded. 

The twisted history of AI hallucinations

According to both Gemini and ChatGPT, the term “hallucination” has a long and rich history in artificial intelligence, preceding its recent use. 

An early use came from Eric Mjolsness in the 1980s, in work applying neural networks to fingerprint recognition. Mjolsness used “hallucination” in a positive sense, to describe a computer vision system’s ability to extract a clean pattern of fingerprint ridges from a noisy image.

Decades later, but before the emergence of ChatGPT, the term started to take on a negative connotation. An example is a 2015 blog post by Andrej Karpathy, a former Tesla AI scientist and co-founder of OpenAI, discussing neural networks that generate text. 

Karpathy observed that neural networks could generate convincing examples of Wikipedia entries or mathematical formulas, but that they also invented fake web URLs and meaningless equations, writing, “the model just hallucinated it.”

With the explosion in popular use of ChatGPT and large language models, the public increasingly described the shortcomings of AI as hallucinations. 

But the term has even spread to scholarly work, where those who should know better have made sloppy and inconsistent use of it. Negar Maleki and colleagues at the University of Maryland, in a survey last year, identified 333 papers with references to “AI hallucination,” “hallucination in AI,” and similar terms, and concluded, “the term ‘AI hallucination’ lacks a precise, universally accepted definition.”

Confabulation seems a better analogy 

Scholars such as Karpathy know a lot about AI, but they’re not doctors, and they’re not psychologists, and it pays to listen to what those disciplines have to say. 

For years now, medical professionals have been trying to tell us we don’t know what we’re talking about when we talk about AI hallucinating. 

“Hallucination is a medical term used to describe a sensory perception occurring in the absence of an external stimulus,” wrote Søren Dinesen Østergaard and colleagues at the Department of Clinical Medicine at Aarhus University in Denmark, in a 2023 survey of AI literature.

“AI models do not have sensory perceptions as such — and when they make errors, it does not occur in the absence of external stimulus,” they wrote. “Rather, the data on which AI models are trained can (metaphorically) be considered as external stimuli — as can the prompts eliciting the (occasionally false) responses.”

In other words, the analogy doesn’t fit the definition of hallucination even in the most basic sense.

In their NEJM AI paper, Wiest and Turnbull described the case against hallucination and in favor of confabulation as a less-bad analogy. 

“A hallucination is a spontaneous perception in any sensory modality (e.g., visual, auditory, olfactory), without a genuine external stimulus,” they wrote. As such, “they are essentially passive phenomena rooted in conscious (mis)perception. Critically, AI lacks this conscious element.”

In contrast, they wrote, “‘Confabulation’, on the other hand, refers to the active generation of objectively false information or opinions that again bear no relation to reality,” and, “If the analogy of AI malfunctions mirroring the human mind is to be maintained, these AI errors clearly take the form of active confabulatory generation, rather than a passive and conscious hallucinatory perception.”

Wiest and Turnbull’s points echo remarks I’ve heard for a long time from neuroscientists, including those who extol the achievements of AI. 

In an interview for ZDNET last year, AI scholar Terry Sejnowski, who has developed neural network technology for over four decades, and who is also a trained neuroscientist working at the Salk Institute in La Jolla, California, told me, “AI has renamed everything: the ‘hallucination,’ in neuroscience, is called confabulation, which I think is closer to what’s really going on.”

Scholars are beginning to incorporate confabulation into their writing about AI.  

In a research paper published in April in the prestigious Journal of the American Medical Association, JAMA, Peter Elkin, M.D., and colleagues at the Department of Biomedical Informatics at the University at Buffalo, New York, described results of running large language models on medical board exams.

When it came time to discuss errors, Elkin and team were careful to refer to confabulations. “We defined confabulation as answering a question (vs remaining silent) with the wrong answer (ie, a false-positive response),” they wrote. “We measured the confabulation rate as the count of wrong nonnull answers.”
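
To make that metric concrete, here is a minimal sketch in Python of how a confabulation rate in the spirit of their definition could be computed. It is an illustration, not code from the JAMA paper: the function name, the use of None to represent the model “remaining silent,” and the choice to report the count as a share of all questions are assumptions of this sketch.

```python
# Minimal sketch (not from the Elkin et al. paper): count answers that are
# both non-null and wrong, following the quoted definition of confabulation.
# Assumptions: answers are strings, None means the model declined to answer,
# and the rate is reported as a share of all questions asked.

def confabulation_stats(model_answers, correct_answers):
    """Return (count of wrong non-null answers, that count / total questions)."""
    wrong_nonnull = sum(
        1
        for given, correct in zip(model_answers, correct_answers)
        if given is not None and given != correct
    )
    total = len(correct_answers)
    return wrong_nonnull, (wrong_nonnull / total if total else 0.0)

# Example: one correct answer, one abstention, one confabulated answer.
count, rate = confabulation_stats(["B", None, "D"], ["B", "C", "A"])
print(count, rate)  # 1 0.333...
```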

Let’s set the record straight

Confabulation is probably not an ideal term, either. In their 2023 paper, Østergaard and team warned that any references to psychological terms in AI could “stigmatize” actual human conditions such as schizophrenia by associating human hallucination with a malfunction. They proposed instead describing AI errors with terms such as “non-sequitur” or “unrelated response.”

And in a study of AI models in medical diagnosis published in May in JAMA, Mitchell Feldman and fellow M.D.s at the computer science laboratory of Massachusetts General Hospital in Boston made the case that confabulation, too, has its issues.

Feldman and team observed that “the most negative characteristics” of the large language models “include […] the lack of reliability (generative AI can ‘confabulate’ or ‘hallucinate’ and craft responses with entirely false facts).” 

They added, “Confabulation or hallucination imply an element of volition or consciousness that cannot yet be ascribed to LLMs at the level of human capability. Confabulation might be better termed an algorithmic shortcoming due to probabilistic adjacency.”

But “algorithmic shortcoming” is not as snappy for most non-technical humans, so some kind of analogizing is probably going to take place.

No analogy is perfect, but to the extent that humans must analogize machine functioning to human mental functioning, the term that doesn’t imply consciousness, confabulation, seems a step in the right direction.

So, let’s set the record straight: People may hallucinate, and they may confabulate, in the sense of asserting what they believe to be true despite the facts. Machines don’t hallucinate in any sense consistent with the term, but they may produce output that is counterfactual in ways we can analogize as confabulations.