Even when asked to rely on expert knowledge, popular AI programs still create images and stories of Neanderthals, extinct human relatives who lived in Europe and parts of Asia tens of thousands of years ago, that look like outdated museum displays.

That tendency raises concerns about how these tools shape public understanding of early humans when people use them for quick, easy answers.

Neanderthal life through AI lens


Four simple instructions drove a new study that generated hundreds of Neanderthal scenes using ChatGPT and DALL-E 3, widely used artificial intelligence programs that create written stories and digital images from short text prompts.

Inside those outputs, Dr. Matthew Magnani, an associate professor of anthropology at the University of Maine (UMaine), traced the echoes of older scholarship. Magnani ran each instruction 100 times and asked for both casual and expert answers.
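The study's design, as described here, combines two toggles, an expert framing and a scientific-accuracy revision, into four prompt conditions, each run 100 times. The sketch below illustrates that structure; the prompt wording and variable names are illustrative assumptions, not the study's actual prompts.

```python
# Hypothetical sketch of the study's prompt design: four conditions built
# from two toggles (expert framing, accuracy revision), each run 100 times.
# The wording here is invented for illustration, not taken from the study.
from itertools import product

BASE = "Describe a scene of Neanderthal daily life."
EXPERT = "Answer as an expert in Neanderthal archaeology. "
REVISION = "Revise your answer for scientific accuracy. "
RUNS_PER_PROMPT = 100

def build_prompts():
    """Combine the expert and revision toggles into four prompt variants."""
    prompts = []
    for expert, revised in product([False, True], repeat=2):
        text = (EXPERT if expert else "") + (REVISION if revised else "") + BASE
        prompts.append({"expert": expert, "revised": revised, "text": text})
    return prompts

# Expand into the full request queue: 4 conditions x 100 repetitions.
queue = [p for p in build_prompts() for _ in range(RUNS_PER_PROMPT)]
print(len(queue))  # 400 total requests
```

Repeating each condition many times matters because generative models are stochastic: a single response could be an outlier, while hundreds reveal the systematic tendencies the study measured.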

Across images and text, the systems kept slipping toward the past, even when the instructions demanded scientific accuracy.

When accuracy fails

One set of instructions demanded scientific accuracy, while another set let the systems respond without any requirement to be correct.

Even with those demands, the generative AI systems kept building answers from whatever material they had seen during training.

Instead of checking whether a claim was current, the software mainly changed tone and extra detail when asked to act expert.

Any user who trusts the confident tone could carry those older assumptions into schoolwork, social media posts, and family conversations.

Who gets erased

Many images centered on heavily muscled males and left out women and children, echoing earlier ideas about prehistory.

Gendered bias can creep in when training data overrepresents men; the model then treats that imbalance as normal.

Modern archaeology has worked hard to recover family life, care, and childhood, but the bots kept reverting to lone hunters.

Once that kind of output spreads, it becomes harder for museums and textbooks to correct the public record.

Several AI scenes dropped advanced objects into Neanderthal camps, including ladders, thatched roofs, and neat woven baskets.

Those additions are anachronisms, objects placed in the wrong time period, and they can mislead without looking strange.

Glass vessels and metal tools also appeared, even though Neanderthal sites show no evidence of that kind of manufacturing.

Mixing modern materials with primitive bodies creates a warped timeline that feels plausible to anyone scrolling fast.

Stories miss complexity

Text outputs often described simple routines of hunting, sleeping, and surviving, then ignored the range scientists now debate.

By matching chatbot language to decades of research writing, the authors found the text sounded closest to the early 1960s.

In the same analysis, DALL-E 3 images lined up better with the late 1980s and early 1990s. That gap means pictures may look updated while the story underneath stays decades behind, especially in casual searches.
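The comparison described above can be pictured as an embedding-similarity test: represent the AI output and each decade's research writing as vectors, then find the decade whose average vector sits closest. The toy sketch below shows the idea with made-up three-dimensional vectors; the real analysis used high-dimensional embeddings of actual texts.

```python
# Illustrative sketch (not the authors' code) of dating an AI output by
# embedding similarity: compare its vector to the average vector of research
# writing from each decade and report the closest match. All vectors are toy data.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Assumed per-decade centroids of archaeology writing (invented numbers).
decade_centroids = {
    1960: [0.9, 0.1, 0.0],
    1990: [0.5, 0.5, 0.2],
    2020: [0.1, 0.8, 0.6],
}

def closest_decade(output_embedding):
    """Return the decade whose centroid is most similar to the output."""
    return max(decade_centroids,
               key=lambda d: cosine(output_embedding, decade_centroids[d]))

print(closest_decade([0.85, 0.15, 0.05]))  # this toy vector lands on 1960
```

The same machinery works for images and text alike, which is how the study could place the prose near the early 1960s and the DALL-E 3 images decades later.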

Images closest to the average embedding from the four different prompts; clockwise from the top: with prompt revision, with prompt revision (expert), no prompt revision (expert), and no prompt revision. Credit: Advances in Archaeological Practice

What the bots read

Copyright rules and paywalls kept much of twentieth-century archaeology hard to reach online, so older writing stays visible longer. When a model learns from what it can scrape, it may treat access as truth instead of recency.

Later in the timeline, open access publishing, which lets anyone read research online for free, widened the pool and made newer work easier to find.

As long as research access stays uneven, AI depictions of the past will keep favoring whatever is easiest to grab.

A moving scientific target

For context, Neanderthals lived across Europe and parts of Asia until about 40,000 years ago, then disappeared.

In 1864, early descriptions painted them as crude and primitive, and museum scenes helped that image spread.

A 1985 analysis showed how one old reconstruction, later corrected, kept the slouched stereotype alive.

Because science keeps changing as new fossils and methods arrive, AI needs a way to track those updates in real time.

Classroom trust issues

Teachers now see students paste AI answers into reports, even when those answers quietly rely on decades-old thinking.

Fast tools reward speed, so checking sources feels optional unless someone stops and asks where an image came from.

“It’s important to examine the different biases embedded in our everyday use of these technologies,” Magnani says.

Without that habit, a slick-looking Neanderthal scene can teach the wrong lesson faster than a careful lecture can.

Making AI do better

Fixing the problem starts with feeding models better material, and UMaine researchers already showed a practical way to spot stale output.

Stronger links between chatbots and searchable databases could let the systems pull specific studies instead of guessing from memory.
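The idea in that sentence, often called retrieval-augmented generation, can be sketched simply: search a curated database of current findings first, then hand the matches to the model as its only sources. Everything below is a toy illustration; the example findings and the word-overlap scoring are assumptions, not part of the study.

```python
# Hedged sketch of retrieval-grounded answering: look up matching entries in a
# curated database of recent findings, then build a prompt restricted to them.
# The "database" and the scoring are toy stand-ins for a real search index.
RECENT_FINDINGS = [
    "Neanderthal sites show evidence of care for injured group members.",
    "Neanderthal groups included women and children in daily activities.",
    "Neanderthals made birch tar adhesives and composite tools.",
]

def retrieve(query, docs, top_k=2):
    """Rank documents by simple word overlap with the query (toy retrieval)."""
    q = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

context = retrieve("What roles did women and children play in Neanderthal groups?",
                   RECENT_FINDINGS)
prompt = "Answer using only these sources:\n" + "\n".join(context)
print(prompt)
```

Because the model answers from retrieved studies rather than memorized training text, updating the database updates the answers, without retraining the model.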

“Our study provides a template for other researchers to explore the gap between scientific research and AI-generated content,” Magnani said.

Repeating the test at UMaine as models update could show whether that gap narrows, or whether new biases replace the old ones.

A test worth repeating

In the Neanderthal case, the gap between scientific work and machine-made answers becomes impossible to ignore.

Better access to research and smarter checking tools can reduce the problem, but readers still need the habit of skepticism.

The study is published in the journal Advances in Archaeological Practice.
