Published on 01/10/2025 – 7:00 GMT+2

Artificial intelligence (AI) tools are worse than doctors and nurses at prioritising emergency room patients, a small new study has found.

The findings indicate that while AI holds a great deal of promise in the medical realm, health workers should not outsource to these tools decisions about the care of emergency patients whose lives may be at risk, the researchers said.

“Given the rapid development of AI tools like ChatGPT, we aimed to explore whether AI could support triage decision-making, improve efficiency, and reduce the burden on staff in emergency settings,” Dr Renata Jukneviciene, one of the study’s authors and a researcher at Vilnius University in Lithuania, said in a statement.

The research, which has not yet been reviewed by independent experts or published in a medical journal, was presented at the European Emergency Medicine Congress on Tuesday.

Jukneviciene’s team asked six emergency doctors and 44 nurses to review patient cases – randomly selected from an online medical database – and triage them, or classify them in order of urgency on a 1-5 scale.

The researchers then asked ChatGPT, the commonly used chatbot from OpenAI, to analyse the same cases.

ChatGPT’s overall accuracy rate was 50.4 per cent, compared with 65.5 per cent for nurses and 70.6 per cent for doctors. There was an even bigger gap in sensitivity, or the ability to identify truly urgent cases, with ChatGPT reaching 58.3 per cent compared with 73.8 per cent for nurses and 83 per cent for doctors, the study found.

However, the AI model did outperform nurses at identifying the most urgent or life-threatening cases, with both higher accuracy and higher specificity.
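For context on the statistics cited here: accuracy, sensitivity, and specificity are all derived from a tally of predicted versus actual urgency. The short Python sketch below uses invented case counts, not the study’s data, purely to illustrate how the three figures relate.

# Illustrative sketch only: the case counts are invented, not from the study.
def triage_metrics(tp, fp, tn, fn):
    # tp: urgent cases correctly flagged    fn: urgent cases missed
    # tn: non-urgent correctly cleared      fp: non-urgent wrongly flagged as urgent
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all calls that were right
    sensitivity = tp / (tp + fn)                # share of truly urgent cases caught
    specificity = tn / (tn + fp)                # share of non-urgent cases correctly cleared
    return accuracy, sensitivity, specificity

# Hypothetical batch of 100 cases, 40 of them truly urgent.
print(triage_metrics(tp=28, fp=15, tn=45, fn=12))  # (0.73, 0.7, 0.75)

As the sketch suggests, a tool that over-calls urgency (many false positives) can still score well on sensitivity even as its accuracy and specificity fall, which matches the over-triage pattern the researchers describe below.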

The findings indicate “AI may assist in prioritising the most urgent cases more consistently and in supporting new or less experienced staff,” Jukneviciene said.

However, ChatGPT was far more likely than either doctors or nurses to classify cases as highly urgent, meaning “human oversight” would be important to prevent “inefficiencies,” Jukneviciene said.

“Hospitals should approach AI implementation with caution and focus on training staff to critically interpret AI suggestions,” she said.

The study has some limitations, notably its small sample size and the fact that it took place in a single hospital in Lithuania. The ChatGPT model used in the study had not been trained for medical purposes, meaning AI tools fine-tuned for medicine might have fared better.

Other research suggests that AI could be a boon for the health sector, outperforming human doctors at diagnosing complex medical issues, reading X-rays faster and more accurately, and making it possible to predict future health problems.

But scientists have also warned that an overreliance on AI tools could cause health workers’ skills to degrade over time.

Jukneviciene’s team is now planning additional studies using newer AI models, larger patient groups, and a range of scenarios, such as nurse training and interpretation of electrocardiogram (ECG or EKG) readings that detect unusual heart activity.

In the meantime, she said “AI should not replace clinical judgement, but could serve as a decision-support tool in specific clinical contexts and in overwhelmed emergency departments”.