Chen, J. H. & Asch, S. M. Machine learning and prediction in medicine—beyond the peak of inflated expectations. N. Engl. J. Med. 376, 2507–2509 (2017).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Seneviratne, M. G., Shah, N. H. & Chu, L. Bridging the implementation gap of machine learning in healthcare. BMJ Innov. 6, 45–47 (2020).

Article 

Google Scholar
 

Sendak, M. P. et al. A path for translation of machine learning products into healthcare delivery. EMJ Innov. https://doi.org/10.33590/emjinnov/19-00172 (2020).

Article 

Google Scholar
 

McCradden, M. D., Stephenson, E. A. & Anderson, J. A. Clinical research underlies ethical integration of healthcare artificial intelligence. Nat. Med. 26, 1325–1326 (2020).

Article 
CAS 
PubMed 

Google Scholar
 

Sendak, M. et al. Editorial. Surfacing best practices for AI software development and integration in healthcare. Front. Digit. Health 5, 1150875 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25, 1337–1340 (2019).

Article 
CAS 
PubMed 

Google Scholar
 

Morse, K. E., Bagley, S. C. & Shah, N. H. Estimate the hidden deployment cost of predictive models to improve patient care. Nat. Med. 26, 18–19 (2020).

Article 
CAS 
PubMed 

Google Scholar
 

McCradden, M. D. et al. A research ethics framework for the clinical translation of healthcare machine learning. Am. J. Bioeth. 22, 8–22 (2022).

Article 
PubMed 

Google Scholar
 

Futoma, J., Simons, M., Panch, T., Doshi-Velez, F. & Celi, L. A. The myth of generalisability in clinical research and machine learning in health care. Lancet Digit. Health 2, e489–e492 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Kim, C. et al. Multicentre external validation of a commercial artificial intelligence software to analyse chest radiographs in health screening environments with low disease prevalence. Eur. Radiol. 33, 3501–3509 (2023).

Article 
PubMed 

Google Scholar
 

Wong, A. et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern. Med. 181, 1065–1070 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Harvey, H. B. & Gowda, V. How the FDA regulates AI. Acad. Radiol. 27, 58–61 (2020).

Article 
PubMed 

Google Scholar
 

Wu, E. et al. How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals. Nat. Med. 27, 582–584 (2021).

Article 
CAS 
PubMed 

Google Scholar
 

McCradden, M. D. et al. CANAIRI: the Collaboration for Translational Artificial Intelligence Trials in healthcare. Nat. Med. 31, 9–11 (2025).

Article 
CAS 
PubMed 

Google Scholar
 

Arksey, H. & O’Malley, L. Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8, 19–32 (2005).

Article 

Google Scholar
 

Manz, C. R. et al. Validation of a machine learning algorithm to predict 180-day mortality for outpatients with cancer. JAMA Oncol. 6, 1723–1730 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Kwong, J. C. C. et al. When the model trains you: induced belief revision and its implications on artificial intelligence research and patient care—a case study on predicting obstructive hydronephrosis in children. NEJM AI 1, AIcs2300004 (2024).

Article 

Google Scholar
 

Sendak, M. P. et al. Real-world integration of a sepsis deep learning technology into routine clinical care: implementation study. JMIR Med. Inform. 8, e15182 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Jauk, S. et al. Risk prediction of delirium in hospitalized patients using machine learning: an implementation and prospective evaluation study. J. Am. Med. Inform. Assoc. 27, 1383–1392 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Stephen, R. J. et al. Sepsis prediction in hospitalized children: clinical decision support design and deployment. Hosp. Pediatr. 13, 751–759 (2023).

Article 
PubMed 

Google Scholar
 

Aakre, C. et al. Prospective validation of a near real-time EHR-integrated automated SOFA score calculator. Int. J. Med. Inform. 103, 1–6 (2017).

Article 
PubMed 

Google Scholar
 

R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2025); https://www.r-project.org/

Posit Team. RStudio: Integrated Development Environment for R (Posit Software, PBC, 2025).

Escalé-Besa, A. et al. Exploring the potential of artificial intelligence in improving skin lesion diagnosis in primary care. Sci. Rep. 13, 4293 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Rajakariar, K. et al. Accuracy of a smartwatch based single-lead electrocardiogram device in detection of atrial fibrillation. Heart 106, 665–670 (2020).

Article 
PubMed 

Google Scholar
 

Tan, P., Nyeko-Lacek, M., Walsh, K., Sheikh, Z. & Lewis, C. J. Artificial intelligence-enhanced multispectral imaging for burn wound assessment: insights from a multi-centre UK evaluation. Burns 51, 107550 (2025).

Article 
PubMed 

Google Scholar
 

Morse, K. E. et al. Monitoring approaches for a pediatric chronic kidney disease machine learning model. Appl. Clin. Inform. 13, 431–438 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Afshar, M. et al. Deployment of real-time natural language processing and deep learning clinical decision support in the electronic health record: pipeline implementation for an opioid misuse screener in hospitalized adults. JMIR Med. Inform. 11, e44977 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Sheppard, J. P. et al. Prospective external validation of the Predicting Out-of-OFfice Blood Pressure (PROOF-BP) strategy for triaging ambulatory monitoring in the diagnosis and management of hypertension: observational cohort study. BMJ 361, k2478 (2018).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Wong, A. I. et al. Prediction of acute respiratory failure requiring advanced respiratory support in advance of interventions and treatment: a multivariable prediction model from electronic medical record data. Crit. Care Explor. 3, e0402 (2021).

PubMed 
PubMed Central 

Google Scholar
 

Ganapathi, S. et al. Tackling bias in AI health datasets through the STANDING Together initiative. Nat. Med. 28, 2232–2233 (2022).

Article 
CAS 
PubMed 

Google Scholar
 

Ouyang, D. et al. Video-based AI for beat-to-beat assessment of cardiac function. Nature 580, 252–256 (2020).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Razavian, N. et al. A validated, real-time prediction model for favorable outcomes in hospitalized COVID-19 patients. NPJ Digit. Med. 3, 130 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Pou-Prom, C., Murray, J., Kuzulugil, S., Mamdani, M. & Verma, A. A. From compute to care: lessons learned from deploying an early warning system into clinical practice. Front. Digit. Health 4, 932123 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Aakre, C. A., Kitson, J. E., Li, M. & Herasevich, V. Iterative user interface design for automated sequential organ failure assessment score calculator in sepsis detection. JMIR Hum. Factors 4, e14 (2017).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Brajer, N. et al. Prospective and external evaluation of a machine learning model to predict in-hospital mortality of adults at time of admission. JAMA Netw. Open 3, e1920733 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Nemeth, C. et al. TCCC decision support with machine learning prediction of hemorrhage risk, shock probability. Mil. Med. 188, 659–665 (2023).

Article 
PubMed 

Google Scholar
 

Shelov, E. et al. Design and implementation of a pediatric ICU acuity scoring tool as clinical decision support. Appl. Clin. Inform. 9, 576–587 (2018).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Bedoya, A. D. et al. Machine learning for early detection of sepsis: an internal and temporal validation study. JAMIA Open 3, 252–260 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Artificial Intelligence/Machine Learning-enabled Working Group. Good Machine Learning Practice for medical device development: guiding principles. FDA https://www.fda.gov/medical-devices/software-medical-device-samd/good-machine-learning-practice-medical-device-development-guiding-principles (2025).

DECIDE-AI Steering Group. DECIDE-AI: new reporting guidelines to bridge the development-to-implementation gap in clinical artificial intelligence. Nat. Med. 27, 186–187 (2021).

Article 

Google Scholar
 

Rivera, S. C. et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Lancet Digit. Health 2, e549–e560 (2020).

Article 

Google Scholar
 

Liu, X. et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Lancet Digit. Health 2, e537–e548 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Moher, D. Guidelines for reporting health care research: advancing the clarity and transparency of scientific reporting. Can. J. Anaesth. 56, 96–101 (2009).

Article 
PubMed 

Google Scholar
 

Evidence standards framework for digital health technologies. Section C: evidence standards tables. NICE https://www.nice.org.uk/corporate/ecd7/chapter/section-c-evidence-standards-tables (2018).

Gaube, S. et al. Do as AI say: susceptibility in deployment of clinical decision-aids. NPJ Digit. Med. 4, 31 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Chromik, M., Eiband, M., Buchner, F., Krüger, A. & Butz, A. I think I get your point, AI! The illusion of explanatory depth in explainable AI. In 26th International Conference on Intelligent User Interfaces 307–317 (Association for Computing Machinery, 2021); https://doi.org/10.1145/3397481.3450644

Felmingham, C. M. et al. The importance of incorporating human factors in the design and implementation of artificial intelligence for skin cancer diagnosis in the real world. Am. J. Clin. Dermatol. 22, 233–242 (2021).

Article 
PubMed 

Google Scholar
 

Tikhomirov, L. et al. Medical artificial intelligence for clinicians: the lost cognitive perspective. Lancet Digit. Health 6, e589–e594 (2024).

Article 
CAS 
PubMed 

Google Scholar
 

Park, Y. et al. Evaluating artificial intelligence in medicine: phases of clinical research. JAMIA Open 3, 326–331 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Lemmon, J. et al. Evaluation of feature selection methods for preserving machine learning performance in the presence of temporal dataset shift in clinical medicine. Methods Inf. Med. 62, 60–70 (2023).

Article 
PubMed 

Google Scholar
 

Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17, 195 (2019).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Finlayson, S. G. et al. The clinician and dataset shift in artificial intelligence. N. Engl. J. Med. 385, 283–286 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Badgeley, M. A. et al. Deep learning predicts hip fracture using confounding patient and healthcare variables. NPJ Digit. Med. 2, 31 (2019).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Bozkurt, S. et al. Reporting of demographic data and representativeness in machine learning models using electronic health records. J. Am. Med. Inform. Assoc. 27, 1878–1884 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Plana, D. et al. Randomized clinical trials of machine learning interventions in health care: a systematic review. JAMA Netw. Open 5, e2233946 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).

Article 
CAS 
PubMed 

Google Scholar
 

McCradden, M. D., Joshi, S., Mazwi, M. & Anderson, J. A. Ethical limitations of algorithmic fairness solutions in health care machine learning. Lancet Digit. Health 2, e221–e223 (2020).

Article 
PubMed 

Google Scholar
 

McCradden, M. et al. What’s fair is… fair? Presenting JustEFAB, an ethical framework for operationalizing medical ethics and social justice in the integration of clinical machine learning: JustEFAB. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency 1505–1519 (Association for Computing Machinery, 2023); https://dl.acm.org/doi/abs/10.1145/3593013.3594096

Gichoya, J. W. et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit. Health 4, e406–e414 (2022).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Arora, A. et al. The value of standards for health datasets in artificial intelligence-based applications. Nat. Med. 29, 2929–2938 (2023).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

McCradden, M. D. et al. What makes a ‘good’ decision with artificial intelligence? A grounded theory study in paediatric care. BMJ Evid. Based Med. 30, 183–193 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Assadi, A. et al. An integration engineering framework for machine learning in healthcare. Front. Digit. Health 4, 932411 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Militello, L. G. et al. Using human factors methods to mitigate bias in artificial intelligence-based clinical decision support. J. Am. Med. Inform. Assoc. 32, 398–403 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Campbell, N. C. et al. Designing and evaluating complex interventions to improve health care. BMJ 334, 455–459 (2007).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Aromataris, E. et al. (eds) JBI Manual for Evidence Synthesis (JBI, 2024); https://synthesismanual.jbi.global

Tricco, A. et al. PRISMA extension for scoping reviews (PRISMAScR): checklist and explanation. Ann. Intern. Med. 169, 467–473 (2018).

Article 
PubMed 

Google Scholar
 

Breiman, L. Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat. Sci. 16, 199–231 (2001).

Article 

Google Scholar
 

Covidence Systematic Review Software https://www.covidence.org (Veritas Health Innovation, 2025).

Abràmoff, M. D., Lavin, P. T., Birch, M., Shah, N. & Folk, J. C. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit. Med. 1, 39 (2018).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Tonekaboni, S. et al. How to validate machine learning models prior to deployment: silent trial protocol for evaluation of real-time models at ICU. In Proceedings of the Conference on Health, Inference, and Learning Vol. 174 (eds Flores, G. et al.) 169–182 (PMLR, 2022).

Sendak, M. et al. “The human body is a black box”: supporting clinical decision-making with deep learning. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency 99–109 (Association for Computing Machinery, 2020); https://doi.org/10.1145/3351095.3372827

Balagopalan, A. et al. Machine learning for healthcare that matters: reorienting from technical novelty to equitable impact. PLOS Digit. Health 3, e0000474 (2024).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Papoutsi, C., Wherton, J., Shaw, S., Morrison, C. & Greenhalgh, T. Putting the social back into sociotechnical: case studies of co-design in digital health. J. Am. Med. Inform. Assoc. 28, 284–293 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Alrajhi, A. A. et al. Data-driven prediction for COVID-19 severity in hospitalized patients. Int. J. Environ. Res. Public Health 19, 2958 (2022).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Aydın, E. et al. Diagnostic accuracy of a machine learning-derived appendicitis score in children: a multicenter validation study. Children (Basel) 12, 937 (2025).

PubMed 
PubMed Central 

Google Scholar
 

Bachelot, G. et al. A machine learning approach for the prediction of testicular sperm extraction in nonobstructive azoospermia: algorithm development and validation study. J. Med. Internet Res. 25, e44047 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Berg, W. A. et al. Toward AI-supported US triage of women with palpable breast lumps in a low-resource setting. Radiology 307, e223351 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Butler, H. J. et al. Development of high-throughput ATR-FTIR technology for rapid triage of brain cancer. Nat. Commun. 10, 4501 (2019).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Campanella, G. et al. Real-world deployment of a fine-tuned pathology foundation model for lung cancer biomarker detection. Nat. Med. 31, 3002–3010 (2025).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Chen, Y. et al. Endoscopic ultrasound-based radiomics for predicting pathologic upgrade in esophageal low-grade intraepithelial neoplasia. Surg. Endosc. 39, 2239–2249 (2025).

Article 
PubMed 

Google Scholar
 

Cheng, Y. et al. Two-year hypertension incidence risk prediction in populations in the desert regions of northwest China: prospective cohort study. J. Med. Internet Res. 27, e68442 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Chiang, D.-H., Jiang, Z., Tian, C. & Wang, C.-Y. Development and validation of a dynamic early warning system with time-varying machine learning models for predicting hemodynamic instability in critical care: a multicohort study. Crit. Care 29, 318 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Chufal, K. S. et al. Machine learning model for predicting DIBH non-eligibility in left-sided breast cancer radiotherapy: development, validation and clinical impact analysis. Radiother. Oncol. 205, 110764 (2025).

Article 
PubMed 

Google Scholar
 

Coley, R. Y., Walker, R. L., Cruz, M., Simon, G. E. & Shortreed, S. M. Clinical risk prediction models and informative cluster size: assessing the performance of a suicide risk prediction algorithm. Biom. J. 63, 1375–1388 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Corbin, C. K. et al. DEPLOYR: a technical framework for deploying custom real-time machine learning models into the electronic medical record. J. Am. Med. Inform. Assoc. 30, 1532–1542 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Dave, C. et al. Prospective real-time validation of a lung ultrasound deep learning model in the ICU. Crit. Care Med. 51, 301–309 (2023).

Article 
PubMed 

Google Scholar
 

El Moheb, M. et al. An open-architecture AI model for CPT coding in breast surgery: development, validation, and prospective testing. Ann. Surg. 282, 439–448 (2025).

Article 
PubMed 

Google Scholar
 

Faqar-Uz-Zaman, S. F. et al. The diagnostic efficacy of an app-based diagnostic health care application in the emergency room: eRadaR-trial. A prospective, double-blinded, observational study. Ann. Surg. 276, 935–942 (2022).

Article 
PubMed 

Google Scholar
 

Felmingham, C. et al. Improving Skin cancer Management with ARTificial Intelligence (SMARTI): protocol for a preintervention/postintervention trial of an artificial intelligence system used as a diagnostic aid for skin cancer management in a specialist dermatology setting. BMJ Open 12, e050203 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Feng, W. et al. Identifying small thymomas from other asymptomatic anterior mediastinal nodules based on CT images using logistic regression. Front. Oncol. 15, 1590710 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Hanley, D. et al. Emergency department triage of traumatic head injury using a brain electrical activity biomarker: a multisite prospective observational validation trial. Acad. Emerg. Med. 24, 617–627 (2017).

Article 
PubMed 

Google Scholar
 

Hoang, M. T. et al. Evaluating the utility of a clinical sepsis AI tool in emergency waiting rooms: a preliminary silent trial. Stud. Health Technol. Inform. 329, 307–311 (2025).

PubMed 

Google Scholar
 

Im, H. et al. Design and clinical validation of a point-of-care device for the diagnosis of lymphoma via contrast-enhanced microholography and machine learning. Nat. Biomed. Eng. 2, 666–674 (2018).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Korfiatis, P. et al. Automated artificial intelligence model trained on a large data set can detect pancreas cancer on diagnostic computed tomography scans as well as visually occult preinvasive cancer on prediagnostic computed tomography scans. Gastroenterology 165, 1533–1546 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Kramer, D. et al. Machine learning-based prediction of malnutrition in surgical in-patients: a validation pilot study. Stud. Health Technol. Inform. 313, 156–157 (2024).

PubMed 

Google Scholar
 

Kwong, J. C. C. et al. The silent trial—the bridge between bench-to-bedside clinical AI applications. Front. Digit. Health 4, 929508 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Liu, R. et al. Development and prospective validation of postoperative pain prediction from preoperative EHR data using attention-based set embeddings. NPJ Digit. Med. 6, 209 (2023).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Liu, Y. et al. Validation of an established TW3 artificial intelligence bone age assessment system: a prospective, multicenter, confirmatory study. Quant. Imaging Med. Surg. 14, 144–159 (2024).

Article 
PubMed 

Google Scholar
 

Luo, H. et al. Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: a multicentre, case–control, diagnostic study. Lancet Oncol. 20, 1645–1654 (2019).

Article 
CAS 
PubMed 

Google Scholar
 

Lupei, M. I. et al. A 12-hospital prospective evaluation of a clinical decision support prognostic algorithm based on logistic regression as a form of machine learning to facilitate decision making for patients with suspected COVID-19. PLoS ONE 17, e0262193 (2022).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Mahajan, A. et al. Development and validation of a machine learning model to identify patients before surgery at high risk for postoperative adverse events. JAMA Netw. Open 6, e2322285 (2023).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Major, V. J. & Aphinyanaphongs, Y. Development, implementation, and prospective validation of a model to predict 60-day end-of-life in hospitalized adults upon admission at three sites. BMC Med. Inform. Decis. Mak. 20, 214 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Miró Catalina, Q. et al. Real-world testing of an artificial intelligence algorithm for the analysis of chest X-rays in primary care settings. Sci. Rep. 14, 5199 (2024).

Article 
PubMed 
PubMed Central 

Google Scholar
 

O’Brien, C. et al. Development, implementation, and evaluation of an in-hospital optimized early warning score for patient deterioration. MDM Policy Pract. 5, 2381468319899663 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Pan, Y. et al. An interpretable machine learning model based on optimal feature selection for identifying CT abnormalities in patients with mild traumatic brain injury. EClinicalMedicine 82, 103192 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Pyrros, A. et al. Opportunistic detection of type 2 diabetes using deep learning from frontal chest radiographs. Nat. Commun. 14, 4039 (2023).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Qian, Y.-F., Zhou, J.-J., Shi, S.-L. & Guo, W.-L. Predictive model integrating deep learning and clinical features based on ultrasound imaging data for surgical intervention in intussusception in children younger than 8 months. BMJ Open 15, e097575 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Rawson, T. M. et al. Supervised machine learning to support the diagnosis of bacterial infection in the context of COVID-19. JAC Antimicrob. Resist. 3, dlab002 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Ren, L.-J. et al. Artificial intelligence assisted identification of newborn auricular deformities via smartphone application. EClinicalMedicine 81, 103124 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Schinkel, M. et al. Diagnostic stewardship for blood cultures in the emergency department: a multicenter validation and prospective evaluation of a machine learning prediction tool. EBioMedicine 82, 104176 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Shah, P. K. et al. A simulated prospective evaluation of a deep learning model for real-time prediction of clinical deterioration among ward patients. Crit. Care Med. 49, 1312–1321 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Shamout, F. E. et al. An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department. NPJ Digit. Med. 4, 80 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Shi, Y.-H. et al. Construction and validation of machine learning-based predictive model for colorectal polyp recurrence one year after endoscopic mucosal resection. World J. Gastroenterol. 31, 102387 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Smith, S. J., Bradley, S. A., Walker-Stabeler, K. & Siafakas, M. A prospective analysis of screen-detected cancers recalled and not recalled by artificial intelligence. J. Breast Imaging 6, 378–387 (2024).

Article 
PubMed 

Google Scholar
 

Stamatopoulos, N. et al. Temporal and external validation of the algorithm predicting first trimester outcome of a viable pregnancy. Aust. N. Z. J. Obstet. Gynaecol. 65, 128–134 (2025).

Article 
PubMed 

Google Scholar
 

Swinnerton, K. et al. Leveraging near-real-time patient and population data to incorporate fluctuating risk of severe COVID-19: development and prospective validation of a personalised risk prediction tool. EClinicalMedicine 81, 103114 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Tariq, A., Patel, B. N., Sensakovic, W. F., Fahrenholtz, S. J. & Banerjee, I. Opportunistic screening for low bone density using abdominopelvic computed tomography scans. Med. Phys. 50, 4296–4307 (2023).

Article 
CAS 
PubMed 

Google Scholar
 

Titano, J. J. et al. Automated deep-neural-network surveillance of cranial images for acute neurologic events. Nat. Med. 24, 1337–1341 (2018).

Article 
CAS 
PubMed 

Google Scholar
 

Vaid, A. et al. Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation. J. Med. Internet Res. 22, e24018 (2020).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Wall, P. D. H., Hirata, E., Morin, O., Valdes, G. & Witztum, A. Prospective clinical validation of virtual patient-specific quality assurance of volumetric modulated arc therapy radiation therapy plans. Int. J. Radiat. Oncol. Biol. Phys. 113, 1091–1102 (2022).

Article 
PubMed 

Google Scholar
 

Wan, C.-F. et al. Radiomics of multimodal ultrasound for early prediction of pathologic complete response to neoadjuvant chemotherapy in breast cancer. Acad. Radiol. 32, 1861–1873 (2025).

Article 
PubMed 

Google Scholar
 

Wang, X. et al. Prediction of the 1-year risk of incident lung cancer: prospective study using electronic health records from the state of Maine. J. Med. Internet Res. 21, e13260 (2019).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Wang, L., Wu, H., Wu, C., Shu, L. & Zhou, D. A deep-learning system integrating electrocardiograms and laboratory indicators for diagnosing acute aortic dissection and acute myocardial infarction. Int. J. Cardiol. 423, 133008 (2025).

Article 
PubMed 

Google Scholar
 

Wissel, B. D. et al. Prospective validation of a machine learning model that uses provider notes to identify candidates for resective epilepsy surgery. Epilepsia 61, 39–48 (2020).

Article 
PubMed 

Google Scholar
 

Xie, Z. et al. Enhanced diagnosis of axial spondyloarthritis using machine learning with sacroiliac joint MRI: a multicenter study. Insights Imaging 16, 91 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Ye, C. et al. A real-time early warning system for monitoring inpatient mortality risk: prospective study using electronic medical record data. J. Med. Internet Res. 21, e13719 (2019).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Ye, J.-Z. et al. Nomogram for prediction of the International Study Group of Liver Surgery (ISGLS) grade B/C posthepatectomy liver failure in HBV-related hepatocellular carcinoma patients: an external validation and prospective application study. BMC Cancer 20, 1036 (2020).

Article 
CAS 
PubMed 
PubMed Central 

Google Scholar
 

Yu, S. C. et al. Sepsis prediction for the general ward setting. Front. Digit. Health 4, 848599 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Zhang, Z. et al. Development of an MRI based artificial intelligence model for the identification of underlying atrial fibrillation after ischemic stroke: a multicenter proof-of-concept analysis. EClinicalMedicine 81, 103118 (2025).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Escalé-Besa, A. et al. Using artificial intelligence as a diagnostic decision support tool in skin disease: protocol for an observational prospective cohort study. JMIR Res. Protoc. 11, e37531 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Faqar-Uz-Zaman, S. F. et al. Study protocol for a prospective, double-blinded, observational study investigating the diagnostic accuracy of an app-based diagnostic health care application in an emergency room setting: the eRadaR trial. BMJ Open 11, e041396 (2021).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Felmingham, C. et al. Improving skin cancer management with ARTificial intelligence: a pre–post intervention trial of an artificial intelligence system used as a diagnostic aid for skin cancer management in a real-world specialist dermatology setting. J. Am. Acad. Dermatol. 88, 1138–1142 (2023).

Article 
PubMed 

Google Scholar
 

Miró Catalina, Q., Fuster-Casanovas, A., Solé-Casals, J. & Vidal-Alaball, J. Developing an artificial intelligence model for reading chest x-rays: protocol for a prospective validation study. JMIR Res. Protoc. 11, e39536 (2022).

Article 
PubMed 
PubMed Central 

Google Scholar
 

Sheppard, J. P., Martin, U., Gill, P., Stevens, R. & McManus, R. J. Prospective Register Of patients undergoing repeated OFfice and Ambulatory Blood Pressure Monitoring (PROOF-ABPM): protocol for an observational cohort study. BMJ Open 6, e012607 (2016).

Article 
PubMed 
PubMed Central 

Google Scholar