Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).

Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023).

Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.), Vol. 139 of Proceedings of Machine Learning Research 8748–8763 (PMLR, 2021).

Ramesh, A. et al. Zero-shot text-to-image generation. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.), Vol. 139 of Proceedings of Machine Learning Research 8821–8831 (PMLR, 2021).

Alayrac, J.-B. et al. Flamingo: a visual language model for few-shot learning. In Advances in Neural Information Processing Systems, Vol. 35 (eds Koyejo, S. et al.) 23716–23736 (Curran Associates, 2022).

Dreisbach, J. N. & Lukin, R. Where have all the neuroradiologists gone? AJNR Am. J. Neuroradiol. 22, 1636–1638 (2001).

Rula, E. Y. Radiology workforce shortage and growing demand: something has to give. https://www.acr.org/Practice-Management-Quality-Informatics/ACR-Bulletin/Articles/July-2024/Radiology-Workforce-Shortage-and-Growing-Demand-Something-Has-to-Give (2024).

Christensen, E. W. et al. Association of state share of nonphysician practitioners with diagnostic imaging ordering among emergency department visits for Medicare beneficiaries. JAMA Netw. Open 5, e2241297 (2022).

Fawzy, N. A. et al. Incidence and factors associated with burnout in radiologists: a systematic review. Eur. J. Radiol. Open 11, 100530 (2023).

Krupinski, E. A., Berbaum, K. S., Caldwell, R. T., Schartz, K. M. & Kim, J. Long radiology workdays reduce detection and accommodation accuracy. J. Am. Coll. Radiol. 7, 698–704 (2010).

Ivanovic, V. et al. Neuroradiology diagnostic errors at a tertiary academic centre: effect of participation in tumour boards and physician experience. Clin. Radiol. 77, 607–612 (2022).

Ivanovic, V. et al. Factors associated with neuroradiology diagnostic errors at a large tertiary-care academic medical center: a case-control study. Am. J. Roentgenol. 221, 355–362 (2023).

O’Neill, T. J. et al. Active reprioritization of the reading worklist using artificial intelligence has a beneficial effect on the turnaround time for interpretation of head CT with intracranial hemorrhage. Radiol. Artif. Intell. 3, e200024 (2021).

Shin, H. J., Han, K., Ryu, L. & Kim, E.-K. The impact of artificial intelligence on the reading times of radiologists for chest radiographs. npj Digit. Med. 6, 82 (2023).

Alexander, R. et al. Mandating limits on workload, duty, and speed in radiology. Radiology 304, 274–282 (2022).

DeBenedectis, C. M. et al. Health care disparities in radiology—a review of the current literature. J. Am. Coll. Radiol. 19, 101–111 (2022).

Gauriau, R. et al. A deep learning-based model for detecting abnormalities on brain MR images for triaging: preliminary results from a multisite experience. Radiol. Artif. Intell. 3, e200184 (2021).

Barbano, C. A., Brunello, M., Dufumier, B. & Grangetto, M. Anatomical foundation models for brain MRIs. Pattern Recognit. Lett. 199, 178–184 (2026).

OpenAI. GPT-4 technical report. Preprint at https://arxiv.org/pdf/2303.08774 (2023).

Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. High-resolution image synthesis with latent diffusion models. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 10684–10695 (2022).

Dosovitskiy, A. et al. An image is worth 16 × 16 words: transformers for image recognition at scale. In 9th International Conference on Learning Representations (OpenReview.net, 2021).

Darcet, T., Oquab, M., Mairal, J. & Bojanowski, P. Vision transformers need registers. In The Twelfth International Conference on Learning Representations (eds Kim, B. et al.) 2632–2652 (2024).

Zhang, K. et al. Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of COVID-19 pneumonia using computed tomography. Cell 181, 1423–1433.e11 (2020).

Tiu, E. et al. Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat. Biomed. Eng. 6, 1399–1406 (2022).

Bannur, S. et al. Learning to exploit temporal structure for biomedical vision-language processing. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 15016–15027 (2023).

Wang, Y. et al. Enhancing vision-language models for medical imaging: bridging the 3D gap with innovative slice selection. Neural Inf. Process. Syst. 37, 99947–99964 (2024).

Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).

Radford, A. et al. Language models are unsupervised multitask learners. OpenAI blog 1, 9 (2019).

Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. In Proc. 37th International Conference on Neural Information Processing Systems 34892–34916 (2023).

Eslami, S., Meinel, C. & De Melo, G. PubMedCLIP: how much does CLIP benefit visual question answering in the medical domain? In Findings of the Association for Computational Linguistics: EACL 2023 (eds Vlachos, A. & Augenstein, I.) 1151–1163 (ACL, 2023).

Zhang, S. et al. BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs. Preprint at https://arxiv.org/abs/2303.00915 (2023).

Moor, M. et al. Med-Flamingo: a multimodal medical few-shot learner. In Proc. 3rd Machine Learning for Health Symposium (eds Hegselmann, S. et al.) 353–367 (PMLR, 2023).

Kaplan, J. et al. Scaling laws for neural language models. Preprint at https://arxiv.org/abs/2001.08361 (2020).

Guo, C., Pleiss, G., Sun, Y. & Weinberger, K. Q. On calibration of modern neural networks. In Proc. 34th International Conference on Machine Learning 1321–1330 (PMLR, 2017).

Di Martino, A. et al. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 19, 659–667 (2014).

Petersen, R. C. et al. Alzheimer’s disease neuroimaging initiative (ADNI) clinical characterization. Neurology 74, 201–209 (2010).

Marcus, D. S. et al. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. 19, 1498–1507 (2007).

Lee, J. et al. Deep learning-based brain age prediction in normal aging and dementia. Nat. Aging 2, 412–424 (2022).

Bashyam, V. M. et al. MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain 143, 2312–2324 (2020).

Baid, U. et al. The RSNA-ASNR-MICCAI BraTS 2021 benchmark on brain tumor segmentation and radiogenomic classification. Preprint at https://arxiv.org/abs/2107.02314 (2021).

Rudie, J. D. et al. The University of California San Francisco Brain Metastases Stereotactic Radiosurgery (UCSF-BMSR) MRI dataset. Radiol. Artif. Intell. 6, e230126 (2024).

Oermann, E. et al. Longitudinal deep neural networks for assessing metastatic brain cancer on a massive open benchmark. Nat. Commun. 15, 8170 (2024).

Liu, C.-F. et al. A large public dataset of annotated clinical MRIs and metadata of patients with acute stroke. Sci. Data 10, 548 (2023).

Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25, 1337–1340 (2019).

Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why should I trust you?’ Explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Krishnapuram, B. et al.) 1135–1144 (2016).

Smith, J. S. et al. Role of extent of resection in the long-term outcome of low-grade hemispheric gliomas. J. Clin. Oncol. 26, 1338–1345 (2008).

Waite, S., Scott, J. & Colombo, D. Narrowing the gap: imaging disparities in radiology. Radiology 299, 27–35 (2021).

Barocas, S., Hardt, M. & Narayanan, A. Fairness and Machine Learning: Limitations and Opportunities (MIT Press, 2023).

Rajpurkar, P. & Topol, E. J. A clinical certification pathway for generalist medical AI systems. Lancet 405, 20 (2025).

Ivanovic, V. et al. Impact of shift volume on neuroradiology diagnostic errors at a large tertiary academic center. Acad. Radiol. 30, 1584–1588 (2023).

Babiarz, L. S. & Yousem, D. M. Quality control in neuroradiology: discrepancies in image interpretation among academic neuroradiologists. AJNR Am. J. Neuroradiol. 33, 37–42 (2012).

Wu, M. Z., McInnes, M. D. F., Macdonald, D. B., Kielar, A. Z. & Duigenan, S. CT in adults: systematic review and meta-analysis of interpretation discrepancy rates. Radiology 270, 717–735 (2014).

Azizi, S. et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat. Biomed. Eng. 7, 756–779 (2023).

Moor, M. et al. Med-Flamingo: a multimodal medical few-shot learner. In Machine Learning for Health (ML4H) 353–367 (PMLR, 2023).

Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).

Blankemeier, L. et al. Merlin: a vision language foundation model for 3D computed tomography. Preprint at https://www.researchsquare.com/article/rs-4546309/v1 (2024).

Elliott, L. T. et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018).

Kickingereder, P. et al. Automated quantitative tumour response assessment of MRI in neuro-oncology with artificial neural networks: a multicentre, retrospective study. Lancet Oncol. 20, 728–740 (2019).

Wood, D. A. et al. A self-supervised text-vision framework for automated brain abnormality detection. Preprint at https://arxiv.org/abs/2405.02782 (2024).

Ghosh, S., Poynton, C. B., Visweswaran, S. & Batmanghelich, K. Mammo-CLIP: a vision language foundation model to enhance data efficiency and robustness in mammography. In Proc. International Conference on Medical Image Computing and Computer-assisted Intervention 632–642 (Springer, 2024).

van den Oord, A., Vinyals, O. & Kavukcuoglu, K. Neural discrete representation learning. In Advances in Neural Information Processing Systems, Vol. 30 (eds Guyon, I. et al.) (2017).

Ramesh, A., Dhariwal, P., Nichol, A., Chu, C. & Chen, M. Hierarchical text-conditional image generation with CLIP latents. Preprint at https://arxiv.org/pdf/2204.06125 (2022).

Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).

Chien, A. et al. AI-assisted summarization of radiologic reports: evaluating GPT3davinci, BARTcnn, LongT5booksum, LEDbooksum, LEDlegal, and LEDclinical. AJNR Am. J. Neuroradiol. 45, 244–248 (2024).

Ranjit, M., Ganapathy, G., Manuel, R. & Ganu, T. Retrieval augmented chest X-ray report generation using OpenAI GPT models. In Proc. Machine Learning for Healthcare Conference (eds Deshpande, K. et al.) 650–666 (PMLR, 2023).

Adams, L. C. et al. Leveraging GPT-4 for post hoc transformation of free-text radiology reports into structured reporting: a multilingual feasibility study. Radiology 307, e230725 (2023).

Titano, J. J. et al. Automated deep-neural-network surveillance of cranial images for acute neurologic events. Nat. Med. 24, 1337–1341 (2018).

Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) Vol. 30 (Curran Associates, Inc., 2017).

Monti, S., Tamayo, P., Mesirov, J. & Golub, T. Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach. Learn. 52, 91–118 (2003).

Kondepudi, A. et al. Foundation models for fast, label-free detection of glioma infiltration. Nature 637, 439–445 (2025).

Cheng, J., Wang, Z. & Pollastri, G. A neural network approach to ordinal regression. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) 1279–1284 (2008).

Saratxaga, C. L. et al. MRI deep learning-based solution for Alzheimer’s disease prediction. J. Pers. Med. 11, 902 (2021).

Li, J., Li, D., Savarese, S. & Hoi, S. BLIP-2: bootstrapping language-image pre-training with frozen image encoders and large language models. In Proc. International Conference on Machine Learning 19730–19742 (PMLR, 2023).

Chen, Q. & Hong, Y. MedBLIP: bootstrapping language-image pre-training from 3D medical images and texts. In Proc. Asian Conference on Computer Vision (eds Cho, M. et al.) 2404–2420 (2024).

Liu, H., Li, C., Li, Y. & Lee, Y. J. Improved baselines with visual instruction tuning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 26296–26306 (2024).

Li, C. et al. LLaVA-Med: training a large language-and-vision assistant for biomedicine in one day. In Advances in Neural Information Processing Systems, Vol. 36 (eds Oh, A. et al.) 28541–28564 (Curran Associates, Inc., 2023).

Zhu, C., Wang, T., Zhang, W., Pang, J. & Liu, X. LLaVA-3D: a simple yet effective pathway to empowering LMMs with 3D-awareness. In Proc. IEEE/CVF International Conference on Computer Vision 4295–4305 (2025).

Hardt, M., Price, E. & Srebro, N. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, Vol. 29 (eds Lee, D. et al.) (2016).

Vaidya, A. et al. Demographic bias in misdiagnosis by computational pathology models. Nat. Med. 30, 1174–1190 (2024).
