{"id":519779,"date":"2026-03-12T20:27:23","date_gmt":"2026-03-12T20:27:23","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/519779\/"},"modified":"2026-03-12T20:27:23","modified_gmt":"2026-03-12T20:27:23","slug":"a-clinical-environment-simulator-for-dynamic-ai-evaluation","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/519779\/","title":{"rendered":"A clinical environment simulator for dynamic AI evaluation"},"content":{"rendered":"<p class=\"c-article-references__text\" id=\"ref-CR1\">Goh, E. et al. Large language model influence on diagnostic reasoning: a randomized clinical trial. JAMA Netw. Open 7, e2440969 (2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR2\">McDuff, D. et al. Towards accurate differential diagnosis with large language models. Nature 642, 451\u2013457 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR3\">Cabral, S. et al. Clinical reasoning of a generative artificial intelligence model compared with physicians. JAMA Intern. Med. 184, 581\u2013583 (2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR4\">Goh, E. et al. GPT-4 assistance for improvement of physician performance on patient care tasks: a randomized controlled trial. Nat. Med. 31, 1233\u20131238 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR5\">Tu, T. et al. Towards conversational diagnostic artificial intelligence. Nature 642, 442\u2013450 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR6\">Gao, S. et al. TxAgent: an AI agent for therapeutic reasoning across a universe of tools. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2503.10970\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2503.10970\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2503.10970<\/a> (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR7\">Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172\u2013180 (2023).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR8\">Sandmann, S. et al. Benchmark evaluation of DeepSeek large language models in clinical decision-making. Nat. Med. 31, 2546\u20132549 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR9\">Tordjman, M. et al. Comparative benchmarking of the DeepSeek large language model on medical tasks and clinical reasoning. Nat. Med. 31, 2550\u20132555 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR10\">Liu, X. et al. A generalist medical language model for disease diagnosis assistance. Nat. Med. 31, 932\u2013942 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR11\">Pal, A., Umapathi, L. K. &amp; Sankarasubbu, M. MedMCQA: a large-scale multi-subject multi-choice dataset for medical domain question answering. In Conference on Health, Inference, and Learning 248\u2013260 (PMLR, 2022).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR12\">Jin, Q., Dhingra, B., Liu, Z., Cohen, W. &amp; Lu, X. PubMedQA: a dataset for biomedical research question answering. In Proc. 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (eds Inui, K. et al.) 2567\u20132577 (ACL, 2019).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR13\">Jin, D. et al. What disease does this patient have? A large-scale open domain question answering dataset from medical exams. Appl. Sci. 11, 6421 (2021).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR14\">Schmidgall, S. et al. AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2405.07960\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2405.07960\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2405.07960<\/a> (2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR15\">Hager, P. et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat. Med. 30, 2613\u20132622 (2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR16\">Fan, Z. et al. AI Hospital: benchmarking large language models in a multi-agent medical interaction simulator. In Proc. 31st International Conference on Computational Linguistics 10183\u201310213 (ACL, 2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR17\">Li, J. et al. Agent Hospital: a simulacrum of hospital with evolvable medical agents. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2405.02957\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2405.02957\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2405.02957<\/a> (2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR18\">Bedi, S. et al. Holistic evaluation of large language models for medical tasks with MedHELM. Nat. Med. <a href=\"https:\/\/doi.org\/10.1038\/s41591-025-04151-2\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.1038\/s41591-025-04151-2\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.1038\/s41591-025-04151-2<\/a> (2026).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR19\">Zhang, S. et al. Rethinking human-AI collaboration in complex medical decision making: a case study in sepsis diagnosis. In Proc. 2024 CHI Conference on Human Factors in Computing Systems 445, 1\u201318 (ACM, 2024).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR20\">Nori, H. et al. Sequential diagnosis with language models. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2506.22405\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2506.22405\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2506.22405<\/a> (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR21\">Bedi, S., Mlauzi, I., Shin, D., Koyejo, S. &amp; Shah, N. H. The optimization paradox in clinical AI multi-agent systems. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2506.06574\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2506.06574\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2506.06574<\/a> (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR22\">Rosenthal, J. T., Beecy, A. &amp; Sabuncu, M. R. Rethinking clinical trials for medical AI with dynamic deployments of adaptive systems. NPJ Digit. Med. 8, 252 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR23\">Palepu, A. et al. Towards conversational AI for disease management. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2503.06074\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2503.06074\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2503.06074<\/a> (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR24\">Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR25\">Johnson, A. E. W. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data 10, 1 (2023).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR26\">Kansal, A., Chen, E., Jin, B. T., Rajpurkar, P. &amp; Kim, D. A. MC-MED, multimodal clinical monitoring in the emergency department. Sci. Data 12, 1094 (2025).<\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR27\">Lazic, D. A., Grujic, V. &amp; Tanaskovic, M. The role of flight simulation in flight training of pilots for crisis management. SFJD 3, 3624\u20133636 (2022).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.46932\/sfjdv3n3-046\" data-track-item_id=\"10.46932\/sfjdv3n3-046\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.46932%2Fsfjdv3n3-046\" aria-label=\"Article reference 27\" data-doi=\"10.46932\/sfjdv3n3-046\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 27\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=The%20role%20of%20flight%20simulation%20in%20flight%20training%20of%20pilots%20for%20crisis%20management&amp;journal=SFJD&amp;doi=10.46932%2Fsfjdv3n3-046&amp;volume=3&amp;pages=3624-3636&amp;publication_year=2022&amp;author=Lazic%2CDA&amp;author=Grujic%2CV&amp;author=Tanaskovic%2CM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR28\">Allerton, D. J. The impact of flight simulation in aerospace. Aeronaut. J. 
114, 747\u2013756 (2010).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1017\/S0001924000004231\" data-track-item_id=\"10.1017\/S0001924000004231\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1017%2FS0001924000004231\" aria-label=\"Article reference 28\" data-doi=\"10.1017\/S0001924000004231\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 28\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=The%20impact%20of%20flight%20simulation%20in%20aerospace&amp;journal=Aeronaut.%20J.&amp;doi=10.1017%2FS0001924000004231&amp;volume=114&amp;pages=747-756&amp;publication_year=2010&amp;author=Allerton%2CDJ\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR29\">Mahmood, F. A benchmarking crisis in biomedical machine learning. Nat. Med. 
31, 1060 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-025-03637-3\" data-track-item_id=\"10.1038\/s41591-025-03637-3\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-025-03637-3\" aria-label=\"Article reference 29\" data-doi=\"10.1038\/s41591-025-03637-3\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"cas reference\" data-track-action=\"cas reference\" href=\"https:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB2MXptVOrtr4%3D\" aria-label=\"CAS reference 29\" target=\"_blank\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=40200055\" aria-label=\"PubMed reference 29\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 29\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=A%20benchmarking%20crisis%20in%20biomedical%20machine%20learning&amp;journal=Nat.%20Med.&amp;doi=10.1038%2Fs41591-025-03637-3&amp;volume=31&amp;publication_year=2025&amp;author=Mahmood%2CF\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR30\">Silver, D. et al. Mastering the game of Go without human knowledge. 
Nature 550, 354\u2013359 (2017).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature24270\" data-track-item_id=\"10.1038\/nature24270\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature24270\" aria-label=\"Article reference 30\" data-doi=\"10.1038\/nature24270\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"cas reference\" data-track-action=\"cas reference\" href=\"https:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC2sXhs12ltLvM\" aria-label=\"CAS reference 30\" target=\"_blank\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=29052630\" aria-label=\"PubMed reference 30\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 30\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Mastering%20the%20game%20of%20Go%20without%20human%20knowledge&amp;journal=Nature&amp;doi=10.1038%2Fnature24270&amp;volume=550&amp;pages=354-359&amp;publication_year=2017&amp;author=Silver%2CD\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR31\">Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. 
Nature 529, 484\u2013489 (2016).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/nature16961\" data-track-item_id=\"10.1038\/nature16961\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fnature16961\" aria-label=\"Article reference 31\" data-doi=\"10.1038\/nature16961\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"cas reference\" data-track-action=\"cas reference\" href=\"https:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC28Xhs12is7w%3D\" aria-label=\"CAS reference 31\" target=\"_blank\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=26819042\" aria-label=\"PubMed reference 31\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 31\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Mastering%20the%20game%20of%20Go%20with%20deep%20neural%20networks%20and%20tree%20search&amp;journal=Nature&amp;doi=10.1038%2Fnature16961&amp;volume=529&amp;pages=484-489&amp;publication_year=2016&amp;author=Silver%2CD\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR32\">Page, B., Irving, D., Amalberti, R. &amp; Vincent, C. 
Health services under pressure: a scoping review and development of a taxonomy of adaptive strategies. BMJ Qual. Saf. 33, 738\u2013747 (2024).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1136\/bmjqs-2023-016686\" data-track-item_id=\"10.1136\/bmjqs-2023-016686\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1136%2Fbmjqs-2023-016686\" aria-label=\"Article reference 32\" data-doi=\"10.1136\/bmjqs-2023-016686\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=38050158\" aria-label=\"PubMed reference 32\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed central reference\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC11503202\" aria-label=\"PubMed Central reference 32\" target=\"_blank\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 32\" 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Health%20services%20under%20pressure%3A%20a%20scoping%20review%20and%20development%20of%20a%20taxonomy%20of%20adaptive%20strategies&amp;journal=BMJ%20Qual.%20Saf.&amp;doi=10.1136%2Fbmjqs-2023-016686&amp;volume=33&amp;pages=738-747&amp;publication_year=2024&amp;author=Page%2CB&amp;author=Irving%2CD&amp;author=Amalberti%2CR&amp;author=Vincent%2CC\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR33\">Morley, C., Unwin, M., Peterson, G. M., Stankovich, J. &amp; Kinsman, L. Emergency department crowding: a systematic review of causes, consequences and solutions. PLoS ONE 13, e0203316 (2018).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1371\/journal.pone.0203316\" data-track-item_id=\"10.1371\/journal.pone.0203316\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1371%2Fjournal.pone.0203316\" aria-label=\"Article reference 33\" data-doi=\"10.1371\/journal.pone.0203316\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=30161242\" aria-label=\"PubMed reference 33\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed central reference\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6117060\" aria-label=\"PubMed Central reference 33\" target=\"_blank\">PubMed 
Central<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 33\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Emergency%20department%20crowding%3A%20a%20systematic%20review%20of%20causes%2C%20consequences%20and%20solutions&amp;journal=PLoS%20ONE&amp;doi=10.1371%2Fjournal.pone.0203316&amp;volume=13&amp;publication_year=2018&amp;author=Morley%2CC&amp;author=Unwin%2CM&amp;author=Peterson%2CGM&amp;author=Stankovich%2CJ&amp;author=Kinsman%2CL\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR34\">Pines, J. M. et al. The impact of emergency department crowding measures on time to antibiotics for patients with community-acquired pneumonia. Ann. Emerg. Med. 50, 510\u2013516 (2007).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.annemergmed.2007.07.021\" data-track-item_id=\"10.1016\/j.annemergmed.2007.07.021\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.annemergmed.2007.07.021\" aria-label=\"Article reference 34\" data-doi=\"10.1016\/j.annemergmed.2007.07.021\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=17913298\" aria-label=\"PubMed reference 34\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar 
reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 34\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=The%20impact%20of%20emergency%20department%20crowding%20measures%20on%20time%20to%20antibiotics%20for%20patients%20with%20community-acquired%20pneumonia&amp;journal=Ann.%20Emerg.%20Med.&amp;doi=10.1016%2Fj.annemergmed.2007.07.021&amp;volume=50&amp;pages=510-516&amp;publication_year=2007&amp;author=Pines%2CJM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR35\">Bernstein, S. L. et al. The effect of emergency department crowding on clinically oriented outcomes. Acad. Emerg. Med. 16, 1\u201310 (2009).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1111\/j.1553-2712.2008.00295.x\" data-track-item_id=\"10.1111\/j.1553-2712.2008.00295.x\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1111%2Fj.1553-2712.2008.00295.x\" aria-label=\"Article reference 35\" data-doi=\"10.1111\/j.1553-2712.2008.00295.x\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=19007346\" aria-label=\"PubMed reference 35\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar 
reference 35\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=The%20effect%20of%20emergency%20department%20crowding%20on%20clinically%20oriented%20outcomes&amp;journal=Acad.%20Emerg.%20Med.&amp;doi=10.1111%2Fj.1553-2712.2008.00295.x&amp;volume=16&amp;pages=1-10&amp;publication_year=2009&amp;author=Bernstein%2CSL\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR36\">Emanuel, E. J. et al. Fair allocation of scarce medical resources in the time of Covid-19. N. Engl. J. Med. 382, 2049\u20132055 (2020).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1056\/NEJMsb2005114\" data-track-item_id=\"10.1056\/NEJMsb2005114\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1056%2FNEJMsb2005114\" aria-label=\"Article reference 36\" data-doi=\"10.1056\/NEJMsb2005114\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=32202722\" aria-label=\"PubMed reference 36\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 36\" 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Fair%20allocation%20of%20scarce%20medical%20resources%20in%20the%20time%20of%20Covid-19&amp;journal=N.%20Engl.%20J.%20Med.&amp;doi=10.1056%2FNEJMsb2005114&amp;volume=382&amp;pages=2049-2055&amp;publication_year=2020&amp;author=Emanuel%2CEJ\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR37\">Johri, S. et al. An evaluation framework for clinical use of large language models in patient interaction tasks. Nat. Med. 31, 77\u201386 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41591-024-03328-5\" data-track-item_id=\"10.1038\/s41591-024-03328-5\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41591-024-03328-5\" aria-label=\"Article reference 37\" data-doi=\"10.1038\/s41591-024-03328-5\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"cas reference\" data-track-action=\"cas reference\" href=\"https:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BB2MXksVKjtw%3D%3D\" aria-label=\"CAS reference 37\" target=\"_blank\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=39747685\" aria-label=\"PubMed reference 37\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" 
data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 37\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=An%20evaluation%20framework%20for%20clinical%20use%20of%20large%20language%20models%20in%20patient%20interaction%20tasks&amp;journal=Nat.%20Med.&amp;doi=10.1038%2Fs41591-024-03328-5&amp;volume=31&amp;pages=77-86&amp;publication_year=2025&amp;author=Johri%2CS\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR38\">Arora, R. K. et al. HealthBench: Evaluating large language models towards improved human health. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2505.08775\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2505.08775\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2505.08775<\/a> (2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR39\">Jiang, Y. et al. MedAgentBench: a virtual EHR environment to benchmark medical LLM agents. NEJM AI 2, 9 (2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR40\">Zhang, C. et al. API agents vs. GUI agents: divergence and convergence. In ICML 2025 Workshop on Computer Use Agents (ICML, 2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR41\">Finlayson, S. G. et al. Adversarial attacks on medical machine learning. 
Science 363, 1287\u20131289 (2019).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1126\/science.aaw4399\" data-track-item_id=\"10.1126\/science.aaw4399\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1126%2Fscience.aaw4399\" aria-label=\"Article reference 41\" data-doi=\"10.1126\/science.aaw4399\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"cas reference\" data-track-action=\"cas reference\" href=\"https:\/\/www.nature.com\/articles\/cas-redirect\/1:CAS:528:DC%2BC1MXhtFCisr7M\" aria-label=\"CAS reference 41\" target=\"_blank\">CAS<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=30898923\" aria-label=\"PubMed reference 41\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed central reference\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7657648\" aria-label=\"PubMed Central reference 41\" target=\"_blank\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 41\" 
href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Adversarial%20attacks%20on%20medical%20machine%20learning&amp;journal=Science&amp;doi=10.1126%2Fscience.aaw4399&amp;volume=363&amp;pages=1287-1289&amp;publication_year=2019&amp;author=Finlayson%2CSG\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR42\">Javed, H., El-Sappagh, S. &amp; Abuhmed, T. Robustness in deep learning models for medical diagnostics: security and adversarial challenges towards robust AI applications. Artif. Intell. Rev. 58, 12 (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR43\">Kumar, A. et al. OrderRex clinical user testing: a randomized trial of recommender system decision support on simulated cases. J. Am. Med. Inform. Assoc. 27, 1850\u20131859 (2020).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1093\/jamia\/ocaa190\" data-track-item_id=\"10.1093\/jamia\/ocaa190\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1093%2Fjamia%2Focaa190\" aria-label=\"Article reference 43\" data-doi=\"10.1093\/jamia\/ocaa190\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=33106874\" aria-label=\"PubMed reference 43\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed central reference\" data-track-action=\"pubmed central reference\" 
href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7727352\" aria-label=\"PubMed Central reference 43\" target=\"_blank\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 43\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=OrderRex%20clinical%20user%20testing%3A%20a%20randomized%20trial%20of%20recommender%20system%20decision%20support%20on%20simulated%20cases&amp;journal=J.%20Am.%20Med.%20Inform.%20Assoc.&amp;doi=10.1093%2Fjamia%2Focaa190&amp;volume=27&amp;pages=1850-1859&amp;publication_year=2020&amp;author=Kumar%2CA\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR44\">Elendu, C. et al. The impact of simulation-based training in medical education: a review. 
Medicine 103, e38813 (2024).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1097\/MD.0000000000038813\" data-track-item_id=\"10.1097\/MD.0000000000038813\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1097%2FMD.0000000000038813\" aria-label=\"Article reference 44\" data-doi=\"10.1097\/MD.0000000000038813\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=38968472\" aria-label=\"PubMed reference 44\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed central reference\" data-track-action=\"pubmed central reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC11224887\" aria-label=\"PubMed Central reference 44\" target=\"_blank\">PubMed Central<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 44\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=The%20impact%20of%20simulation-based%20training%20in%20medical%20education%3A%20a%20review&amp;journal=Medicine&amp;doi=10.1097%2FMD.0000000000038813&amp;volume=103&amp;publication_year=2024&amp;author=Elendu%2CC\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR45\">Sinsky, C. et al. 
Allocation of physician time in ambulatory practice: a time and motion study in 4 specialties. Ann. Intern. Med. 165, 753\u2013760 (2016).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.7326\/M16-0961\" data-track-item_id=\"10.7326\/M16-0961\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7326%2FM16-0961\" aria-label=\"Article reference 45\" data-doi=\"10.7326\/M16-0961\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"link\" data-track-item_id=\"link\" data-track-value=\"pubmed reference\" data-track-action=\"pubmed reference\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=27595430\" aria-label=\"PubMed reference 45\" target=\"_blank\">PubMed<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 45\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Allocation%20of%20physician%20time%20in%20ambulatory%20practice%3A%20a%20time%20and%20motion%20study%20in%204%20specialties&amp;journal=Ann.%20Intern.%20Med.&amp;doi=10.7326%2FM16-0961&amp;volume=165&amp;pages=753-760&amp;publication_year=2016&amp;author=Sinsky%2CC\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR46\">Tierney, A. A. et al. Ambient artificial intelligence scribes: Learnings after 1 year and over 2.5 million uses. NEJM Catal. Innov. Care Deliv. 
<a href=\"https:\/\/doi.org\/10.1056\/CAT.25.0040\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.1056\/CAT.25.0040\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.1056\/CAT.25.0040<\/a> (2025).<\/p>\n","protected":false},"excerpt":{"rendered":"Goh, E. et al. Large language model influence on diagnostic reasoning: a randomized clinical trial. JAMA Netw. Open&hellip;\n","protected":false},"author":2,"featured_media":519780,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[59],"tags":[258,8869,8006,257,97,252,253,8871,8870,8872,8873],"class_list":{"0":"post-519779","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-health-care","8":"tag-biomedicine","9":"tag-cancer-research","10":"tag-diseases","11":"tag-general","12":"tag-health","13":"tag-health-care","14":"tag-healthcare","15":"tag-infectious-diseases","16":"tag-metabolic-diseases","17":"tag-molecular-medicine","18":"tag-neurosciences"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/519779","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=519779"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/519779\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/519780"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=519779"}],"wp:term":[{"taxonomy":"category","
embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=519779"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=519779"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}