{"id":443242,"date":"2026-02-24T16:22:09","date_gmt":"2026-02-24T16:22:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/uk\/443242\/"},"modified":"2026-02-24T16:22:09","modified_gmt":"2026-02-24T16:22:09","slug":"a-large-scale-randomized-study-of-large-language-model-feedback-in-peer-review","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/uk\/443242\/","title":{"rendered":"A large-scale randomized study of large language model feedback in peer review"},"content":{"rendered":"<p class=\"c-article-references__text\" id=\"ref-CR1\">Alberts, B., Hanson, B. &amp; Kelner, K. L. Editorial: reviewing peer review. Science 321, 15\u201315 (2008).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1126\/science.1162115\" data-track-item_id=\"10.1126\/science.1162115\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1126%2Fscience.1162115\" aria-label=\"Article reference 1\" data-doi=\"10.1126\/science.1162115\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 1\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Editorial%3A%20reviewing%20peer%20review&amp;journal=Science&amp;doi=10.1126%2Fscience.1162115&amp;volume=321&amp;pages=15-15&amp;publication_year=2008&amp;author=Alberts%2CB&amp;author=Hanson%2CB&amp;author=Kelner%2CKL\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR2\">Kelly, J., Sadeghieh, T. &amp; Adeli, K. Peer review in scientific publications: benefits, critiques, &amp; a survival guide. 
EJIFCC 25, 227 (2014).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 2\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Peer%20review%20in%20scientific%20publications%3A%20benefits%2C%20critiques%2C%20%26%20a%20survival%20guide&amp;journal=EJIFCC&amp;volume=25&amp;publication_year=2014&amp;author=Kelly%2CJ&amp;author=Sadeghieh%2CT&amp;author=Adeli%2CK\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR3\">Publons Global State of Peer Review 2018 (Clarivate Analytics, 2018).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR4\">Azad, A. &amp; Banu, A. Publication trends in artificial intelligence conferences: the rise of super prolific authors. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2412.07793\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2412.07793\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2412.07793<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR5\">McCook, A. Is peer review broken? Submissions are up, reviewers are overtaxed, and authors are lodging complaint after complaint about the process at top-tier journals. What\u2019s wrong with peer review? The Scientist (1 February 2006).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR6\">Thakkar, N. et al. Assisting ICLR 2025 reviewers with feedback. Blog post by the ICLR 2025 Review Feedback Agent Team and Program Chairs. 
ICLR Blog <a href=\"https:\/\/blog.iclr.cc\/2024\/10\/09\/iclr2025-assisting-reviewers\/\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"https:\/\/blog.iclr.cc\/2024\/10\/09\/iclr2025-assisting-reviewers\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/blog.iclr.cc\/2024\/10\/09\/iclr2025-assisting-reviewers\/<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR7\">Rogers, A. &amp; Augenstein, I. What can we do to improve peer review in NLP? In Findings of the Association for Computational Linguistics: EMNLP 2020 (eds Cohn, T., He, Y. &amp; Liu, Y.) 1256\u20131262 (ACL, 2020).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR8\">Rogers, A., Karpinska, M., Boyd-Graber, J. &amp; Okazaki, N. Program chairs\u2019 report on peer review at ACL 2023. In Proc. 61st Annual Meeting of the Association for Computational Linguistics Vol. 1, xl\u2013lxxv (ACL, 2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR9\">Arns, M. Open access is tiring out peer reviewers. 
Nature 515, 467 (2014).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/515467a\" data-track-item_id=\"10.1038\/515467a\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2F515467a\" aria-label=\"Article reference 9\" data-doi=\"10.1038\/515467a\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 9\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Open%20access%20is%20tiring%20out%20peer%20reviewers&amp;journal=Nature&amp;doi=10.1038%2F515467a&amp;volume=515&amp;publication_year=2014&amp;author=Arns%2CM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR10\">Cortes, C. &amp; Lawrence, N. D. Inconsistency in conference peer review: revisiting the 2014 NeurIPS experiment. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2109.09774\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2109.09774\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2109.09774<\/a> (2021).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR11\">Claude 3.5 Sonnet (Anthropic, 2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR12\">Liang, W. et al. Can large language models provide useful feedback on research papers? a large-scale empirical analysis. 
NEJM AI 1, AIoa2400196 (2024).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1056\/AIoa2400196\" data-track-item_id=\"10.1056\/AIoa2400196\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1056%2FAIoa2400196\" aria-label=\"Article reference 12\" data-doi=\"10.1056\/AIoa2400196\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 12\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Can%20large%20language%20models%20provide%20useful%20feedback%20on%20research%20papers%3F%20a%20large-scale%20empirical%20analysis&amp;journal=NEJM%20AI&amp;doi=10.1056%2FAIoa2400196&amp;volume=1&amp;publication_year=2024&amp;author=Liang%2CW\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR13\">Yuksekgonul, M. et al. Optimizing generative AI by backpropagating language model feedback. 
Nature 639, 609\u2013616 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1038\/s41586-025-08661-4\" data-track-item_id=\"10.1038\/s41586-025-08661-4\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1038%2Fs41586-025-08661-4\" aria-label=\"Article reference 13\" data-doi=\"10.1038\/s41586-025-08661-4\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 13\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Optimizing%20generative%20AI%20by%20backpropagating%20language%20model%20feedback&amp;journal=Nature&amp;doi=10.1038%2Fs41586-025-08661-4&amp;volume=639&amp;pages=609-616&amp;publication_year=2025&amp;author=Yuksekgonul%2CM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR14\">Madaan, A. et al. Self-refine: iterative refinement with self-feedback. Adv. Neural Inf. Process. Syst. 
36, 46534\u201346594 (2023).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 14\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Self-refi%20ne%3A%20iterative%20refi%20nement%20with%20self-feedback&amp;journal=Adv.%20Neural%20Inf.%20Process.%20Syst.&amp;volume=36&amp;pages=46534-46594&amp;publication_year=2023&amp;author=Madaan%2CA\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR15\">Hosseini, M. &amp; Horbach, S. P. J. M. Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other large language models in scholarly peer review. Res. Integr. Peer Rev. 8, 4 (2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR16\">Liang, W. et al. Monitoring AI-modified content at scale: a case study on the impact of ChatGPT on AI conference peer reviews. In Proc. 41st International Conference on Machine Learning 29575\u201329620 (ICML, 2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR17\">Zhang, Y. et al. Siren\u2019s song in the AI ocean: a survey on hallucination in large language models. 
Computational Linguistics 51, 1373\u20131418 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1162\/COLI.a.16\" data-track-item_id=\"10.1162\/COLI.a.16\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1162%2FCOLI.a.16\" aria-label=\"Article reference 17\" data-doi=\"10.1162\/COLI.a.16\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 17\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Siren%E2%80%99s%20song%20in%20the%20AI%20ocean%3A%20a%20survey%20on%20hallucination%20in%20large%20language%20models&amp;journal=Computational%20Linguistics&amp;doi=10.1162%2FCOLI.a.16&amp;volume=51&amp;pages=1373-1418&amp;publication_year=2025&amp;author=Zhang%2CY\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR18\">Zhou, J. et al. Instruction-following evaluation for large language models. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2311.07911\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2311.07911\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2311.07911<\/a> (2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR19\">Liu, R. &amp; Shah, N. B. ReviewerGPT? an exploratory study on using large language models for paper reviewing. 
Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2306.00622\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2306.00622\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2306.00622<\/a> (2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR20\">Biswas, S., Dobaria, D. &amp; Cohen, H. L. ChatGPT and the future of journal reviews: a feasibility study. Yale J. Biol. Med. 96, 415\u2013420 (2023).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.59249\/SKDH9286\" data-track-item_id=\"10.59249\/SKDH9286\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.59249%2FSKDH9286\" aria-label=\"Article reference 20\" data-doi=\"10.59249\/SKDH9286\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 20\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=ChatGPTand%20the%20future%20of%20journal%20reviews%3A%20a%20feasibility%20study&amp;journal=Yale%20J.%20Biol.%20Med.&amp;doi=10.59249%2FSKDH9286&amp;volume=96&amp;pages=415-420&amp;publication_year=2023&amp;author=Biswas%2CS&amp;author=Dobaria%2CD&amp;author=Cohen%2CHL\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR21\">Liang, W. et al. Mapping the increasing use of LLMs in scientific papers. In Proc. 1st Conference on Language Modeling (COLM) (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR22\">Shah, N. B. 
Challenges, experiments, and computational solutions in peer review. Commun. ACM 65, 76\u201387 (2022).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1145\/3528086\" data-track-item_id=\"10.1145\/3528086\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1145%2F3528086\" aria-label=\"Article reference 22\" data-doi=\"10.1145\/3528086\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 22\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Challenges%2C%20experiments%2C%20and%20computational%20solutions%20in%20peer%20review&amp;journal=Commun.%20ACM&amp;doi=10.1145%2F3528086&amp;volume=65&amp;pages=76-87&amp;publication_year=2022&amp;author=Shah%2CNB\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR23\">Price, S. &amp; Flach, P. A. Computational support for academic peer review: a perspective from artificial intelligence. Commun. 
ACM 60, 70\u201379 (2017).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1145\/2979672\" data-track-item_id=\"10.1145\/2979672\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1145%2F2979672\" aria-label=\"Article reference 23\" data-doi=\"10.1145\/2979672\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 23\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Computational%20support%20for%20academic%20peer%20review%3A%20a%20perspective%20from%20artificial%20intelligence&amp;journal=Commun.%20ACM&amp;doi=10.1145%2F2979672&amp;volume=60&amp;pages=70-79&amp;publication_year=2017&amp;author=Price%2CS&amp;author=Flach%2CPA\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR24\">Kankanhalli, A. Peer review in the age of generative AI. J. Assoc. Inf. Syst. 25, 76\u201384 (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR25\">Kuznetsov, I. et al. What can natural language processing do for peer review? Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2405.06563\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2405.06563\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2405.06563<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR26\">Leung, T. I., Taiane de Azevedo, C., Mavragani, A. &amp; Eysenbach, G. Best practices for using AI tools as an author, peer reviewer, or editor. J. Med. 
Internet Res. 25, e51584 (2023).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.2196\/51584\" data-track-item_id=\"10.2196\/51584\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.2196%2F51584\" aria-label=\"Article reference 26\" data-doi=\"10.2196\/51584\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 26\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Best%20practices%20for%20using%20AI%20tools%20as%20an%20author%2C%20peer%20reviewer%2C%20or%20editor&amp;journal=J.%20Med.%20Internet%20Res.&amp;doi=10.2196%2F51584&amp;volume=25&amp;publication_year=2023&amp;author=Leung%2CTI&amp;author=Taiane%20de%20Azevedo%2CC&amp;author=Mavragani%2CA&amp;author=Eysenbach%2CG\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR27\">Checco, A., Bracciale, L., Loreti, P., Pinfield, S. &amp; Bianchi, G. AI-assisted peer review. Humanit. Soc. Sci. Commun. 
8, 25 (2021).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1057\/s41599-020-00703-8\" data-track-item_id=\"10.1057\/s41599-020-00703-8\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1057%2Fs41599-020-00703-8\" aria-label=\"Article reference 27\" data-doi=\"10.1057\/s41599-020-00703-8\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 27\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=AI-assisted%20peer%20review&amp;journal=Humanit.%20Soc.%20Sci.%20Commun.&amp;doi=10.1057%2Fs41599-020-00703-8&amp;volume=8&amp;publication_year=2021&amp;author=Checco%2CA&amp;author=Bracciale%2CL&amp;author=Loreti%2CP&amp;author=Pinfield%2CS&amp;author=Bianchi%2CG\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR28\">Kousha, K. &amp; Thelwall, M. Artificial intelligence to support publishing and peer review: a summary and review. Learn. Publ. 
37, 4\u201312 (2024).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1002\/leap.1570\" data-track-item_id=\"10.1002\/leap.1570\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1002%2Fleap.1570\" aria-label=\"Article reference 28\" data-doi=\"10.1002\/leap.1570\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 28\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Artificial%20intelligence%20to%20support%20publishing%20and%20peer%20review%3A%20a%20summary%20and%20review&amp;journal=Learn.%20Publ.&amp;doi=10.1002%2Fleap.1570&amp;volume=37&amp;pages=4-12&amp;publication_year=2024&amp;author=Kousha%2CK&amp;author=Thelwall%2CM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR29\">Goldberg, A. et al. Usefulness of LLMs as an author checklist assistant for scientific papers: NeurIPS\u201924 experiment. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2411.03417\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2411.03417\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2411.03417<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR30\">Su, X., Wambsganss, T., Rietsche, R., Neshaei, S. P. &amp; K\u00e4ser, T. Reviewriter: AI-generated instructions for peer review writing. In Proc. 18th Workshop on Innovative Use of NLP for Building Educational Applications (eds Kochmar, E. et al.) 
57\u201371 (ACL, 2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR31\">D\u2019Arcy, M., Hope, T., Birnbaum, L. &amp; Downey, D. MARG: multi-agent review generation for scientific papers. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2401.04259\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2401.04259\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2401.04259<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR32\">GPT-4 Technical Report (OpenAI, 2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR33\">Goldberg, A. et al. Peer reviews of peer reviews: a randomized controlled trial and other experiments. PLoS ONE 20, e0320444 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1371\/journal.pone.0320444\" data-track-item_id=\"10.1371\/journal.pone.0320444\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1371%2Fjournal.pone.0320444\" aria-label=\"Article reference 33\" data-doi=\"10.1371\/journal.pone.0320444\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 33\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Peer%20reviews%20of%20peer%20reviews%3A%20a%20randomized%20controlled%20trial%20and%20other%20experiments&amp;journal=PLoS%20ONE&amp;doi=10.1371%2Fjournal.pone.0320444&amp;volume=20&amp;publication_year=2025&amp;author=Goldberg%2CA\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p 
class=\"c-article-references__text\" id=\"ref-CR34\">Kocak, B., Onur, M. R., Park, S. H., Baltzer, P. &amp; Dietzel, M. Ensuring peer review integrity in the era of large language models: a critical stocktaking of challenges, red flags, and recommendations. Eur. J. Radiol. Artif. Intell. 2, 100018 (2025).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.1016\/j.ejrai.2025.100018\" data-track-item_id=\"10.1016\/j.ejrai.2025.100018\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.1016%2Fj.ejrai.2025.100018\" aria-label=\"Article reference 34\" data-doi=\"10.1016\/j.ejrai.2025.100018\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 34\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Ensuring%20peer%20review%20integrity%20in%20the%20era%20of%20large%20language%20models%3A%20a%20critical%20stocktaking%20of%20challenges%2C%20red%20flags%2C%20and%20recommendations&amp;journal=Eur.%20J.%20Radiol.%20Artif.%20Intell.&amp;doi=10.1016%2Fj.ejrai.2025.100018&amp;volume=2&amp;publication_year=2025&amp;author=Kocak%2CB&amp;author=Onur%2CMR&amp;author=Park%2CSH&amp;author=Baltzer%2CP&amp;author=Dietzel%2CM\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR35\">Ye, R. et al. Are we there yet? Revealing the risks of utilizing large language models in scholarly peer review. 
Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2412.01708\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2412.01708\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2412.01708<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR36\">Shin, H. et al. Mind the blind spots: a focus-level evaluation framework for LLM reviews. In Proc. Conference on Empirical Methods in Natural Language Processing 35630\u201335656 (EMNLP, 2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR37\">Luo, M. et al. Benchmark on peer review toxic detection: a challenging task with a new dataset. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2502.01676\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2502.01676\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2502.01676<\/a> (2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR38\">Tamkin, A. et al. Clio: privacy-preserving insights into real-world AI use. Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2412.13678\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2412.13678\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2412.13678<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR39\">Saad-Falcon, J. et al. LMUnit: fine-grained evaluation with natural language unit tests. In Findings of the Association for Computational Linguistics 3303\u20133324 (ACL, 2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR40\">Prasad, A., Stengel-Eskin, E., Chen, J. C.-Y., Khan, Z. &amp; Bansal, M. Learning to generate unit tests for automated debugging. 
Preprint at <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2502.01619\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.48550\/arXiv.2502.01619\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.48550\/arXiv.2502.01619<\/a> (2025).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR41\">Charlin, L., Zemel, R. S. &amp; Boutilier, C. A framework for optimizing paper matching. In Proc. 27th Conference on Uncertainty in Artificial Intelligence 11, 86\u201395 (AUAI Press, 2011).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR42\">ICML 2023 Reviewer Tutorial (ICML 2023 Program Committee, 2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR43\">How to Be a Good Reviewer? Reviewer Tutorial for ICML 2022 (ICML 2022 Program Chairs, 2022).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR44\">Last Minute Reviewing Advice (ACL PC Chairs, 2017).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR45\">Valdenegro, M. LXCV @ CVPR 2021 Reviewer Mentoring Program: and How to Write Good Reviews. Presentation at LatinX in Computer Vision (LXCV) Workshop, CVPR 2021 (2021).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR46\">Rogers, A. ARR Reviewer Guidelines (Association for Computational Linguistics, 2021).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR47\">Silbiger, N. J. &amp; Stubler, A. D. Unprofessional peer reviews disproportionately harm underrepresented groups in STEM. 
PeerJ 7, e8247 (2019).<\/p>\n<p class=\"c-article-references__links u-hide-print\"><a data-track=\"click_references\" rel=\"nofollow noopener\" data-track-label=\"10.7717\/peerj.8247\" data-track-item_id=\"10.7717\/peerj.8247\" data-track-value=\"article reference\" data-track-action=\"article reference\" href=\"https:\/\/doi.org\/10.7717%2Fpeerj.8247\" aria-label=\"Article reference 47\" data-doi=\"10.7717\/peerj.8247\" target=\"_blank\">Article<\/a>\u00a0<br \/>\n    <a data-track=\"click_references\" data-track-action=\"google scholar reference\" data-track-value=\"google scholar reference\" data-track-label=\"link\" data-track-item_id=\"link\" rel=\"nofollow noopener\" aria-label=\"Google Scholar reference 47\" href=\"http:\/\/scholar.google.com\/scholar_lookup?&amp;title=Unprofessional%20peer%20reviews%20disproportionately%20harm%20underrepresented%20groups%20in%20stem&amp;journal=PeerJ&amp;doi=10.7717%2Fpeerj.8247&amp;volume=7&amp;publication_year=2019&amp;author=Silbiger%2CNJ&amp;author=Stubler%2CAD\" target=\"_blank\"><br \/>\n                    Google Scholar<\/a>\u00a0\n                <\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR48\">Fenniak, M. et al. The PyPDF library. <a href=\"https:\/\/pypi.org\/project\/pypdf\/\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"https:\/\/pypi.org\/project\/pypdf\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/pypi.org\/project\/pypdf\/<\/a> (2024).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR49\">Ribeiro, M. T. &amp; Lundberg, S. Testing Language Models (and Prompts) Like We Test Software (Medium, 2023).<\/p>\n<p class=\"c-article-references__text\" id=\"ref-CR50\">Thakkar, N. zou-group\/review_feedback_agent: first release. 
Zenodo <a href=\"https:\/\/doi.org\/10.5281\/zenodo.17903957\" data-track=\"click_references\" data-track-action=\"external reference\" data-track-value=\"external reference\" data-track-label=\"10.5281\/zenodo.17903957\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/doi.org\/10.5281\/zenodo.17903957<\/a> (2025).<\/p>\n","protected":false},"excerpt":{"rendered":"Alberts, B., Hanson, B. &amp; Kelner, K. L. Editorial: reviewing peer review. Science 321, 15\u201315 (2008). Article\u00a0 Google&hellip;\n","protected":false},"author":2,"featured_media":443243,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[6112,4679,59,3250,90,8360,56,54,55],"class_list":{"0":"post-443242","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-science","8":"tag-computer-science","9":"tag-engineering","10":"tag-gb","11":"tag-general","12":"tag-science","13":"tag-scientific-community","14":"tag-uk","15":"tag-united-kingdom","16":"tag-unitedkingdom"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/443242","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/comments?post=443242"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/443242\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media\/443243"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media?parent=443242"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/
uk\/wp-json\/wp\/v2\/categories?post=443242"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/tags?post=443242"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}