{"id":70677,"date":"2025-08-15T03:47:17","date_gmt":"2025-08-15T03:47:17","guid":{"rendered":"https:\/\/www.newsbeep.com\/ca\/70677\/"},"modified":"2025-08-15T03:47:17","modified_gmt":"2025-08-15T03:47:17","slug":"olympiad-level-ai-performance-is-here-what-now","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ca\/70677\/","title":{"rendered":"Olympiad-Level AI Performance Is Here\u2014What Now?"},"content":{"rendered":"<p><a href=\"https:\/\/physics.aps.org\/authors\/paul_tschisgale\" rel=\"nofollow noopener\" target=\"_blank\">Paul Tschisgale<\/a>, Department of Physics Education, Leibniz Institute for Science and Mathematics Education, Kiel, Germany<\/p>\n<p>August 13, 2025 \u2022 Physics 18, 147<\/p>\n<p>As large language models improve, the real challenge is not how to shield education from AI, but how to embrace AI as a cornerstone of future physics learning and teaching.<\/p>\n<p><img decoding=\"async\" alt=\"Figure caption\" src=\"https:\/\/www.newsbeep.com\/ca\/wp-content\/uploads\/2025\/08\/e147_2.png\"\/><\/p>\n<p>If AI openly competed in a Physics Olympiad, it would likely win medals and disappoint its human adversaries.<\/p>\n<p>P. Tschisgale\/IPN; Figure created using the GPT-4o image generator (OpenAI)<\/p>\n<p id=\"d5e99\">Since large language models (LLMs)\u2014a prominent type of AI\u2014became widely available to the public, their burgeoning capabilities have sparked both fascination and concern across many fields. As their capabilities in physics became more apparent, I began to wonder what this development might mean for settings where individual expertise is supposed to shine. In 2024, I completed my PhD on how students engage in high-level problem solving, particularly in the context of the German Physics Olympiad, a multiround competition in which highly motivated students work on challenging physics problems beyond the standard curriculum. From this perspective, a worry emerged: The Physics Olympiad represents a setting where LLMs could be used quietly but effectively, raising difficult questions about whether the competition remains fair and whether its integrity can still be upheld.<\/p>\n<p id=\"d5e101\">If AI models could solve Olympiad-level physics problems as well as\u2014or even better than\u2014the Olympiad participants themselves, the Olympiad would no longer reward deep understanding or genuine effort. Instead, it would risk rewarding those who relied on LLMs, regardless of their own level of expertise. To better understand the scope of the potential problem, my colleagues and I set out to test just how well contemporary LLMs perform on Olympiad-level physics problems. 
In our study, we evaluated two advanced LLMs on actual problems from the German Physics Olympiad: GPT-4o, the previous default model behind ChatGPT, and o1-preview, a newer model optimized for reasoning [<a href=\"#c1\" class=\"ref-target inline-ref-target\" data-ref-target=\"c1\">1<\/a>].<\/p>\n<p id=\"d5e106\">Before conducting the study, I expected the LLMs to do reasonably well. Previous studies had already shown that LLMs could answer standard physics questions and solve problems at the high school or early university level. But I was taken aback by how well they performed on Olympiad-level problems\u2014problems designed to challenge some of the best students in the country. GPT-4o outperformed the average human participant, and the newer o1-preview model did even better.<\/p>\n<p id=\"d5e108\">If LLMs can produce high-quality solutions on a par with or better than those of top students, then any observed performance in unsupervised settings\u2014be it homework rounds of a competition, homework assignments, or online exams\u2014may be suspect. This new reality challenges the validity of many current assessment formats and forces us to reconsider not only how we measure physics expertise but also what kinds of knowledge and abilities we want students to develop in the first place. How should physics education respond?<\/p>\n<p id=\"d5e110\">One possible response might be to ban the use of AI in educational settings and enforce the ban using detection tools. But this approach is unlikely to succeed, as it would set up an ongoing arms race between increasingly sophisticated LLMs and the tools designed to detect their output. Detection methods will almost always be one step behind, making it difficult to reliably distinguish between human- and AI-generated work. 
Another conceivable approach might be to build assessments around physics problems that exploit current weaknesses of LLMs\u2014for example, problems that require interpreting diagrams. Yet this is a short-term fix at best, as such weaknesses may soon disappear. To ensure that what we evaluate still reflects students\u2019 own thinking, we may need to rely more heavily on supervised formats, such as oral exams or in-person written assessments. These formats, however, would demand significantly more resources.<\/p>\n<p id=\"d5e112\">But instead of focusing solely on mitigating the risks of AI, shouldn\u2019t we be asking another question? Why not let students use AI and focus on teaching them to do so thoughtfully and responsibly? AI is here to stay, and it will play a major role in many students\u2019 academic and professional futures. We should equip students to work with AI tools such as LLMs because the ability to use such powerful tools may soon be as important as mastering the subject matter itself.<\/p>\n<p id=\"d5e114\">As AI continues to improve, it may seem that we\u2019re entering an era in which students no longer need to memorize formulas or solve complex equations by hand\u2014because AI can do so faster, and often better. Yet this view is ultimately too simplistic. AI models still make mistakes, just like humans do. However, these mistakes are often hard to spot because the models present their responses in the polished language of experts. That\u2019s why students still need a solid foundation in physics\u2014to tell sound reasoning from superficial gloss.<\/p>\n<p id=\"d5e116\">What\u2019s needed is a shift in educational priorities. This means not only teaching physics content but also helping students develop the ability to critically evaluate solutions\u2014especially those generated by AI. In many ways, this mirrors how we already approach solving problems collaboratively. 
In such settings, students don\u2019t always complete every step alone; they question, reflect, and build on shared inputs. Interacting with an LLM should be no different. The LLM may offer suggestions, but it\u2019s the student\u2019s responsibility to judge, refine, and, if needed, challenge those suggestions.<\/p>\n<p id=\"d5e118\">That kind of human\u2013AI collaboration is something education should be working toward. In this vision, physics education remains grounded in teaching students conceptual knowledge and basic problem-solving strategies. But it places greater emphasis on critical thinking, reflective judgment, and the ability to engage productively with AI. Students still need a strong foundation in physics\u2014but the way they apply their knowledge is evolving. Rather than competing with AI, they\u2019ll collaborate with it, drawing on its strengths while compensating for its limitations. That\u2019s the future we should be teaching toward.<\/p>\n<p>References<\/p>\n<p id=\"c1\">1. P. Tschisgale et al., \u201cEvaluating GPT- and reasoning-based large language models on Physics Olympiad problems: Surpassing human performance and implications for educational assessment,\u201d <a href=\"http:\/\/dx.doi.org\/10.1103\/6fmx-bsnl\" rel=\"nofollow noopener\" target=\"_blank\">Phys. Rev. Phys. Educ. Res. 21, 020115 (2025)<\/a>.<\/p>\n<p>About the Author<\/p>\n<p><img decoding=\"async\" alt=\"Image of Paul Tschisgale\" src=\"https:\/\/www.newsbeep.com\/ca\/wp-content\/uploads\/2025\/08\/384ae3c2-a629-4026-8ddc-da902bff4789.png\" width=\"125\"\/><\/p>\n<p>Paul Tschisgale is a postdoctoral researcher at the Leibniz Institute for Science and Mathematics Education in Kiel, Germany. He earned his PhD in physics education at Kiel University, Germany, in 2024. 
His research focuses on nurturing high-ability students and on using AI to improve physics learning, with an emphasis on the assessment and development of physics problem-solving abilities.<\/p>\n","protected":false},"excerpt":{"rendered":"Paul TschisgaleDepartment of Physics Education, Leibniz Institute for Science and Mathematics Education, Kiel, Germany August 13, 2025&amp;bullet; 
Physics&hellip;\n","protected":false},"author":2,"featured_media":70678,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[49,48,314,66],"class_list":{"0":"post-70677","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-physics","8":"tag-ca","9":"tag-canada","10":"tag-physics","11":"tag-science"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/70677","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/comments?post=70677"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/70677\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media\/70678"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media?parent=70677"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/categories?post=70677"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/tags?post=70677"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}