{"id":182396,"date":"2025-12-13T10:19:12","date_gmt":"2025-12-13T10:19:12","guid":{"rendered":"https:\/\/www.newsbeep.com\/il\/182396\/"},"modified":"2025-12-13T10:19:12","modified_gmt":"2025-12-13T10:19:12","slug":"google-researchers-find-the-best-ai-model-is-69-right","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/il\/182396\/","title":{"rendered":"Google Researchers Find the Best AI Model Is 69% Right"},"content":{"rendered":"<p>2025-12-12T21:26:30.691Z<\/p>\n<p>We just got a sobering picture of how often AI models get their facts straight. 
This week, Google <a target=\"_self\" class=\"\" href=\"https:\/\/www.businessinsider.com\/google-deepmind-cracks-century-old-physics-mystery-ai-fluid-dynamics-2025-11\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">DeepMind<\/a> introduced the <a target=\"_blank\" href=\"https:\/\/storage.googleapis.com\/deepmind-media\/FACTS\/FACTS_benchmark_suite_paper.pdf\" data-track-click=\"{&quot;click_type&quot;:&quot;other&quot;,&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;outbound_click&quot;}\" rel=\" nofollow noopener\">FACTS Benchmark Suite<\/a>, which measures how reliably AI models produce factually accurate answers.<\/p>\n<p>It tests models in four areas: answering factoid questions from internal knowledge, using web search effectively, grounding responses in long documents, and interpreting images. 
The best model, <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/sundar-pichai-cheeseburger-google-comeback-2025-11\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">Google<\/a>&#8216;s <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/fun-video-games-google-gemini-3-ai-model-2025-11\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">Gemini 3<\/a> Pro, reached 69% accuracy, with other leading models falling well below that.<\/p>\n<p>For context, if any of the reporters I manage filed stories that were 69% accurate, I would fire them.<\/p>\n<p>Beyond journalism, this number should matter to <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/ai-tools-used-most-by-companies-chatgpt-claude-copilot-gemini-2025-11\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">businesses betting on AI<\/a>. While models excel at speed and fluency, their factual reliability still lags far behind human expectations, especially in tasks involving niche knowledge, complex reasoning, or precise grounding in source material.<\/p>\n<p>Even small factual errors can have outsized consequences in sectors such as finance, healthcare, and the law. 
This week, my talented colleague <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/author\/melia-russell\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">Melia Russell<\/a> looked at how law firms are handling the <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/lawyers-legal-tech-companies-fight-ai-chatgpt-hallucinations-2025-12\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">rise of AI models as a source of legal truth<\/a>. It&#8217;s messy: She recounts how one firm fired an employee because they filed a document riddled with fake cases after using <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/chatgpt-economists-secret-prediction-game-openai-2025-12\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">ChatGPT<\/a> to draft it.<\/p>\n<p>The FACTS benchmark is a warning but also a roadmap: by quantifying where and how models fail, Google hopes to accelerate progress. 
But for now, the takeaway is clear: AI is getting better, but it&#8217;s still wrong about one-third of the time.<\/p>\n<p>Sign up for BI&#8217;s Tech Memo newsletter <a target=\"_self\" href=\"https:\/\/www.businessinsider.com\/subscription\/newsletter\/tech-memo\" data-track-click=\"{&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;tout_click&quot;,&quot;index&quot;:&quot;bi_value_unassigned&quot;,&quot;product_field&quot;:&quot;bi_value_unassigned&quot;}\" rel=\"nofollow noopener\">here<\/a>. Reach out to me via email at <a target=\"_blank\" href=\"mailto:abarr@businessinsider.com\" data-track-click=\"{&quot;click_type&quot;:&quot;other&quot;,&quot;element_name&quot;:&quot;body_link&quot;,&quot;event&quot;:&quot;outbound_click&quot;}\" rel=\" nofollow noopener\">abarr@businessinsider.com<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"2025-12-12T21:26:30.691Z Share Facebook Email X LinkedIn Reddit Bluesky WhatsApp Copy link lightning bolt icon An icon in 
the&hellip;\n","protected":false},"author":2,"featured_media":182397,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[345,343,344,85,46,125],"class_list":{"0":"post-182396","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-il","12":"tag-israel","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/182396","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/comments?post=182396"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/182396\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media\/182397"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media?parent=182396"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/categories?post=182396"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/tags?post=182396"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}