{"id":481827,"date":"2026-03-18T07:53:11","date_gmt":"2026-03-18T07:53:11","guid":{"rendered":"https:\/\/www.newsbeep.com\/uk\/481827\/"},"modified":"2026-03-18T07:53:11","modified_gmt":"2026-03-18T07:53:11","slug":"mistrals-new-agent-proofs-your-code-on-the-cheap-the-register","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/uk\/481827\/","title":{"rendered":"Mistral&#8217;s new agent proofs your code on the cheap \u2022 The Register"},"content":{"rendered":"<p>Your AI may need AI to oversee its work. Gallic AI biz Mistral is leaning into making AI code generation more reliable with Leanstral, a coding agent for proofs constructed using the open source <a href=\"https:\/\/lean-lang.org\" rel=\"nofollow noopener\" target=\"_blank\">Lean<\/a> programming language.<\/p>\n<p>Formal code verification, Mistral <a href=\"https:\/\/mistral.ai\/news\/leanstral\" rel=\"nofollow noopener\" target=\"_blank\">argues<\/a>, reduces the need for human code review, a potentially time-consuming process. Proofs, tests, linting, and specifications can help ground AI code agents in reality so that they produce better output.<\/p>\n<p>Leanstral has been released with open weights (Apache 2.0) as an agent mode within <a href=\"https:\/\/mistral.ai\/products\/vibe\" rel=\"nofollow noopener\" target=\"_blank\">Mistral Vibe<\/a>, and via a free API endpoint. 
It is accompanied by results from an as-yet-unreleased benchmark test called FLTEval, designed to evaluate how AI models handle engineering proofs.<\/p>\n<p>According to Mistral, Leanstral-120B-A6B outperforms larger (more parameters) open source rivals like GLM5-744B-A40B, Kimi-K2.5-1T-32B, and Qwen3.5-397B-A17B on FLTEval.<\/p>\n<p>But perhaps more noteworthy is Leanstral&#8217;s effect on one&#8217;s bank account.<\/p>\n<p>&#8220;Leanstral serves as a high-value alternative to the Claude suite, offering competitive performance at a fraction of the price: Leanstral pass@2 reaches a score of 26.3, beating Sonnet by 2.6 points, while costing only $36 to run, compared to Sonnet&#8217;s $549,&#8221; the AI biz claims. &#8220;At pass@16, Leanstral reaches a score of 31.9, comfortably beating Sonnet by 8 points.&#8221;<\/p>\n<p>As for Claude Opus 4.6, Anthropic&#8217;s premium model at the moment, it scores higher on FLTEval than Leanstral (39.6 compared to 31.9 for pass@16). But Opus will cost $1,650, compared to $290 for Leanstral at 16 passes, or $18 for a single pass at a score of 21.9.<\/p>\n<p>As proof of Leanstral&#8217;s adept handling of test-driven development, Mistral had the coding agent tackle <a href=\"https:\/\/proofassistants.stackexchange.com\/questions\/6471\/a-strange-issue-with-a-type-alias-in-lean4\" rel=\"nofollow noopener\" target=\"_blank\">an actual question<\/a> from the Proof Assistants Stack Exchange about a bug in some Lean 4 code.<\/p>\n<p>The company reports that Leanstral successfully built the test code to reproduce the failure and then correctly spotted and fixed the flaw.<\/p>\n<p>Mistral also released <a href=\"https:\/\/mistral.ai\/news\/mistral-small-4\" rel=\"nofollow noopener\" target=\"_blank\">Mistral Small 4<\/a>, designed as an all-in-one model that can handle reasoning, coding, and instruct\/chat tasks, so you don&#8217;t have to switch between specialized models. 
\u00ae<\/p>\n","protected":false},"excerpt":{"rendered":"Your AI may need AI to oversee its work. Gallic AI biz Mistral is leaning into making AI&hellip;\n","protected":false},"author":2,"featured_media":481828,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[554,733,4308,86,56,54,55],"class_list":{"0":"post-481827","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology","12":"tag-uk","13":"tag-united-kingdom","14":"tag-unitedkingdom"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/481827","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/comments?post=481827"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/posts\/481827\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media\/481828"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/media?parent=481827"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/categories?post=481827"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/uk\/wp-json\/wp\/v2\/tags?post=481827"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}