Guide Labs debuts a new kind of interpretable LLM

The challenge of wrangling a deep learning model is often understanding why it does what it does: whether it's xAI's repeated struggles to fine-tune Grok's odd politics, ChatGPT's bouts of sycophancy, or run-of-the-mill hallucinations, plumbing through a neural network with billions of parameters isn't easy.

Guide Labs, a San Francisco startup founded by CEO Julius Adebayo and chief science officer Aya Abdelsalam Ismail, is offering an answer to that problem. On Monday, the company open-sourced an 8-billion-parameter LLM, Steerling-8B (https://github.com/guidelabs/steerling), trained with a new architecture designed to make its behavior easily interpretable: every token the model produces can be traced back to its origins in the training data.

That can be as simple as identifying the reference material behind a fact the model cites, or as complex as probing how the model represents humor or gender.

"If I have a trillion ways to encode gender, and I encode it in 1 billion of the 1 trillion things that I have, you have to make sure you find all those 1 billion things that I've encoded, and then you have to be able to reliably turn that on, turn them off," Adebayo told TechCrunch. "You can do it with current models, but it's very fragile … It's sort of one of the holy grail questions."

Adebayo began this work while earning his PhD at MIT, co-authoring a widely cited 2018 paper (https://arxiv.org/abs/1810.03292) showing that existing methods for understanding deep learning models were unreliable. That work ultimately led to a new way of building LLMs: developers insert a concept layer into the model that buckets data into traceable categories. The approach requires more up-front data annotation, but by using other AI models to help with labeling, the team was able to train this model as its largest proof of concept yet.

"The kind of interpretability people do is … neuroscience on a model, and we flip that," Adebayo said. "What we do is actually engineer the model from the ground up so that you don't need to do neuroscience."

[Image: Guide Labs architecture diagram. Image Credits: Guide Labs]

One concern with this approach is that it might suppress some of the emergent behaviors that make LLMs so intriguing: their ability to generalize to things they haven't been trained on. Adebayo says that still happens in his company's model: his team tracks what it calls "discovered concepts," such as quantum computing, which the model picked up on its own.

Adebayo argues that interpretable architectures will become something everyone needs. For consumer-facing LLMs, these techniques should let model builders block the use of copyrighted material, or better control outputs on subjects like violence or drug abuse. Regulated industries such as finance will require more controllable LLMs: a model evaluating loan applicants needs to consider financial records, but not race. There is also a need for interpretability in scientific work, another area where Guide Labs has developed technology. Protein folding has been a major success for deep learning models, but scientists need more insight into why their software surfaced promising candidates.

"This model demonstrates that training interpretable models is no longer a sort of science; it's now an engineering problem," Adebayo said. "We figured out the science and we can scale them, and there is no reason why this kind of model wouldn't match the performance of the frontier-level models," which have many more parameters.

Guide Labs says Steerling-8B achieves 90% of the capability of existing models while using less training data, thanks to its novel architecture. The next step for the company, which emerged from Y Combinator and raised a $9 million seed round from Initialized Capital in November 2024, is to build a larger model and begin offering API and agentic access to users.

"The way we're currently training models is super primitive, and so democratizing inherent interpretability is actually going to be a long-term good thing for our role within the human race," Adebayo told TechCrunch. "As we're going after these models that are going to be super intelligent, you don't want something making decisions on your behalf that's sort of mysterious to you."
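Guide Labs has not published the internals of its concept layer, but the general idea it describes — forcing every prediction through a small set of human-named concepts that can be inspected and switched off — resembles the concept-bottleneck designs in the research literature. The sketch below is a toy illustration of that pattern under stated assumptions: the `ConceptBottleneck` class, the concept names, and all dimensions are invented for this example and are not Steerling-8B's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concept inventory; Steerling-8B's real concepts are not public.
CONCEPTS = ["finance", "medicine", "humor", "quantum computing"]

class ConceptBottleneck:
    """Toy concept layer: the only path from hidden state to output runs
    through named concept activations, so outputs can be attributed to
    concepts and individual concepts can be suppressed ("steered")."""

    def __init__(self, d_hidden: int, n_concepts: int, n_out: int):
        self.W_c = rng.normal(size=(d_hidden, n_concepts)) / np.sqrt(d_hidden)
        self.W_o = rng.normal(size=(n_concepts, n_out)) / np.sqrt(n_concepts)

    def forward(self, h: np.ndarray):
        # Sigmoid squashes each concept to a (0, 1) "how active" score.
        c = 1.0 / (1.0 + np.exp(-h @ self.W_c))
        return c, c @ self.W_o

    def attribute(self, h: np.ndarray):
        # Rank concepts by activation: a crude answer to "why this output?"
        c, _ = self.forward(h)
        order = np.argsort(-c)
        return [(CONCEPTS[i], float(c[i])) for i in order]

    def steer(self, h: np.ndarray, concept: str):
        # Zero one concept's activation and recompute the output.
        c, _ = self.forward(h)
        c = c.copy()
        c[CONCEPTS.index(concept)] = 0.0
        return c @ self.W_o

layer = ConceptBottleneck(d_hidden=16, n_concepts=len(CONCEPTS), n_out=8)
h = rng.normal(size=16)          # stand-in for an LLM hidden state
print(layer.attribute(h))        # concepts ranked by activation
print(layer.steer(h, "humor"))   # output with the "humor" concept turned off
```

Because the output is a linear function of the concept vector here, turning a concept off removes exactly that concept's contribution; in a real model the trade-off is that this bottleneck constrains capacity, which is consistent with the article's note that the approach needs more up-front annotation and careful scaling.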