{"id":189055,"date":"2025-09-29T04:28:13","date_gmt":"2025-09-29T04:28:13","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/189055\/"},"modified":"2025-09-29T04:28:13","modified_gmt":"2025-09-29T04:28:13","slug":"researchers-decode-how-protein-language-models-think-making-ai-more-transparent","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/189055\/","title":{"rendered":"Researchers Decode How Protein Language Models Think, Making AI More Transparent"},"content":{"rendered":"<p id=\"isPasted\">For large language models (LLMs) like ChatGPT, accuracy often means complexity. To be able to make good predictions, ChatGPT must deeply understand the concepts and features that are associated with each word\u2014but how it gets to this point is typically a black box.  <\/p>\n<p>Similarly, protein language models (PLMs), which are LLMs used by protein scientists, are dense with information. Scientists often have a hard time understanding how these models solve problems, and as a result, they struggle to judge the reliability of the models\u2019 predictions. <\/p>\n<p><img decoding=\"async\" data-id=\"51057\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/09\/bonnie-berger-headshot-m.jpg\" alt=\"Bonnie Berger poses with her left hand propped under her chin and her right hand on a countertop. She\u2019s wearing a red and white sleeveless shirt with floral patterns.\" title=\"Bonnie Berger poses with her left hand propped under her chin and her right hand on a countertop. She\u2019s wearing a red and white sleeveless shirt with floral patterns.\" loading=\"lazy\" width=\"450\" height=\"663\" class=\"fr-fil fr-dib\"\/><\/p>\n<p class=\"fr-caption\">Bonnie Berger is a mathematician and computer scientist at the Massachusetts Institute of Technology. She\u2019s interested in using large language models to study proteins.<\/p>\n<p class=\"fr-reference\">Bonnie Berger<\/p>\n<p>\u201cThese models give you an answer, but we have no idea why they give you that answer,\u201d said <a href=\"https:\/\/math.mit.edu\/directory\/profile.html?pid=20\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Bonnie Berger<\/a>, a mathematician and computer scientist at the Massachusetts Institute of Technology. Because it\u2019s difficult to assess the models\u2019 performance, \u201cpeople either put zero trust or all their trust in these protein language models,\u201d Berger said. She believes that one way to calm these qualms is to try to understand how PLMs think.<\/p>\n<p>Continue reading below&#8230;<\/p>\n<p>Recently, Berger\u2019s team applied a tool called sparse autoencoders, which are often used to make LLMs more interpretable, to PLMs.1 By making the dense information within PLMs sparser, the researchers could uncover information about a protein\u2019s family and its functions from a single sequence of amino acids. This work, published in the Proceedings of the National Academy of Sciences, may help scientists better understand how PLMs come to certain conclusions and increase researchers\u2019 trust in them.<\/p>\n<p><img decoding=\"async\" data-id=\"51058\" src=\"https:\/\/www.newsbeep.com\/us\/wp-content\/uploads\/2025\/09\/james-fraser-headshot-m.jpg\" alt=\"James Fraser poses in front of a blurred background of a laboratory bench. He\u2019s wearing a dark grey tee underneath a blue\/green checkered shirt.\" title=\"James Fraser poses in front of a blurred background of a laboratory bench. 
*James Fraser is a biophysicist at the University of California, San Francisco, who uses computational approaches to study protein conformation. He was not involved in the study. Credit: James Fraser*

"[This study] tells us a lot about what the models are picking up on," said [James Fraser](https://msg.ucsf.edu/content/james-fraser-phd), a biophysicist at the University of California, San Francisco, who was not involved in the study. "It's certainly cool to get this kind of look under the hood of what was previously kind of a black box."

Berger thought that part of people's excitement about PLMs had come from [AlphaFold](https://www.the-scientist.com/nobel-prize-in-chemistry-for-work-on-proteins-72232)'s success. But while both PLMs and AlphaFold are AI tools, they work quite differently. AlphaFold predicts protein structure by aligning many protein sequences. Models like these typically boast a high level of accuracy, but researchers must spend considerable time and resources to train them.

On the other hand, PLMs are designed to predict features of a protein, like how it interacts with other proteins, from a single sequence. PLMs learn the relationship between protein sequence and function instead of the relationship between different protein sequences. While they learn much faster, they may not be as accurate.

"When large language models that only take a single sequence came along, people thought, 'We should believe this too,'" Berger said. "But now, they're at the stage of, 'Oh my gosh, they're not always right.'" To know when PLMs are right or wrong, researchers first need to understand them.

PLMs are highly complex. Each neuron in the neural network, AI's equivalent of a brain, is assigned to more than one discrete unit of information, called a token; conversely, multiple neurons often process each token.

*Onkar Gujral is a fifth-year mathematics PhD student at the Massachusetts Institute of Technology, advised by Bonnie Berger. He was the lead author of the study. Credit: Onkar Gujral*

"You store information in clusters of neurons, so the information is very tightly compressed," said [Onkar Gujral](https://math.mit.edu/directory/profile.html?pid=2306), a graduate student in Berger's group who led the study. "Think of it as entangled information, and we need to find a way to disentangle this information."

This is where the sparse autoencoders come in. They allow information stored in the neural network to spread out among more neurons. With less tightly packed information, researchers can more easily figure out which neuron in the network associates with which feature of a protein, much like how neuroscientists try to assign specific functions to brain regions.
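As a rough picture of that idea, here is a minimal sparse autoencoder sketch in PyTorch: it expands the model's dense activations into a much wider layer and penalizes activity so that each learned feature fires rarely. The layer sizes, activation function, and L1 penalty weight are illustrative assumptions, not the settings used in the study.

```python
# Minimal sparse autoencoder sketch (illustrative assumptions, not the study's code).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # spread information across many more units
        self.decoder = nn.Linear(d_hidden, d_model)  # reconstruct the original activation

    def forward(self, x):
        features = torch.relu(self.encoder(x))       # sparse, hopefully interpretable features
        reconstruction = self.decoder(features)
        return reconstruction, features

# Assumed sizes: 320-dimensional PLM activations, 4096 sparse features.
sae = SparseAutoencoder(d_model=320, d_hidden=4096)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_weight = 1e-3  # strength of the sparsity penalty (assumption)

activations = torch.randn(64, 320)  # stand-in for a batch of PLM activations
optimizer.zero_grad()
reconstruction, features = sae(activations)
loss = nn.functional.mse_loss(reconstruction, activations) + l1_weight * features.abs().mean()
loss.backward()
optimizer.step()
```

The reconstruction term keeps the wide layer faithful to the original activations, while the sparsity penalty pushes each input to light up only a handful of features, which is what makes individual features easier to label.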
Next, the team fed the processed information to Claude, an LLM, which added annotations such as the protein's name, family, and related pathways. "By disentangling the information, we can now interpret what's going on inside the protein language model," Gujral said.

Fraser said, "This paper is among the first in a group of similar papers that came out roughly around the same time," citing several preprint publications by other groups of researchers that also used sparse autoencoders to better understand PLMs.2-4

But Berger's team didn't think that disentangling information was enough. They also wanted to follow the models' train of thought. To do this, the researchers used transcoders, a variant of sparse autoencoders that tracks how information changes from one "layer" of the neural network to another. "It might give you the model's logic of thinking—its change of thoughts—which can give you more confidence in its output," Berger said.
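A transcoder can be sketched much like the sparse autoencoder above, except that instead of reconstructing the same activations it predicts the next layer's activations through a wide sparse bottleneck, so one can follow how a representation changes between layers. Again, the dimensions and loss weights below are assumptions for illustration rather than the study's configuration.

```python
# Minimal transcoder sketch (illustrative assumptions, not the study's code).
import torch
import torch.nn as nn

class Transcoder(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, d_out: int):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)   # sparse features of layer N
        self.decoder = nn.Linear(d_hidden, d_out)  # predicted activations of layer N+1

    def forward(self, x):
        features = torch.relu(self.encoder(x))
        prediction = self.decoder(features)
        return prediction, features

transcoder = Transcoder(d_in=320, d_hidden=4096, d_out=320)
optimizer = torch.optim.Adam(transcoder.parameters(), lr=1e-3)

layer_n = torch.randn(64, 320)         # stand-in activations from one layer
layer_n_plus_1 = torch.randn(64, 320)  # stand-in activations from the next layer

optimizer.zero_grad()
prediction, features = transcoder(layer_n)
loss = nn.functional.mse_loss(prediction, layer_n_plus_1) + 1e-3 * features.abs().mean()
loss.backward()
optimizer.step()
```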
Fraser thought that the quest to make PLMs more interpretable is a "really cool frontier," but he still questions its practicality. "We've got [AI](https://www.the-scientist.com/artificial-intelligence-in-biology-from-artificial-neural-networks-to-alphafold-72435) interpreting AI. Then we need more AI to interpret that result—we're going down a rabbit hole," he said. "It's very, very hard to directly figure out what features the autoencoders are actually revealing."

Berger, on the other hand, is confident that she'll be able to put her tool to use. Her team previously developed a PLM to optimize antibody design for therapeutics and another to predict the interaction between drugs and their targets.5,6 She hopes to use sparse autoencoders and transcoders to better understand these models.