{"id":167155,"date":"2026-02-06T20:11:34","date_gmt":"2026-02-06T20:11:34","guid":{"rendered":"https:\/\/www.newsbeep.com\/us-ca\/167155\/"},"modified":"2026-02-06T20:11:34","modified_gmt":"2026-02-06T20:11:34","slug":"berkeley-talks-an-evolutionary-biologist-makes-the-case-for-pausing-ai","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us-ca\/167155\/","title":{"rendered":"Berkeley Talks: An evolutionary biologist makes the case for pausing AI"},"content":{"rendered":"<p>(<a href=\"https:\/\/freemusicarchive.org\/music\/holiznacc0\/be-happy-with-who-you-are\/no-one-is-perfect\/\" rel=\"nofollow noopener\" target=\"_blank\">Music: \u201cNo One Is Perfect\u201d by HoliznaCC0<\/a>)<\/p>\n<p>Anne Brice (intro): This is Berkeley Talks, a UC Berkeley News podcast from the Office of Communications and Public Affairs that features lectures and conversations at Berkeley. You can follow Berkeley Talks wherever you listen to your podcasts. We\u2019re also on YouTube @BerkeleyNews. New episodes come out every other Friday. You can find all of our podcast episodes, with transcripts and photos, on UC Berkeley News at<a href=\"http:\/\/news.berkeley.edu\/podcasts\" rel=\"nofollow noopener\" target=\"_blank\"> news.berkeley.edu\/podcasts<\/a>.<\/p>\n<p>(Music fades out)<\/p>\n<p>Holly Elmore: I\u2019m here, and I\u2019m going to start with a little disambiguation. So I\u2019m going to talk about the deep worldview behind PauseAI and the Theory of Change behind PauseAI. So I\u2019m Holly Elmore, and I have a Ph.D. in evolutionary biology, so that\u2019s my intellectual background.<\/p>\n<p>I also want to mention that I didn\u2019t go into academia after I graduated. I worked at a think tank for three years on the topic of wild animal welfare, how do wild animals feel, maybe could we make them feel better? And then it was from there that I left that job to start PauseAI US. So that\u2019s the legal entity that I am the executive director of.<\/p>\n<p>I\u2019m also a co-founder of the PauseAI Movement, and the big other co-founder is Joep Meindertsma, who\u2019s in the Netherlands, and he runs the organization now called PauseAI Global, which runs more of the digital resources. And the big Discord is the PauseAI Global Discord.<\/p>\n<p>And was there something else I wanted to say about that? I had a lot of caveats to front load. Right, right. So just bear in mind that there\u2019s PauseAI, the movement, which spiritually includes everyone who wants to pause AI. Then there\u2019s PauseAI, the legal entities. And in PauseAI US, there\u2019s a couple levels of membership.<\/p>\n<p>So there\u2019s a volunteer agreement you have to sign before you do a protest with us or before you run a local org or participate in our events. So that\u2019s a level. I\u2019d call those volunteers. And then there\u2019s the paid staff of the org, which is quite small. It\u2019s me and two other people right now. So there might be a number of times when it would be tempting to think of PauseAI as one thing. It\u2019s probably not that big a deal for this talk, but I just want to let you know.<\/p>\n<p>Should I be able to increment this? OK. So to be in PauseAI, what it means to be in PauseAI are only these stipulations, which is we don\u2019t know what we\u2019re doing with frontier AI, and this could be catastrophically dangerous. 
Also, even if it's not exactly catastrophically dangerous, people don't want their lives radically disrupted in lots of ways, so the consent of the people matters on this.

Two, the default should be pausing, and we should have the possibility of never unpausing frontier AI development.

And three, it's feasible to pause through international agreement. So this is all you have to agree with to join PauseAI, if that feels right to you.

But how we get there is, again, where I'm going to make some fine distinctions at the beginning. For people who aren't familiar with Theory of Change language, there's a difference between the vision, which is the world that we want, and your mission. And I'm listing two missions here: PauseAI the movement's mission, and then PauseAI US's mission.

So the vision of PauseAI is a world where there's been a global treaty to pause frontier AI and society is thriving. But we don't think that our actions alone are going to create that world. This is just the end state that we want.

PauseAI the movement's mission is grassroots activities and education to move the Overton Window to support this treaty.

And then PauseAI US operates in the U.S., so our mission is to influence U.S. politicians toward actions supporting a treaty and domestic safety measures, better-than-nothing safety stuff just within the U.S., via grassroots action and education.

So why is this what we want? Why is that the world that we want, without dangerous frontier AI, with the treaty, and why are we pursuing it through grassroots political action and education? Now I'm going to get into the worldview section. So one big thing. And I should also say: the reason I broke out those three things, that that's all there is to PauseAI, is that lots of people could want to pause AI for lots of reasons. And as long as you agree not to do violence and not to break the law, whatever reason you have for wanting to pause AI is good and you're on the team. So it's a big-tent movement.

So to fill in why you would want to do things this way, I'm going to mostly draw from my own worldview and experience and why I made the org this way. But it's also something that's pretty common among people who think deeply about this. And especially since you guys are more familiar with AI risk thinking and the various intellectual strains behind it, it's what distinguishes us from the other groups in AI safety.

So: our world is fragile, and species go extinct as a rule. This is something that I definitely, definitely understood after doing a Ph.D. in biology. My goodness. So I just picked an image to show this. This is extinction since 1500, due to us, basically. So this is the rate of extinction of species in these different groups since 1500. And even the background rate, though it seems pretty low, is enough to ensure that more than 99% of all species that ever existed are extinct now.

So there's this idea that we're imperturbable, that of course nothing can be so bad as to affect the world in a way that ruins the things we can count on, like humans being here, and that idea is a pretty deep part of many people's worldview. A lot of people who are into x-risk have the fragile worldview instead. That's Nick Bostrom; you may recognize his vulnerable world hypothesis. That's one thing he lists as a crux in it.

And then I found this cool little demo of competitive exclusion, just one of the many things that … So the idea of competitive exclusion is that if two species are trying to occupy exactly the same niche, the more fit one will outcompete the other and drive it to extinction. So there are just so many ways, beyond habitat destruction or anything like that, in which it's very normal for a species not to last forever. It's hard. The unusual thing is a species persisting.
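(A minimal sketch of the kind of demo being described, assuming a standard two-species Lotka-Volterra competition model; the growth rate, carrying capacities and starting sizes below are made up for illustration, not taken from the talk.)

```python
# Minimal Lotka-Volterra competition sketch (illustrative parameters only).
# Two species occupy exactly the same niche; species 1 uses the shared
# resource slightly more efficiently (higher carrying capacity), so
# species 2 is driven toward extinction even though nothing attacks it.

def simulate(steps=30000, dt=0.01):
    n1, n2 = 10.0, 10.0       # starting population sizes
    r = 1.0                   # intrinsic growth rate, same for both
    k1, k2 = 1000.0, 900.0    # carrying capacities; species 1 slightly fitter
    for _ in range(steps):
        total = n1 + n2       # full niche overlap: everyone competes with everyone
        n1 += r * n1 * (1 - total / k1) * dt
        n2 += r * n2 * (1 - total / k2) * dt
    return n1, n2

n1, n2 = simulate()
print(f"species 1: {n1:.1f}   species 2: {n2:.1f}")
# Species 1 ends up near its carrying capacity; species 2 collapses toward zero.
```

Even a small, steady fitness difference is enough: the less efficient species is simply squeezed out of the shared niche over time.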
So that leads into this idea that there are lots of equilibria that we depend on, that a more powerful intelligence could disrupt, and we wouldn't even know. Like these animals that go extinct in these categories, mostly because of human development, habitat loss, and then effects on the climate. They didn't know, and they're not trying to maintain certain equilibria. They just live in a world where there used to be something dependable and now it's not there. So with environmental equilibria, there are things we did not even know we were doing until we had already done them.

There might be social-world equilibria. So perhaps it feels like you're getting companionship from a chatbot, but maybe you're missing some vital nutrient we don't know about. At first, people thought it was cool to eat radium. Have you guys ever seen the Radium Girls? They were reasoning: it's like the sun, it's like power, it's good, it's bright. And they thought it was really neat. For the Radium Girls, who eventually lost their jaws, it was considered a benefit of the job that they got to lick the radium paintbrush to sharpen it, because they got to eat the radium. It wasn't even a side effect. It was the point.

So we don't know things about how the world works. Your body thinks radium is calcium, so it assimilates it; then it decays and your bones decay. There's so much like this. And as a scientist, especially a biologist, you have a deep appreciation for how much of the empirical world you don't realize is there.

And then societal equilibria. I think a great example of this is the institution of jobs. Is it possible to have a better equilibrium than working to live? I think so, probably. But if you blow up the institution of jobs without having an idea of how to replace it, it could be pretty horrible. Right now we actually have a pretty solid equilibrium where people need each other for their abilities, for something like being themselves. If that went away, it could be very bad.

So overall, it's not that better equilibria are impossible. It's just that we need to know how to reach them, and that could be very, very, very hard. The way that we are doing development now is like clearing a minefield by walking through it one little step at a time. When you hit a mine, what do you expect to happen? You blow up. That's how you learn, and you can't afford that. I'm saying we can't afford it with technology as powerful as AI.

So the deadass expectation of many people in AI safety for many years has been that when we got to this point, the AI, once it was aligned, would figure out the answers for us. This is not good enough. This is not going to happen, because we need to be the source of truth about what is good for us. You might think the AI is aligned and not know.
Really, there's only one source of truth, and that's being the entity whose experience you're trying to protect. And as I say, in general, experiments are costly, in the sense that experimenting with where the mines are in a minefield, on foot, is costly.

Accidents, even accidents that aren't big enough to destroy the world, become more likely the more we increase AI capabilities, and each one reduces our capacity to do better and to respond to accidents in the future. And then, of course, with capabilities high enough, one day an accident could one-shot us.

So the scale of the danger really could cripple civilization or cause extinction, and the possibility of this alone is reason enough to pursue pausing frontier AI development. This is a frequent difference between us and other people in the x-risk world. X-risk doesn't have to come with crazy high numbers. There was a vogue a couple of years ago where everybody had a P(doom) in the 90s, and it does not have to be that high to justify not rolling the dice on extinction. Just don't roll them. A 5% chance of extinction is extremely high. It's really intolerably high.

And there's also this burden-of-proof issue, where people will come at you and ask, "Well, what exactly is going to happen? I don't get it." If they're just asking how it could happen so they can understand it, all good. Give them an example. Otherwise, there's this sense that, well, if you don't know exactly, shot for shot, how we're going to die, then why should we listen to you? I think I need to know, shot for shot, how we're going to live from you before we make this technology. And PauseAI, as we'll talk about in the Theory of Change section, is about reestablishing the proper burden of proof. We already know we're vulnerable: because the world is fragile, because we know there are lots of equilibria we don't know about, and because we know that developing AI is bringing us into higher territory of capability and intelligence.

And then finally, the people of the world do not consent to have frontier AI forced on them. This is just one of dozens of polls I could have selected, but it shows 80% agreement this year, in April and May, that people's priority is safety, even if it means going slower in developing AI.

So our worldview includes, and this is sometimes a bit of a matter of debate among x-risk people, the idea that it matters whether people want this technology. I would love to talk about this in question time if you guys are interested, but I didn't put it in this talk. I think there's a lot of worldview stuff that sees it as: you can't tell somebody not to do something. Unless you can prove that it's the same as violence or something, you can't tell somebody not to do something. And they also tend to believe that everybody else thinks that. But actually, people don't generally have a problem with saying, "Don't do things that could hurt me." So PauseAI is trying to bring in the worldview that people don't want this, and they have a right to advocate against what they don't want.

Oops.
I think that should say worldview, sorry.

So what about alignment? The PauseAI position is different from my personal position. The PauseAI position is agnostic: alignment could happen or it couldn't. We just know that we need the time and the governance to be able to pursue it and see if alignment is possible. And having a pause in place protects us either way, because if it's not possible, we don't unpause. Good. Phew. And if we have the pause in place and we get quality alignment, the time to do the proper kind of research and development, then good.

My personal view, and I anticipate this could be a big question-time thing, is that the idea of alignment of AI is philosophically confused. There's no state of being aligned that isn't constantly contingent on external factors in order to persist.

So yeah, I really tried hard to constrain what I talk about in this part of the talk, because I could say a lot. And I further think that alignment between entities of vastly different capabilities may be impossible. So it may be that anything that seems like the alignment we're familiar with today is contingent on roughly equal levels of capability. I have never done rigorous work to try to prove this, but this is my worldview, and I'd love to talk about it if people are interested in this part.

And a big reason that I think this is my background in evolutionary theory. I did a great deal of thinking about a form of genetic misalignment called meiotic drive. A very, very quick background on meiosis; I would also love to get into this more. On the next page, I have something you can follow to find a long blog post I did making this comparison.

Meiosis is the process that governs our genes and makes each gene's likelihood of getting into the next generation a fair process. Long story short, it allows natural selection to work, because it means that genes have to work together to make an organism, and that's the only way to reproduce. Genes could actually reproduce themselves through other means, and if they could do that successfully, it would harm the integration of the organism. And in the world … Oh, you can't see this at all. OK, sorry.

In the world of organisms: this axis is cooperation, from more cooperation to less cooperation, and this one is from more conflict to less conflict. And this is about how well-integrated the pieces of a whole are. So this paper, I love this paper. It's Queller and Strassmann 2009; it's something like a view toward organismality. I would highly recommend it. The idea is that you get different levels of cooperation and different levels of integration.

Our cells: pretty integrated. But if you look at other animals, you will see … So what's a great example of this? This is maybe getting a little into the weeds. Sorry, I don't want to get too confusing. We usually think of our genes as if their job is to be in the genome making the organism. But actually, no, they're one level of organization of this whole. Our genes are fairly tight, but there are some configurations of genes, like alliances of genes, which are not very tight. Amoebas can come together and work together sometimes, and then they separate.
Our cells can't separate and be their own thing.

So genes can also do this. And this is a very mind-trippy thing to learn, and it's tough. I'd love to say more if people are confused, but I also don't want to confuse you. The upshot is that when we think of alignment, we're assuming a unitariness that probably doesn't really exist, even in things that we think are aligned, like a person with their own interests, that sort of thing.

And I do also want to say, on the extinction point: this is a tree, one of 28 individuals left in its species, and it's rare to catch an extinction that you know is caused by genetic misalignment. But this species is going to go extinct when these individuals die, basically, because it does this thing where there's conflict between partitions of the genome: the paternal genome kicks out the maternal genome early in development. Because of that, it's accumulating mutations, and that's going to cause this species to go extinct.

That's just to give you a flavor of what can happen even at the gene level, and of how the ideas being applied to alignment, like aligning AI with a human or with humanity, are very simplistic and very broad-brush about who the individuals are and things like that, issues that I think are enough to make any seemingly adequate model break down.

So this is on my Substack. I have a blog post comparing meiotic drive to gradient hacking. I don't know who's into all of those pieces, but I did write this. And the point I want to make here quickly is that evolutionary theory gives uncommon insights into this sort of stuff, like thinking about misalignment and technical safety. Almost always, misalignment is talked about in terms of agents being misaligned. But based on meiotic drive, and based on a looser understanding of what makes something an organism, how integrated it is, I think that circuits not getting with the program is a relatively neglected possibility.

So I probably won't get into that too deeply, but the circuits can … After all of my thinking about meiotic drive, I came to looking at ML systems for safety reasons, and the first thing I thought about was little cabals of circuits that resist updating. The simplest example of meiotic drive is two genes that both want to be inherited in the next generation at more than the fair rate, the rule being that they can't go over 50%.

And one way to do it: in mouse sperm, there's this thing called the t-haplotype, which makes a poison. If you have the whole t locus, you also have the antidote. But if that gets broken up by recombination, then the gametes, the sperm that don't have both, will die. So it can make a poison that kills every sperm except the ones that have both. That way, even though the organism is less fit, because it now has fewer sperm overall, some having been killed, the relative fitness of that genotype is 100% or close to 100%, so it's rewarded by natural selection.

So I mean, I haven't gone further into empirically looking into these things, but it seems to me very likely that there are circuits that just resist updating by means like this. Whatever is done to disturb them increases the loss, and so they just don't get updated.
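(A minimal sketch of that dynamic, with made-up numbers rather than anything from the talk: a driver allele transmitted to well over half of a heterozygote's surviving sperm spreads from rarity even when the homozygote is lethal.)

```python
# Toy model of meiotic drive in the style of the mouse t-haplotype
# (illustrative numbers only). One locus, alleles '+' and 't'.
# Heterozygous (+/t) males transmit 't' to a fraction k of their
# functional sperm instead of the fair 50%, because the driver poisons
# sperm that lack it. Offspring that are t/t die (many real t-haplotypes
# are homozygous lethal), so the driver is bad for organisms, yet it spreads.
# (The reduced sperm count of carrier males is omitted here for simplicity.)

def next_generation(p_plus_plus, p_plus_t, k=0.90):
    """Adult genotype frequencies in, next generation's adult frequencies out."""
    t_in_eggs = 0.5 * p_plus_t        # females segregate fairly
    t_in_sperm = k * p_plus_t         # drive in heterozygous males
    # Random mating; t/t zygotes die before adulthood.
    plus_plus = (1 - t_in_eggs) * (1 - t_in_sperm)
    plus_t = (1 - t_in_eggs) * t_in_sperm + t_in_eggs * (1 - t_in_sperm)
    survivors = plus_plus + plus_t
    return plus_plus / survivors, plus_t / survivors

p_pp, p_pt = 0.98, 0.02               # the driver starts rare
for gen in range(41):
    if gen % 10 == 0:
        print(f"generation {gen:2d}: t allele frequency = {0.5 * p_pt:.3f}")
    p_pp, p_pt = next_generation(p_pp, p_pt)
# The driver climbs from rarity toward a stable polymorphism (roughly 0.33 here)
# despite killing all of its homozygous carriers.
```

The analogy to a "cabal of circuits" is loose, but the structure is the same: a sub-part that wins by distorting the process that is supposed to keep all the parts accountable to the whole.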
Will this matter or not? There are things about … I shouldn't get into it. You should read the blog post if you're interested in how that could possibly matter or not. But I feel like there's just a lot that's currently left on the table by the field, and evolutionary knowledge about the world can really help out.

A deeper worldview thing is that I believe a lot in ecological validity, and that's something that almost never comes up in x-risk discussions. In biology, you can have a model that works and is internally consistent, and that's one good thing, but in order for people to really care, it also has to be ecologically valid. There are many things that could exist and make sense that are not ecologically valid. It has to be true according to empirical measurements in the real world.

I have put so many confusing terms in here. I'm sorry. I'm going to explain what I mean; most of these I made up. So, deep Goodharting. Everybody knows what Goodharting is, right? No? OK. Goodharting is making a measure into a target. You have a real goal, and then you say, well, in order to approximate my goal of having a good marriage, I'm going to go on one date a week. And if it gets to the point where you're going on a date at the expense of a good marriage, you've lost sight of what you were trying to optimize for. That's Goodharting. Reward hacking is similar; it's like a form of Goodharting.

So this is probably even too confusing. I almost want to cover it up while I say this one, because I don't want to distract you guys. Sorry. I have an appreciation that things can really seem true, that the models and the abstract things we come up with can seem true. And it is important that we try to find abstract models that make sense, because that's good for prediction.

But with ourselves, we need to be careful not to do that. We are the source of truth on what we actually like, what is good for us, our thriving. Our experiences are the source of truth. So if you have the idea that some have about AI, that the AI would know how to be better, and so I would like to be changed to be better: I think what makes you happy is probably an extremely complex utility function, and if we think we've captured that utility function, we probably haven't.

So, proposals for alignment. There have been different proposals over the years. You go back to the '90s, and the proposals are about programming the AI correctly, an analytic, symbolic AI, and giving it the right utility function: finding the von Neumann utility function of a human and making sure the AI has the same one, so that it's like they're the same entity at that point. It's all about finding the utility function. I'm very skeptical about capturing it at all in the first place, so I don't think you can hand it over. And of course, in that scenario, once you give the AI the utility function and let it go, then it's in charge. It's more powerful, always.
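(A minimal sketch of that failure mode, going back to the dates-per-week example, with made-up numbers: push a simple proxy "utility function" to its maximum and the true, more complex objective gets worse.)

```python
# Toy Goodharting sketch with made-up numbers. The true goal (a good
# marriage) needs two inputs; the proxy that became the target ("dates
# per week") measures only one of them. Maximizing the proxy under a
# fixed time budget scores better on the proxy and worse on the goal.

import math

BUDGET = 10.0  # hours per week available

def true_quality(date_hours, other_hours):
    # The real objective needs both inputs, with diminishing returns on each.
    return math.sqrt(date_hours) + math.sqrt(other_hours)

def proxy(date_hours, other_hours):
    # The measure that got turned into a target.
    return date_hours

balanced = (BUDGET / 2, BUDGET / 2)   # not optimizing the proxy
goodharted = (BUDGET, 0.0)            # proxy pushed as high as it can go

for name, alloc in [("balanced", balanced), ("goodharted", goodharted)]:
    print(f"{name:10s}  proxy = {proxy(*alloc):4.1f}   true quality = {true_quality(*alloc):.2f}")
# The Goodharted allocation wins on the proxy (10.0 vs 5.0)
# and loses on the true objective (about 3.16 vs 4.47).
```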
Even today's alignment proposals, like scalable oversight and superalignment, are more ecologically valid in that they keep more data. They're not trying to nail down what the utility function is; they have no idea. But they're taking humans out of the loop, so the process isn't synced up with what our utility function is. It's more complicated; it's not trying to simplify what the utility function is, what gets rewarded, but it's still not connected to the source of truth.

So the term I have put on this is ecological validity for utility functions. And the thing I'm contrasting it with, I call deep Goodharting: the idea that, well, I know what my terminal values are, so I'm just going to change myself in that direction, or allow myself to be changed in that direction, or make an AI that is a version of me pointed in that direction. I think that is very, very likely to lose sight of what our true utility function is, the thing we're actually trying to preserve and optimize.

So, long story short here: I don't think that through AI right now we're very likely to get what we want out of alignment. Alignment feels like a fantasy. Most days I think alignment is a fantasy. The level of alignment, or the way that alignment has been talked about in the past, is probably a fantasy.

But the org is officially agnostic. You can have a different opinion on this. The point is, no matter what, we should be pausing now, figuring this out during the pause, and not unpausing until it's safe enough.

OK, so now to the Theory of Change section. What are we doing about this worldview? Why have we picked the means that we have, the mission that we have? OK, so I wish I had covered this.

So the Overton Window. First of all, what is the Overton Window? It is the window of thinkable sentiments. And I liked this graphic; this is just the one on Wikipedia. If an idea is smack in the middle of the Overton Window, it's obvious: of course, this is how things are. The middle enforces how things are. And as you move away from the middle, things seem less and less thinkable. Finally, when something is outside of the Overton Window, it's unthinkable.

So our belief, the PauseAI Theory of Change, is that the people actually want a pause. They don't all know it yet. They don't know enough about this issue to know that's the name of what they want. But according to polls, we see that their priority is safety. They don't want catastrophic problems. Even if they don't know about the possibility of catastrophic x-risks, they know that they don't want a harmful disruption to society.

So when they understand, through education, what's going on, they're going to tell their representatives, they're going to exert their power, they're going to tell the people around them, and the Overton Window will shift, and it's going to compound. It's going to make it so that now the people who were one step over get exposed, and so on and so on.

And this is the basis of our Theory of Change: there is this untapped will, but people don't know enough, or there are certain emotional hangups, or there's pressure, obviously, from industry and motivated actors, that stops them from realizing it. But if we educate them, and we also present the pause and use techniques to push the Overton Window, then we can get what we want, which I think is … Nope.
Then we can move on to the next idea, which is …

So one thing that's keeping pause outside of the center of the Overton Window, outside of becoming policy, is that it's very emotionally difficult to think about a lot of the issues involved in x-risk. And I'm sure I'm probably not telling you guys that for the first time; you've probably experienced it. So a lot of it is holding space so that it's safe for people to think about it long enough to learn about it, to think, to decide: is this what I think?

And it's difficult for a number of reasons. It's difficult because you have to think about the possibility of being in a lot of danger, which of course we don't want to be true. It's difficult because there's a lot of pressure from people around you not to pull them into something scary. Or, probably more in our circles especially, there's a lot of pressure not to push against … I personally know many people who work at AI labs. It's very difficult for me to tell them, "You are doing a bad thing. I think you're doing a bad thing." People don't want to do that with their friends.

But if this issue, or if pause, were more in the middle of the Overton Window, then it would be the other way around: they would be the ones who felt like, "It's not really OK that I'm working at the AI lab, but do you accept me anyway?" That's the effect of the Overton Window: just what everybody else around us is thinking, what's OK to think.

And then I have a short blog post that defines everything I just said by contrast, through this concept of rhetorical answers, which is another coinage of mine. The example I give there is people saying, "Well, humanity deserves to die if that's true." They don't believe that. And you wouldn't be able to get away with saying that about something that was more central in the Overton Window. You wouldn't be able to get away with saying, "Well, murder should be OK. It's really hard to deal with."

But because pause, and AI danger, and handling AI danger through governance are on the edges of the Overton Window, people can, instead of having to go through the hard work of thinking the scary things, just wave it off and say something like, "Well, we deserve to die," or, "Well, AIs can't make new knowledge, so there you go, nothing will ever happen." So I have a list of that kind of rhetorical answer about AI.

And probably the single biggest rhetorical answer I hear is something along the lines of: we're cooked, it's over, it's too late, it's inevitable. And generally, if you ask the person a few questions, it turns out they don't really think that. You dig a little deeper, and it's not that they really think that; it's that they don't know what to do next. They don't know what the next step is. They feel like they would be unpopular. They feel like they would be missing out on the cool AI stuff. They want to be free to just think it's cool and keep playing with their friends, rather than entertain that it could possibly be bad. There's a lot going on.

So a part of our Theory of Change is simply to hold space.
And the way we hold space is by having compassionate education, education that is not fearmongering or overly … It is an emotional issue, but we try to keep a level head and set a tone that allows people the psychological safety to consider what we're saying.

And also, just making the pause position more popular is a way to hold space for people: the more they have heard of the position before, know people who hold it, or just know that other people will recognize the pause position as one of the positions, the easier it is for them to deal with the difficulty you have to go through to really understand it, or to decide whether you believe this is what to do.

I also want to cram in the concept of inside and outside game real fast, and then I'll tell you what rebalancing the center is once I've done that. Inside game is working within industry, within … Generally, it's working within a system that you want to change, for the purpose of trying to change it from the inside. Outside game is putting external pressure on that system from the outside, for the purpose of trying to get it to change. And you can get a beautiful synergy between those two. What the people on the outside do makes whatever is on the inside look a lot more moderate, and so they can really play off each other.

This is taken from a bigger talk where these colors mean something, but the ball is supposed to represent the Overton Window and where it is. And, by the way, in AI safety for the last 10 years, pretty much all of the work has been inside game, which is weird; it's very unusual for a cause to evolve this way. Most social movements or issues start with people on the outside saying, "Hey, this is bad," or trying to raise awareness about it. And that's what the public understands better. Pretty much any member of the public thinks: well, if I thought something like AI danger was happening, then of course I would tell everybody. I would scream it from the mountaintops.

There are historical reasons it ended up this way, which we could talk about in question time if you like. But because of that, it was really valuable to start doing more outside game stuff. This is an example of Overton Window pushing. This is one way you push the Overton Window: it goes from there to there just because you put a heavyweight far out on the outside game side.

So PauseAI talks directly to the public and tries to be understandable. It doesn't do stuff within industry. We're not trying to be diplomats. We are saying, in a way that's really legible from the outside: you're doing the wrong thing, this is dangerous, and you need to stop. And that does a lot of good things for the entire system. It makes it more thinkable that there should be external regulation of the industry than if all of the people trying to do something good are within the industry, and there are other dynamics there too. So this is what's called rebalancing the center by having a "radical flank."

I put that in quotes because I think PauseAI is supremely moderate. We're totally nonviolent. We don't even do anything illegal.
Literally, I assign a volunteer at demonstrations to make sure we don't block the sidewalk, because that would be illegal. We're scrupulously law-abiding, and our line is just that we shouldn't make dangerous AI. So in a more objective sense, I think PauseAI is very moderate, but because of how heavily AI safety was weighted toward academics and industry, it has a big effect even just to be moderate and be more on the outside.

Geez, I thought I made this short. So, going on with the Theory of Change: we're trying to shift the burden of proof back onto AI developers to prove that it's safe to proceed, rather than onto the person saying, "Hey, this could be dangerous," to prove that it would definitely drive everyone extinct. What I really want to emphasize about this burden-of-proof shifting is that it is not a technical discussion. This is a trick that people pull all the time. They'll say, "Well, where is your ML Ph.D.? How are you qualified to say what's going to happen?"

And that is not the discussion at all. The discussion is about what level of risk is acceptable and who gets to decide. There's no answer like: because you're in ML, you know what the right level of safety is. And who gets to decide should be the people at risk. Yes, there are things that scientists understand that the public doesn't, so you could be wrong in what you tell your representative you want, or it could be unnecessary. But this is mostly not a technical discussion. This is about what risk is acceptable.

And shifting the Overton Window toward more conservative safety standards is how, ideally, the pause would be imposed externally on the industry. The risk tolerance is just crazy now. We're totally frog-boiled on it. Elon Musk says the risk is 20%, his P(doom) is 20%, and people are like, "Oh, it's low. That's why he's taking all these risks, because it's low." That's one in five. That's worse than Russian roulette odds, which are one in six. We don't have to do this, and we don't have to listen to them just because they want to make it. It's our lives. So this is about taking our power back and not being rhetorically tricked into feeling helpless, like there's nothing we can do.

Another thing: who's heard of warning shots? Should I explain what that is? OK, I'll explain, because not everyone has. So there's this idea that … It's a funny story. When I first started doing PauseAI organizing, I went to some big names in AI safety at the time, whom I won't name, like funders. And they told me, "Oh, why don't you just wait until there have been warning shots, because then the people will just rise up." They think there's nothing to organizing. So, ha.

But it's been part of the AI safety worldview for a long time that there will be these smaller catastrophes. Hopefully, in fact; they hope for it. You can read more about this in my blog post about it. They're hoping that there will be small catastrophes that just show everybody, basically, we are right and you should do what we say, and that'll make it easy.
And so it's not uncommon to hear people in the AI safety world saying, "Well, we just have to hope for warning shots." And I think, one, that's just the wrong headspace, to be hoping for a disaster at all. We should always be trying to stop them.

But two, warning shots aren't just going to work by themselves. Even if these disasters happen, which they could well, people aren't going to know what they mean unless they've been told, unless they have the education, unless they have the ability to interpret the events themselves. It's not just obvious what an event means. So part of the reason that we educate people is so that they will be able to interpret events as they happen. Maybe you're not exactly convinced by my story now, but I tell you what to expect, and you have a deep enough understanding. And then when the moment happens, when the thing you were waiting for to answer your question happens, you have a "whoa, OK, I know it's true."

And I had one of these with ChatGPT. I had known about AI danger, or the possibility of it, for a long time, but I had, I don't know, mildly negative feelings toward it. I didn't like the way that people talked about it, but I knew enough about it. And I had this background in neuroscience and animal minds and things like that. I had especially thought that I might never see a machine be able to speak in natural language. I knew a lot about linguistics. I knew a lot about the Chomsky position and his debates with people.

So when I saw ChatGPT talking like a human, I was like, wow, oh my God. And I have this image: it's like setting up dominoes for people. For a warning shot to work, you have to have a lot of dominoes set up, and I had my dominoes set up really well. I had a lot of dominoes set up. And when that warning shot happened, it was just like boop, boop, boop: this means this, which means this. And then six months later I started PauseAI US. A lot changed. I went from being like, "I don't love this field," to quitting my job and doing it.

But if people don't have that background, then … So an example here: somebody learns that a chatbot can help a person assemble a bioweapon. Well, if they don't have any educational background that makes this land, then they're just like, so what? It all seems like that person's fault. I don't understand. It's not like the tool made them evil.

And then when an actual AI-assisted bioweapon attack happens, they're like, "Yeah, I heard about this before," and they think it's not connected. The whole point of warning shots, or of hoping for warning shots, was supposed to be that people would start to act as if the real thing were happening and start to get prepared, but it can go the other way too, I fear. But when you educate people, you set up their dominoes so that they do know what it means when the warning shot happens; then maybe they get it.

So with our education, we're thinking … Whoop, sorry. Just in general with education, it would be great if we could just say, "We predict this," and then it would happen, and then everybody would know we were right. If I did know what to predict, I would probably be trying to stop it more directly than that.
But if I couldn't stop it, I would do that: I would tell people, "Hey, look out your window on this day, and you're going to see this thing, and you're going to know I'm right." That would be great.

But we don't know what's going to happen with it. This is really the nature of the danger of AI: it's intelligence. It's creative decision-making. It's finding ways to get what it wants that you didn't think of, and having the ability to do powerful stuff that causes lots of bad side effects.

So the education strategy has to be general. It has to be about understanding why we're already worried and what we fear is coming. And then people have to have their own ideas, that they arrived at themselves, about what they would expect and what they predict, so that they can have that experience of, "Yep, here it is. It was right."

So warning shots are and aren't a part of our Theory of Change. I think the best thing we can do, both to take advantage of warning shots and in case there aren't any, is just to prepare and educate people straightforwardly.

Education is not the only thing we do; the rest was cut for time, and I would love to tell you about it. But what can you do? What can academics and students do to help? On research, just a note: I'm increasingly black-pilled on this, because pretty much every time we gain knowledge about these systems, it is dual use. There's really no knowledge about these systems that is only good.

And we don't always know. So, cautionary tale: mechanistic interpretability. For many, many years, this was always the good one, the thing that would definitely help. If we just had mechanistic interpretability, then it would be fine. I wish I had thought about this more deeply at the time, because, for one, who's reading the interpretability results, and what are they using them for? And two, can't it be used for recursive self-improvement? Probably more cheaply than doing more training runs, right? Especially if AI can interpret things that we can't interpret and knows how to make changes. Then that's out of our hands, and the opportunities there for explosive growth are really scary.

So even the thing that everybody thought was the most benign now looks like it will be dangerous. Here's a prediction from me: you will see interpretability being used in recursive self-improvement, I think probably soon. I think that's why Dario wrote that essay about interpretability and, along with that, led a round for an interpretability group. I think he sees that as a potential future use.

So be careful is all I'm saying. I would love to talk more about this if people have specific questions, but fundamentally, all of this stuff is dual use, and we don't know where things are going. What I think will make research safe is governance: external authority and oversight and accountability to what's good for people. And just as we can make useful nuclear technologies because we have that governance, I think we will be able to have useful AI in a world with the proper governance, but not before.

I really, really strongly caution against trying to use technical means to shortcut that process.
I mean, there's a lot of temptation to do that in AI safety, especially because the people in it are researchers, and I really think there's no shortcutting this process. To really have safe AI, we do have to have good governance, no matter what.

But directly, what can you do? One thing, which I think is pretty much great all the time, is joining PauseAI Bay Area, and I'll have info for the organizer below. If you live somewhere else, I can help you find other places too; we have 30 groups now in the U.S. You can also help start a group, especially a university group. We're really working on our pipeline for getting people started with university groups, so we have some stuff to offer you, but you would also get a chance to help us learn the process of doing that.

Writing op-eds with the authority of "I'm a student of this," "I'm a professor of this," that's often a great place to start, along with the local angle. We have general guidance for op-eds, and we can definitely help you if you are interested in doing that. It's really nice to have a lot of different people, like multiple voices from multiple places.

And then just talking to people around you about AI danger, and, if you do believe in pause, talking about pause, because every time you do that, every time people hear this from someone they respect, that shifts the Overton Window in that direction.

OK, this is just something you can take a picture of if you want. These are slacktivist actions, in increasing order of difficulty, where you don't have to be in the group or be a volunteer or anything like that. The hardest one listed here is starting a local group, and you can see where our listing would be. You can also talk to me if you're interested in that; we have an application, and you would go through an application and onboarding process.

Everybody get that? Great.

Speaker 2 (off camera): [Inaudible]

Holly Elmore: Oh, yes. You can also just have the slides, if you want. And here is our next event, coming up on Thursday. That is a happy hour. OK, so the … That is not right. I'm sorry, CPM. I hope you guys don't make that mistake yourselves. OK, so it's definitely just at Gmail. So Alvaro Cuba is the local organizer for the Bay Area, and this one is going to be in SF on Thursday, and there are about 30 people signed up currently. He does an amazing job. He's better than I ever was at driving turnout; that's already bigger than any happy hour I ever ran.

And then, oh yes, the lowest-commitment thing is just getting on our Discord and talking. There's our join link, which never expires. And to donate, this is the one you should use if you just want to make a small donation. If you want to make bigger donations, please feel free to do so; talk to me.

And then I want to open it up to questions, but first I want to say what my question to you is, which is: what resources do you want from me, particularly teaching resources? What would you want to put on a syllabus? I've been asking people that question lately.
So what do you want from me to help you advocate for PauseAI, should you be interested in doing that?

(Music: "No One Is Perfect" by HoliznaCC0)

Anne Brice (outro): You've been listening to Berkeley Talks, a UC Berkeley News podcast from the Office of Communications and Public Affairs that features lectures and conversations at Berkeley. Follow us wherever you listen to your podcasts. You can find all of our podcast episodes, with transcripts and photos, on UC Berkeley News at news.berkeley.edu/podcasts.

(Music fades out)