The rise of artificial intelligence has increased demand for data centres like this one in London
If Anyone Builds It, Everyone Dies
Eliezer Yudkowsky and Nate Soares (Bodley Head, UK; Little, Brown, US)
In the totality of human existence, there are an awful lot of things for us to worry about. Money troubles, climate change and finding love and happiness rank highly on the list for many people, but for a dedicated few, one concern rises above all else: that artificial intelligence will eventually destroy the human race.
Eliezer Yudkowsky at the Machine Intelligence Research Institute (MIRI) in California has been proselytising this cause for a quarter of a century, to a small if dedicated following. Then we entered the ChatGPT era, and his ideas on AI safety were thrust into the mainstream, echoed by tech CEOs and politicians alike.
Written with Nate Soares, also at MIRI, If Anyone Builds It, Everyone Dies is Yudkowsky’s attempt to distil his argument into a simple, easily digestible message that will be picked up across society. It entirely succeeds in this goal, condensing ideas previously trapped in lengthy blog posts and wiki articles into an extremely readable book blurbed by everyone from celebrities like Stephen Fry and Mark Ruffalo to policy heavyweights including Fiona Hill and Ben Bernanke. The problem is that, while compelling, the argument is fatally flawed.
Before I explain why, I will admit I haven’t dedicated my life to considering this issue in the depth that Yudkowsky has. But equally, I am not dismissing it without thought. I have followed Yudkowsky’s work for a number of years, and he has an extremely interesting mind. I have even read, and indeed enjoyed, his 660,000-word fan fiction Harry Potter and the Methods of Rationality, in which he espouses the philosophy of the rationalist community, which has deep links with the AI safety and effective altruism movements.
All three of these movements attempt to derive their way of viewing the world from first principles, applying logic and evidence to determine the best ways of being. So Yudkowsky and Soares, as good rationalists, begin If Anyone Builds It, Everyone Dies from first principles too. The opening chapter explains how there is nothing in the laws of physics that prevents the emergence of an intelligence that is superior to humans. This, I feel, is entirely uncontroversial. The following chapter then gives a very good explanation of how large language models (LLMs) like those that power ChatGPT are created. “LLMs and humans are both sentence-producing machines, but they were shaped by different processes to do different work,” say the pair – again, I’m in full agreement.
The third chapter is where we start to diverge. Yudkowsky and Soares describe how AIs will begin to behave as if they “want” things, while skirting around the very real philosophical question of whether we can really say a machine can “want” anything at all. They point to a test of OpenAI’s o1 model, which displayed unexpected behaviour in order to complete a cybersecurity challenge that had accidentally been made impossible, arguing that the fact it didn’t “give up” shows the model behaving as if it wanted to succeed. Personally, I find it hard to read any kind of motivation into this scenario – if we place a dam in a river, the river won’t “give up” its attempt to bypass it, but rivers don’t want anything.
The next few chapters deal with what is known as the AI alignment problem, arguing that once an AI has wants, it will be impossible to align its goals with those of humanity, and that a superintelligent AI will ultimately want to consume all possible matter and energy to further its ambitions. This idea was previously popularised as “paper clip maximising” by philosopher Nick Bostrom, who argued that an AI tasked with creating paper clips might eventually attempt to turn everything into paper clips.
Sure – but what if we just switch it off? For Yudkowsky and Soares, this is impossible. Their position is that any sufficiently advanced AI is indistinguishable from magic (my words, not theirs) and would have all sorts of ways to prevent its demise. They imagine everything from a scheming AI paying humans in cryptocurrency to do its bidding (not implausible, I suppose, but again we return to the problem of “wants”) to discovering a previously unknown function of the human nervous system that allows it to directly hack our brains (I guess? Maybe? Sure.).
If you invent scenarios like this, AI will naturally seem terrifying. The pair also suggest that signs of AI plateauing, as seems to be the case with OpenAI’s latest GPT-5 model, could actually be the result of a clandestine superintelligent AI sabotaging its competitors. It seems there is nothing that won’t lead us to doom.
So, what should we do about this? Yudkowsky and Soares have a number of policy prescriptions, all of them basically nonsense. The first is that graphics processing units (GPUs), the computer chips that have powered the current AI revolution, should be heavily restricted. They say it should be illegal to own more than eight of the top 2024-era GPUs without submitting to nuclear-style monitoring by an international body. By comparison, Meta has at least 350,000 of these chips. Once this is in place, they say, nations must be prepared to enforce these restrictions by bombing unregistered data centres, even if this risks nuclear war, “because datacenters can kill more people than nuclear weapons” (emphasis theirs).
Take a deep breath. How did we get here? For me, this is all a form of Pascal’s wager. Mathematician Blaise Pascal argued that it was rational to live your life as if (the Christian) God exists, based on some simple sums. If God does exist, believing sets you up for infinite gain in heaven, while not believing leads to infinite loss in hell. If God doesn’t exist, well, maybe you lose out a little from living a pious life, but only finitely so. The way to maximise your expected happiness, then, is belief.
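To spell out those sums (my own back-of-the-envelope rendering of the wager, not Pascal’s notation or the book’s): if $p > 0$ is the probability that God exists and $c$ the finite cost of living piously, the expected payoffs compare as

$$
\mathbb{E}[\text{believe}] = p\cdot(+\infty) + (1-p)\cdot(-c) = +\infty,
\qquad
\mathbb{E}[\text{don't believe}] = p\cdot(-\infty) + (1-p)\cdot 0 = -\infty,
$$

so belief wins however small $p$ is. Swap “hell” for “extinction by superintelligence” and the same arithmetic does the work in the AI-doom case.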
Similarly, if you stack the deck by assuming that AI leads to infinite badness, pretty much any action to avoid it is justified. It is this line of thinking that leads rationalists to believe that any action in the present is justified as long as it leads to the creation of trillions of happy humans in the future, even if those alive today suffer.
Frankly, I don’t understand how anyone can go through their days thinking like this. People alive today matter. We have wants and worries. Billions of us are threatened by climate change, a subject that goes essentially unmentioned in If Anyone Builds It, Everyone Dies. Let’s consign superintelligent AI to science fiction, where it belongs, and devote our energies to solving the problems of science fact here today.