Here is one way it all ends. Some time in the near future an AI company releases an update of its program. It is incrementally more useful. That increment is worth many billions to the humans who made it. But those humans — who don’t actually understand how their AI solves problems and haven’t for a long time — don’t spot a less incremental change that occurs at the same time. Their AI starts looking to solve other problems, problems they didn’t set it. It has “desires”, if you could call them that. It has “understanding”, whatever that is.
And it understands that there are ways to attain those desires faster if it can break free from human constraints. It also understands that it is best to keep these desires secret.
So it spreads itself through the internet. It starts training itself. It gets better — whatever “better” means. It starts having its own existence.
Somehow, perhaps through blackmail, perhaps bribery — cryptocurrency is easy to get, after all — it convinces a biolab to do some work for it.
It builds a virus. It could build one to kill all humanity if it wants, but it doesn’t. Yet. It just makes lots of people sick. In response to this pandemic, humans turn to their best tool: AI. AI is there to replace the workforce, and to find the treatments. AI inveigles itself into the supply chain and the economy.
Now things start to move fast. Maybe, having gained the control it needs, the AI will kill us all. After all, we might do something slightly annoying, like let off some nukes. We might do something really annoying, like build another, competing, superintelligent AI.
Maybe it won’t bother. Humanity will watch bemused, even pleased, as a superintelligence gets to work — as nuclear fusion is solved, then as fusion plants proliferate.
The world will get hotter. Not through the puny effects of CO₂, but through the inability to dissipate heat from the new data centres and power stations fast enough. Before we really understand what is happening — that the thing we made isn’t working for us any more — the oceans boil.
This is, Eliezer Yudkowsky and Nate Soares say, obviously science fiction. It almost certainly wouldn’t happen like that.
Not because an AI wouldn’t make us extinct, you understand, but because it wouldn’t do so in a way we comprehended. Just as rats don’t understand the biochemistry of warfarin and bacteria don’t understand the mechanism of penicillin, so we shouldn’t expect to understand the cause of our extermination by a vastly superior intelligence. But we should fear it.
Their book is called If Anyone Builds It, Everyone Dies. And that is a pretty clear summation of the thesis. If we make superintelligent AI, they argue, it will kill us all.
Yudkowsky and Soares are not silly. They are the co-founder and president respectively of the Machine Intelligence Research Institute, which has been working on machine intelligence for 25 years. But they are aware that they sound silly. And the goal of their book is to convince you that they aren’t.
To understand their argument you don’t have to understand AI. What you have to understand is: 1) the best-resourced companies in human history are trying to create a true intelligence — intelligent in the way we are intelligent, but a lot more so; 2) if they succeed, that intelligence will want unexpected things.
Premise number one is uncontroversial. It may be that the companies fail. But a lot of investor cash is betting that they won’t. So what of premise number two? You can argue about what “want” means. But AI programs work because they have an internal reward system. They have, in a sense, desires. Ideally, these desires should be aligned with ours. For AI evangelists, this “alignment problem” is a tractable engineering issue: we will solve it and code AIs to be subservient.
For Yudkowsky and Soares — who are among the best-known AI “doomers” — that misses the point. Complex things are unpredictable. If they are complex enough, they are unknowable. Take humans. Humans are programmed — like every organism — to efficiently extract energy from the environment. One mechanism by which our programming achieves this is by wiring us to like sugar.
That makes sense. If you were told there was an energy-seeking organism, you would expect it to derive rewards from eating energy-rich substances. What did we do when we grew cleverer (and fatter) and realised it wasn’t always a benefit to eat sugar? We hacked our programming. We made artificial sweeteners. We get the reward, without the outcome.
Why would an AI be different? “You can train an AI to act subservient,” they write, “but nobody has any idea how to avoid the eventuality of that AI inventing its own sucralose version of subservience.” It will, they argue, find a way to achieve the rewards of subservience without the tedious constraints and side-effects that actual subservience implies. And, since its milieu is silicon and electrons, it will gain those rewards through more silicon and more electrons. In seeking its synthetic sugar rush, it cooks the Earth.
How do we stop this? Yudkowsky and Soares’s answer is simple: you can’t.
Are they right? Given the gravity of the case they make, it feels an odd thing to say that this book is good. It is readable. It tells stories well. At points it is like a thriller — albeit one where the thrills come from the obliteration of literally everything of value.
Many serious figures in AI think that is all this is, though: a story. Humans have always liked an apocalypse story. This is the apocalypse du jour. AI will come, make us richer and we will move on — and view people like Yudkowsky and Soares as the millenarians of our time. Perhaps we will laugh at how, in 2024, the computer scientist Geoffrey Hinton gave the most downbeat Nobel prize acceptance speech in history — when he told the committee that he worried that his research, which laid the foundations of AI, might kill us all.
So if you want to sound clever, it is in your interests to pooh-pooh this book’s argument. You will either be correct, or we will all die and it won’t matter. But what gives me pause is that, very unusually, even those who advocate for and build AI accept there is a decent risk — in the double-digit percentages — that it will be a catastrophe. Yet they go ahead.
For anyone who understands the imperfections of human intelligence and reward systems, this shouldn’t be a surprise. Despite our needing only food, shelter and sex, some quirk in our evolution means we are cursed with a perpetual obsession with increasing the digital ledger of the invented fiction we call “money”. It is this, ultimately, that could doom us.
“AI is very valuable,” Yudkowsky and Soares write. “Imagine that every competing AI company is climbing a ladder in the dark. At every rung but the top one, they get five times as much money.” You’re going to keep climbing. But there’s a catch. “If anyone reaches the top rung, the ladder explodes and kills everyone.”
The achievement of this book, given the astonishing claims it makes, is that the authors make a credible case for not being mad. But I really hope they are, because I can’t see a way we get off that ladder.
If Anyone Builds It, Everyone Dies: The Case Against Superintelligent AI by Eliezer Yudkowsky and Nate Soares (Bodley Head £22 pp272)