Kokotajlo co-authored AI 2027 after hosting a series of war games with participants from leading AI labs—exercises where players were encouraged to game out, repeatedly and specifically, what actions might follow from present events. But he cautions that the scenario is just that: a best guess at where the world might be headed, one possible path among many. Trying to predict exactly what a superintelligence might do, the scenario notes, is like trying to predict what moves a chess grandmaster might use to beat you.
But Kokotajlo doesn’t want his essay to lead to resignation. Quite the opposite. AI 2027 has an alternate ending. In that one, humanity recognizes that it is ceding control over its future to an unreliable entity. Political pressure forces the U.S. government and top AI companies to reject the seductive appeal of handing over power, and the U.S. and China strike an arms-control-style deal. AIs still exist, but they are forced to reason in language, not neuralese, making them less intelligent but more aligned. The world is still chaotic and unequal; the question of who exactly should have the power to control AI remains contested. But humanity has avoided the worst-case scenario.
AI 2027 made a huge splash upon its release this April. Perhaps most notably, U.S. Vice President J.D. Vance said he had read it. So have the leaders of top AI companies, according to Kokotajlo. The essay received its fair share of pushback—some from people who ridiculed it as science fiction, or counterproductive, or too confident in its predictions. (On that last critique, Kokotajlo says the timing is uncertain; the events of AI 2027 could instead unfold at some other point in the next decade.) Overall, though, he’s grateful for the spirited debate. “Everyone needs to be waking up and paying attention to this,” he says, referring to the possibility of superintelligence. “The companies aren’t going to do it by themselves.”