Part VII · Chapter 28

The Warning

In the spring of 2023, Geoffrey Hinton quits Google to speak freely and the people who built AI sign a one-sentence statement that it could cause human extinction. → The moment the field turned and warned the public about its own work.

“I console myself with the normal excuse: If I hadn’t done it, somebody else would have.” — Geoffrey Hinton, on his life’s work, The New York Times, May 1, 2023

The phone call to Sundar Pichai came on a Thursday in late April 2023, and it was, by Geoffrey Hinton’s own account, an awkward one. Hinton was seventy-five. He had worked at Google for a decade, since the company won the auction for his three-person startup in a Lake Tahoe hotel room in 2012. He held a senior title, a comfortable arrangement that let him keep one foot in Toronto, and the affection of a research organization that treated him as a founding ancestor. He was calling to quit. Not because Google had done anything to him. Because he had decided he could no longer say what he wanted to say while drawing a Google paycheck, and what he wanted to say was that the thing he had spent fifty years building might be about to slip out of human control.

He had been turning the thought over for months. What tipped him, he later explained, was not a single experiment but a change in his own intuition about what the machines were doing. For most of his career Hinton had believed that the artificial neural networks he helped invent were crude cartoons of the brain, useful but fundamentally inferior to the wet biological version. Then he started paying close attention to the large language models, and the comparison began to run the other way. A model like the one Google had built, or the one OpenAI had just released, had roughly a hundred times fewer connections than a human brain has synapses, and yet it knew thousands of times more than any one person. It could absorb the contents of the internet in a way no person could. And because you could run many copies of the same model in parallel, each learning from different data and then sharing what it learned by averaging their weights, a population of digital minds could pool experience at a speed no group of humans could match. Biology, he concluded, might not be the superior design after all. He had spent his life trying to make machines think like people. The unsettling possibility was that he had succeeded, and that thinking like people was not where the machines would stop.

When The New York Times published the story on May 1, 2023, under the headline “‘The Godfather of A.I.’ Leaves Google and Warns of Danger Ahead,” it traveled faster than almost anything Hinton had ever published in a peer-reviewed journal. The framing was irresistible: the man most responsible for the technology now warning that the technology was dangerous. Hinton was careful, in the interview, not to blame Google. Until the previous year, he said, the company had been a “proper steward” of the technology, holding back its most powerful systems rather than releasing them into the wild. What changed was competitive pressure. Once Microsoft put a chatbot inside its Bing search engine and threatened the core of Google’s business, Google had no choice but to respond in kind. The caution that had defined the field’s industrial leaders for a decade evaporated under the discipline of a market. “It is hard to see how you can prevent the bad actors from using it for bad things,” Hinton said. His more immediate fear was not killer robots but a flood of fabricated text, images, and video so convincing that the average person would, in his words, “not be able to know what is true anymore.”

And then the line that everyone quoted, the one that captured the strangeness of a man indicting his own work. Asked whether he regretted it, Hinton reached for what he called the normal excuse, the one every weapons scientist and every dual-use researcher has reached for since the Manhattan Project. If he hadn’t done it, somebody else would have.

What made Hinton’s defection land so hard was that he was no natural alarmist. He was not Elon Musk, who had been calling artificial intelligence an existential threat since 2014 while founding companies to build it, nor a philosopher writing about hypothetical superintelligences from an Oxford office. He was the person who had kept the connectionist faith through two AI winters, who had relabeled his papers to get the word “neural” past hostile reviewers, who had watched his entire research program ridiculed and then vindicated. He had earned the right to be a believer, and for fifty years he had been one. When a believer of that standing says the thing he built may be the thing that ends us, the people who had been saying it all along suddenly had cover, and the people who had been ignoring them suddenly had to explain why.

The warning did not arrive alone. It arrived in the middle of a remarkable few weeks in the spring and early summer of 2023 when the abstract worry about artificial intelligence, confined for years to a small subculture of researchers, online forums, and a handful of Bay Area dinner parties, became a mainstream political event. The trigger had been pulled five months earlier, on November 30, 2022, when OpenAI released ChatGPT and a research preview turned into the fastest-adopted consumer technology in history. By March 2023, OpenAI had shipped GPT-4, a system noticeably more capable than the one that had already reorganized the public conversation. The pace was the point. Each release was better than the last, the gaps between them measured in months, and no one outside the labs, and arguably no one inside them, could say with confidence where the curve was heading.

Eight days after GPT-4’s release, on March 22, 2023, the Future of Life Institute published an open letter calling for a six-month halt on training any system more powerful than GPT-4, the document that put existential worry on the front pages for the first time. That letter is its own story, told earlier in this book; what matters for Hinton’s moment is what it failed to do and what it changed anyway. It failed completely on its own terms. No pause happened, no lab stopped training, and the signature process became an embarrassment, with a fake entry under the name Xi Jinping and several listed signatories saying they had never signed. Yann LeCun, who had been Hinton’s postdoc in Toronto in the 1980s and now ran Meta’s AI research, dismissed the premise in public, calling the fear of superintelligent machines escaping control wildly overblown. The labs the letter sought to restrain had no intention of disarming in a race they believed they were winning.

But the letter changed who was allowed to ask the question. For years the people warning loudest about existential risk had been easy to wave away: readers of Nick Bostrom’s 2014 book Superintelligence, members of online rationalist communities, employees of nonprofits with apocalyptic mission statements. The pause letter put a Turing Award winner and the co-inventor of the personal computer on the same document, and made the worry respectable to voice in public. And once it was respectable, the people building the systems found they could no longer answer it with marketing alone.

This was the peculiar shape of 2023. The loudest voices warning about the danger of artificial intelligence were, increasingly, the people getting rich and famous by building it. When Sam Altman, the chief executive of OpenAI, went before a Senate subcommittee on May 16, 2023, and asked Congress to create a federal agency that would license companies like his to build large models, the gesture could be read two ways at once. It could be the statesmanship of a founder who genuinely believed his industry was too dangerous to leave unregulated. Or it could be the oldest move in the corporate playbook, the one skeptics called regulatory capture: an incumbent inviting regulation he could afford and his smaller competitors could not, building a moat out of compliance costs and calling it caution. The hearing itself had been designed for theater. The subcommittee’s chair, Senator Richard Blumenthal of Connecticut, opened by playing a recording of his own voice reading an opening statement, then revealed that the voice was an AI clone and the words had been written by ChatGPT. It was a clever stunt, and it made the point that even a United States senator could be convincingly counterfeited. Altman, for his part, told the senators that “if this technology goes wrong, it can go quite wrong,” and that he wanted to be vocal about it. Coming from the man whose product had set off the entire scramble, the admission was either disarming honesty or a brilliantly executed inoculation, and reasonable people disagreed about which.

The clearest single artifact of the moment came two weeks after Altman’s testimony, on May 30, 2023, while Hinton’s resignation was still fresh in the news cycle. A small nonprofit called the Center for AI Safety, run by a young researcher named Dan Hendrycks, released a statement. Where the pause letter had run to several paragraphs, this one was a single sentence, twenty-two words long: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

The brevity was deliberate. Hendrycks had reasoned that the longer a manifesto became, the more room it gave a potential signatory to find a clause they could not endorse, and the more easily a refusal could be rationalized. A single sentence collapsed the decision to a binary. You either believed that extinction-level AI risk deserved a seat at the same table as nuclear war, or you did not. There was nothing to quibble with, no policy prescription to reject, no implied criticism of anyone’s business model. It was a temperature reading disguised as a statement.

What gave the sentence its force was the list of names beneath it. Hinton signed. Yoshua Bengio, who had shared the 2018 Turing Award with him, signed. LeCun, the third of the three so-called godfathers of the field’s modern revival and the remaining member of that Turing-winning trio, conspicuously did not. More striking than the academics were the executives. Altman of OpenAI signed. Demis Hassabis of Google DeepMind signed. Dario Amodei of Anthropic, the lab founded two years earlier by defectors from OpenAI who feared the technology was being built too fast, signed. The chief executives of the three labs most aggressively pushing the frontier had each put their name to a document declaring that the thing their companies were racing to build might wipe out the species. Bill Gates signed. Hundreds of researchers signed.

It was, on its face, an extraordinary document, a confession and a warning issued by the accused themselves. It was also, depending on where you stood, either a moment of rare moral seriousness in an industry not known for it, or a masterpiece of having it both ways. The same people signing the extinction statement were, that same month, raising billions of dollars, recruiting researchers with offers that rivaled professional athletes’ contracts, and shipping new models on quarterly cadences. If you genuinely believed your product might kill everyone, the logic of the critics ran, the appropriate response was to stop, not to sign a sentence about it and then return to the office. The signatories had a reply, and it was the same one Hinton had given the Times. The race was already running. Stepping out of it did not stop it; it only ceded the lead to whoever cared least about safety. Better, they argued, to be in front, where you could at least try to do it carefully, than to hand the future to a competitor who would not.

There was a real disagreement buried under the surface unanimity, and it was not the one the headlines captured. The loudest fight was framed as believers against deniers, the people who thought AI might end the world against the people who thought that was science fiction. But a quieter and arguably more important split ran through the safety camp itself, between those who worried about extinction and those who thought the extinction framing was a distraction from harms that were already happening. Researchers who studied algorithmic bias, labor displacement, and misinformation watched the spring of 2023 with growing frustration. The flood of money and press attention was flowing toward speculative futures, toward superintelligences that did not yet exist, while the documented present harms of the systems already deployed went comparatively unfunded and unregulated. To them, the extinction statement was worse than a distraction. It was a convenient one. A company that admitted its product might someday destroy humanity was implicitly granted enormous power and importance, and a company arguing about hypothetical god-machines was a company not being asked hard questions about the biased hiring tool or the deepfake or the scraped copyrighted work it was shipping today. The grandest possible danger, in this reading, was the perfect cover for the mundane ones.

Hinton himself was not interested in playing the role of pure prophet of doom, and his refusal to perform certainty was part of what made him persuasive. Pressed for a number, he would give one, but he gave it as a confession of ignorance rather than a calculation. By the end of 2024, speaking on a BBC radio program two days after Christmas, he would put the probability of AI causing human extinction within thirty years at somewhere between ten and twenty percent, and admit that he was essentially guessing. The image he reached for was domestic and disarming. To imagine humanity’s position relative to a superior intelligence, he said, picture yourself and a three-year-old, and understand that we would be the three-year-olds. His deeper worry was not malice but indifference, a more capable agent simply pursuing its own ends in a world where humans had stopped being the most competent things in it. “My worry,” he said, “is that the invisible hand is not going to keep us safe.” It was a strange sentence for a man to say about a technology he had spent his career advancing and a market that had made him wealthy. It was also, coming from him, hard to dismiss.

In October 2024 the establishment would deliver its own verdict on Hinton, awarding him the Nobel Prize in Physics for the foundational work on neural networks, and he would use the platform exactly as he had used his resignation, to keep talking about the risk; that turn comes later in this book. What mattered now was that the man who had quit his job to warn that his life’s work might destroy the world was about to be handed the highest honor in science for that same work.

What the spring of 2023 changed, in the end, was who carried the alarm. For most of the field’s history, the warnings about runaway machines had come from outsiders, from philosophers and novelists and a few eccentric technologists, and the people building the systems had been the optimists, brushing the worries aside as the fantasies of people who did not understand the code. Now the insiders were the worried ones, and they were worried precisely because they did understand the code, or at least understood that they no longer fully did. The letter that failed, the twenty-two-word sentence that gathered the signatures of the men running the labs, the godfather quitting his job to speak freely, all of it pointed the same direction. The people closest to the technology had looked at what they were building and decided the public deserved to be told it might be dangerous.

The harder question, the one the warnings raised but could not answer, was what anyone was supposed to do about it. A statement is not a law. A signature is not a brake. The chief executives who put their names to the extinction sentence went back to work the next morning, and the models kept getting better, and the gap between the gravity of the warning and the velocity of the development only widened. Having declared the danger, the field handed the problem to two institutions historically bad at moving quickly: governments, which would spend the next two years trying to write rules for a technology that changed faster than they could legislate, and corporate governance, which was about to be tested inside the most important company in the field. That test would come first, and it would come fast. Six months after the godfather quit Google to warn the world, a nonprofit board would try to fire the man who had asked the Senate to regulate him, and the world would learn, over a single bewildering weekend, exactly how little the warnings had actually changed who was in charge.