Explosion
AlphaGo beats Lee Sedol in Seoul in 2016; Move 37; hundreds of millions watch in China. → The most public demonstration of deep learning's power — and the start of the US–China race.
“It’s not a human move. I’ve never seen a human play this move.” — Fan Hui, on AlphaGo’s 37th move
The 37th move came in the afternoon of March 10, 2016, in a function room on the sixth floor of the Four Seasons hotel in Seoul. Aja Huang, a young DeepMind engineer in a dark suit, looked at the screen of the laptop in front of him, read off the coordinates the machine had chosen, and reached across the board to place a black stone on the fifth line of the right side. Then he sat back and waited, because that was the entire job he had been given. Huang held a doctorate and had spent years on the program that was now telling him where to play, and for the length of the match his role was to be its hands. He had no say in the move. He just put down the stone.
The room did not understand what it had just seen. On the live English-language commentary, Michael Redmond, an American who had reached the top professional rank in Japan and was one of the few Westerners qualified to narrate a match at this level, stopped mid-sentence. He moved the stone on his demonstration board, took it off, put it back, as if checking that he had heard the coordinates right. A shoulder hit on the fifth line was not something a strong player did. The orthodoxy, drilled into every serious student of the game for centuries, was that you played that kind of contact move on the third or fourth line, where it had a relationship to territory and to the edge. The fifth line gave too much away. It was the kind of move a talented amateur might try and a teacher would gently correct. Redmond said he thought it was probably a mistake.
Lee Sedol was not in the room when it happened. He had stepped out for a cigarette. When he came back and saw the stone, he sat down, and then he did not move for a long time. He smiled the tight, unhappy smile of a man who has been handed a problem he does not recognize. Photographs from that minute show him with one hand on his neck and his eyes fixed on the board. He took close to fifteen minutes to answer. Lee was thirty-three years old, the winner of eighteen world titles, a player whose nickname inside the Korean game was the Strong Stone for the violence of his fighting style. He had spent his life since the age of five doing nothing but this, and he could not work out what the machine wanted.
Fan Hui could. Fan was watching from inside the DeepMind camp, an awkward place for him to be, because five months earlier this same program had beaten him five games to nothing and ended, in a quiet way, the idea that he was a serious obstacle to it. Fan was the European champion, a professional of real but not elite strength, and after his defeat DeepMind had hired him to help test the system before Seoul. He had played hundreds of games against it by now. He looked at the 37th move and felt the hair on his arms stand up. It was not a human move, he said later. He had never seen a person play it. And it was beautiful.
By the program’s own internal estimate, the probability that a human professional would have chosen that move was about one in ten thousand. AlphaGo had not played it because it copied a master. It had played it because, after teaching itself the game across millions of self-played games, it had concluded that this was simply where the stone belonged, and that the centuries of accumulated wisdom telling players to keep off the fifth line had a blind spot in this exact position. Hours later, when the fighting reached the part of the board where the stone sat, the move turned out to be the hinge of the whole game. It had been working the entire time. The humans in the room had been the ones who could not see it.
To understand why this stopped the technology world cold, it helps to know what Go had meant inside artificial intelligence for the previous twenty years. Chess had fallen in 1997, when IBM’s Deep Blue beat Garry Kasparov, and the lesson the field took from that was deflating rather than inspiring. Deep Blue won by brute force, by searching some two hundred million positions a second and applying handcrafted rules about which ones were good, and almost no one believed it understood chess in any meaningful sense. Go was supposed to be immune to that approach. The board is nineteen by nineteen, and the number of legal positions is larger than the number of atoms in the observable universe, a comparison that gets repeated so often it has lost its force but happens to be true. You cannot search your way through Go. There are too many branches. Strong human players talk about feeling the shape of a position, about whether a wall of stones looks thick or thin, about influence and aji and other concepts that resist being written down as rules. For decades the best Go programs played at the level of a decent club amateur and got stuck there. The consensus among researchers, repeated confidently in interviews well into the 2010s, was that a machine beating a top human at Go was at least a decade away, possibly much further.
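The comparison with atoms can be checked with a few lines of arithmetic. The sketch below counts every way of filling the 361 points with empty, black, or white, which is only an upper bound because it includes arrangements the rules forbid; the count of strictly legal positions is smaller, on the order of 10^170. The figure of 10^80 atoms is the commonly quoted order of magnitude and is treated here as an assumption, and the variable names are illustrative.

```python
# Upper bound on Go board configurations: each of the 361 points is
# empty, black, or white. This overcounts, since it includes illegal boards.
POINTS = 19 * 19
upper_bound = 3 ** POINTS

# Commonly quoted order of magnitude for atoms in the observable universe
# (an assumption for this comparison, not a precise figure).
ATOMS_ESTIMATE = 10 ** 80

digits = len(str(upper_bound))                          # decimal digits in the bound
ratio_digits = len(str(upper_bound // ATOMS_ESTIMATE))  # digits in configurations per atom

print(f"points on the board: {POINTS}")
print(f"upper bound has {digits} digits")
print(f"configurations per atom has {ratio_digits} digits")
```

Even if every atom in the universe were assigned its own share of configurations to examine, each share would still be a number more than ninety digits long.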
The man who decided to attack it anyway had been pointing his company at games since he founded it. For Demis Hassabis, Go was a clean test of whether a machine could acquire something that looked like intuition, the same instinct that had run from his childhood at the chessboard through DeepMind’s early work teaching networks to play Atari from raw pixels. By 2014 DeepMind belonged to Google, and Google’s compute and money were now behind the question. The team Hassabis assembled around it was led on the research side by David Silver, a reinforcement-learning specialist who had known Hassabis since their student days and had spent years on the unglamorous problem of how a program could learn from the consequences of its own actions rather than from a human telling it the answer.
What Silver’s group built was a marriage of two ideas. One was the deep neural network, the technology that had already swept through image recognition after AlexNet, here trained to look at a Go board and make two kinds of judgment: which moves were worth considering, and who was winning. In the version that played in Seoul, a separate network handled each. The other was a search method called Monte Carlo tree search, which explores possible continuations not exhaustively but by sampling, playing out lines and keeping the ones that lead somewhere good. Neither was new on its own. The leap was using the networks to prune the search down to the handful of moves a strong player would actually weigh, so the program spent its effort where it mattered instead of drowning in branches. And the networks had been trained, after an initial diet of human games, by playing against versions of the program itself, over and over, millions of times, discovering through those games what worked. That self-play was why the 37th move could be both alien and correct. The program had no commitment to human convention. It had only the record of what had won and lost in its own enormous private history of the game.
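The way a move-suggesting function can narrow a sampling search is easier to see on a game far smaller than Go. What follows is a minimal sketch under loud assumptions, not DeepMind's code: the game is a toy (a pile of stones, each player removes one to three, whoever takes the last stone wins), `policy_prior` is a hand-written heuristic standing in for a policy network, `rollout_value` uses random playouts in place of a value network, and the names, the `top_k` pruning, and the constants are all hypothetical choices for illustration.

```python
import math
import random

def legal_moves(pile):
    """Toy game: remove 1-3 stones; whoever takes the last stone wins."""
    return [m for m in (1, 2, 3) if m <= pile]

def policy_prior(pile):
    """Stand-in for a policy network: a probability for each legal move.
    Hypothetical heuristic: favour moves that leave a multiple of four."""
    moves = legal_moves(pile)
    weights = [3.0 if (pile - m) % 4 == 0 else 1.0 for m in moves]
    total = sum(weights)
    return {m: w / total for m, w in zip(moves, weights)}

def rollout_value(pile, rng):
    """Stand-in for a value network: finish the game with random moves.
    Returns +1 if the player to move at `pile` wins, otherwise -1."""
    player = 0
    while pile > 0:
        pile -= rng.choice(legal_moves(pile))
        player ^= 1
    # If the loop ended with player == 1, player 0 took the last stone.
    return 1.0 if player == 1 else -1.0

class Node:
    def __init__(self, pile, prior):
        self.pile = pile
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0  # from the view of the player who moved into this node
        self.children = {}

def puct(parent, child, c_puct):
    """Average result so far plus a prior-weighted exploration bonus."""
    q = child.value_sum / child.visits if child.visits else 0.0
    u = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return q + u

def search(root_pile, simulations=2000, top_k=2, c_puct=1.5, seed=0):
    rng = random.Random(seed)
    root = Node(root_pile, 1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: walk down the tree, always taking the best-scoring child.
        while node.children:
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: puct(parent, ch, c_puct))
            path.append(node)
        # Expansion: keep only the top_k moves the prior rates most highly.
        if node.pile > 0:
            priors = policy_prior(node.pile)
            for m in sorted(priors, key=priors.get, reverse=True)[:top_k]:
                node.children[m] = Node(node.pile - m, priors[m])
        # Evaluation: estimate the leaf for the player about to move there.
        value = rollout_value(node.pile, rng)
        # Backup: flip the sign at each level, since the players alternate.
        for n in reversed(path):
            value = -value
            n.visits += 1
            n.value_sum += value
    # Play the most-visited move at the root.
    return max(root.children, key=lambda m: root.children[m].visits)

if __name__ == "__main__":
    print(search(10))
```

With the defaults, `search(10)` should settle on taking two stones, leaving a multiple of four, which is the known winning strategy in this toy game. The point of the sketch is the expansion step: the search never looks at moves the prior ranks outside the top few, which is the same economy, at vastly smaller scale, that let a Go program ignore most of the board.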
In October 2015 AlphaGo had played Fan Hui in secret, in DeepMind’s London office, and won every game. The result was held back until a paper appeared in the journal Nature in January 2016, and even then much of the Go world was skeptical. Fan Hui was a fine player but not in the top tier, and beating him was a long way from beating a player like Lee Sedol. The gap between a strong professional and a world champion in Go is wide and felt, by those inside it, to be qualitative. So when DeepMind challenged Lee to a five-game match in Seoul, with a million dollars on the line, the betting among professionals was lopsided. Lee himself predicted he would win five to nothing, or perhaps four to one if the machine surprised him. He was relaxed about it in the days before. He was, after all, the Strong Stone.
He lost the first game. He had come out playing carefully, testing, and the machine had simply ground him down and forced his resignation. The mood in the Korean press shifted overnight from amusement to unease. Then came Game Two and the 37th move, and the unease became something closer to grief. Watching the broadcast, you could see Lee’s confidence drain out of him in real time. He resigned that game too, and afterward, at the press conference, he said he was speechless. There was a national quality to the disappointment. South Korea takes Go seriously in a way that is hard to convey to people who think of it as an obscure pastime; Lee was a household name, and his struggle was being treated as a contest on behalf of human beings in general.
Game Three he also lost, and the match was decided: AlphaGo had won the best-of-five with two games to spare. At the press conference Lee apologized. He said he had never felt so much pressure, and that the loss was his own failing rather than humanity’s. Whatever happened next was, in the cold accounting of the prize, an exhibition. Which is what makes Game Four the strangest and most human chapter of the whole week.
On March 13, with nothing left to win, Lee played the fourth game loose and aggressive, the way he played when he had stopped trying to be careful. Deep into the middle game, with the position looking lost, he found a move that has its own legend now. It was the 78th move of the game, a white wedge dropped into the gap between two black stones in the center of the board, a play so unusual that AlphaGo, by its own internal estimate, had judged the chance of a human making it at about one in ten thousand, the very same figure that had described its own 37th move two games earlier. The symmetry was almost too neat to be real. The Korean commentators called it the hand of God. After Lee played it, the machine’s responses began to wobble. AlphaGo’s evaluation of its own position swung, and it answered the wedge with moves that looked, for the first time in the match, confused. About a hundred moves later it resigned. Lee Sedol had won a game. He grinned. The room, which had watched a man lose three games to a machine, erupted as if the score were the other way around.
Lee lost the fifth and final game, and the match ended four to one. The result that the world remembered was not the four, though, but the one, and the two moves around which the whole thing turned, one made by the machine and one made by the man, each a play the other side would have called impossible. For a brief moment in the spring of 2016 the relationship between human and machine intelligence looked less like a defeat than like a conversation neither party had expected to be able to have.
The argument about what it all meant began before the stones were back in their bowls. To the people who had been warning, in the previous two years, that artificial intelligence was a danger precisely because it might exceed human ability faster than anyone expected, AlphaGo was a data point and a vindication. To the engineers, it was a narrow result that they were careful not to oversell; AlphaGo could play Go and do nothing else, could not tie its own shoes, had no idea it was playing a game at all. Hassabis and Silver said as much, repeatedly, in the press conferences. The system was a demonstration that a single set of techniques, deep networks plus reinforcement learning plus search, could reach superhuman performance in a domain that had been thought to require human intuition, and that was genuinely significant, but it was not a thinking being. The honest version of the achievement was modest in its claims and enormous in its implications, and most of the coverage kept only the second half.
What almost no one in the Seoul function room registered at the time was that the audience that would matter most was not in Korea, or in London, or in Silicon Valley. It was in China.
The match was broadcast and streamed across China, and the numbers, while hard to pin down precisely, were large by any measure. The audiences cited at the time ran into the tens of millions inside China and past a hundred million worldwide over the course of the week, which put AlphaGo, briefly, among the most-watched events in the long history of the game it was beating. Go is not a Korean game by origin; it is Chinese, more than two thousand years old, woven into the country’s idea of itself as a civilization of scholars and strategists. To watch a Western company’s software take the game apart, using a champion from a neighboring country as the stand-in for humanity, landed in China with a particular weight.
What the match set in motion there—the prodigy Ke Jie, the “Sputnik moment” framing, and Beijing’s national AI plan the following year—is the subject of Chapter 14. What matters here is only that the demonstration crossed the Pacific and was read, by the people who set national priorities, as a signal that the contest had begun.
There is a temptation, looking back, to draw a single clean line from a black stone on the fifth line straight to a national policy document, and the line is not clean. China’s ambitions had many sources, and a government does not reorganize its research priorities because of one Go match. But the people who lived through that week, on both sides of the Pacific, tend to mark it as the moment the contest became real to the public and to the politicians. Before Seoul, artificial intelligence was a thing that labs argued about and marketing departments oversold. After Seoul, it was a thing that two governments understood themselves to be racing over, and the race that would consume the next decade had, in front of an audience of hundreds of millions, fired its starting gun.
The match also left a smaller and stranger legacy that the larger story tends to bury. Lee Sedol, the man who had been chosen to defend the species and had won exactly one game out of five, did not take the defeat as a curiosity. He took it personally, and over the following years he concluded that the thing he had given his life to had been permanently diminished by what he had seen. In 2019 he retired from professional Go, and he said plainly why. There was an entity, he said, that could not be defeated. The point of mastering a game is partly the belief that mastery is possible and that someone can be the best at it. Lee had spent a week in a hotel function room learning, move by move, that this was no longer true, and the lesson did not leave him.
AlphaGo, for its part, kept getting better and kept needing humans less. A later version, AlphaGo Zero, threw out the human game records entirely and learned the game from nothing but the rules and self-play, and surpassed the version that had beaten Lee in a matter of days. The shoulder hit on the fifth line, the move that had stopped a room full of experts, turned out to be only the first thing the machines had to teach the people who thought they understood the game. It was a spectacular result, and it was also, the engineers kept insisting, a narrow one, a program that could do a single thing perfectly and nothing else. What they could not yet see, and what would become the story of the years immediately ahead, was how quickly the same handful of techniques would stop being narrow at all, and begin swallowing one human field after another.