Weaponization
Project Maven and the 2018 Google employee revolt over Pentagon work. → Where the ethics-versus-pragmatism fight inside the labs broke into the open.
“We believe that Google should not be in the business of war.” — Opening line of the employee letter to Sundar Pichai, April 2018
The letter went out in early April 2018, and within days it had become the most-signed document in the company’s history. It was addressed to Sundar Pichai, the chief executive of Google, and it ran to a single page. The first sentence did the work of the whole thing. “We believe that Google should not be in the business of war.” Below it came a demand: cancel the contract with the Pentagon, and write a policy stating that neither Google nor its contractors would ever build warfare technology. The signatures accumulated on an internal page where anyone with a Google badge could add a name. Three thousand. Then more. The count kept climbing toward four thousand as the thing ran its course, drawn from a workforce of engineers and designers and researchers who had, by and large, come to the company to build maps and email and search, and who had not understood until that spring that some of them had also been building the perception system for a military drone.
The contract had a name that almost no one outside a few rooms at the Pentagon and a few buildings in Mountain View had heard a year earlier. It was called Project Maven.
Maven began, as these things do, with a memo. On the twenty-sixth of April, 2017, Robert O. Work, the deputy secretary of defense, signed an order establishing what the Department of Defense called the Algorithmic Warfare Cross-Functional Team. Work was a former Marine and a defense intellectual who had spent years warning that the United States was about to lose its technological edge over China and Russia, and that the thing it was losing it in was artificial intelligence. He had watched the same demonstrations everyone else had watched. He had seen the image-recognition benchmarks fall, year after year, since 2012. He understood, better than most generals, what it meant that a neural network could now look at a photograph and name what was in it more reliably than a person could, and he understood that the Pentagon was sitting on a problem the technology was built to solve.
The problem was video. American surveillance drones over Iraq and Syria were generating more footage than any human staff could watch. Analysts sat in darkened rooms staring at gray aerial feeds for hours, looking for trucks, for people, for the patterns that meant something, and they burned out at a rate that worried their commanders. Most of the footage would never be watched by anyone before it was deleted. Work’s memo charged the new team with a narrow first task: use machine learning to scan that drone footage and flag objects of interest, vehicles and buildings and people, so that the analysts could spend their time on the frames that mattered. The team was told to deliver something usable in months, not years. The goal, in the language of the memo, was to put computer-vision algorithms into the hands of warfighters by the end of the calendar year.
To do that, the Pentagon needed people who actually knew how to train computer-vision algorithms, and the people who knew how to do that did not work for the Pentagon. They worked for Google.
The arrangement that followed was not a secret in any formal sense, but it was not announced either. Google would provide its TensorFlow software and the expertise of some of its cloud engineers to help build and refine the object-detection models that would run against the drone video. The work was routed through Google’s cloud division, which under Diane Greene was trying hard to catch Amazon and Microsoft in the business of selling computing to large institutions, and a large institution did not come larger than the Department of Defense. Internal estimates of what the relationship might eventually be worth ran into the hundreds of millions of dollars. The first phase was modest by Google’s standards, a contract reported at around nine million. But it was a foot in a door that led to a building Google very much wanted to be inside, and the people negotiating it knew that the early work was a demonstration project. Prove the technology, and the larger cloud contracts would follow.
The trouble was that some of the people inside Google had spent their careers arguing that the technology should never be pointed in this direction at all.
Among them was Meredith Whittaker. She had joined Google in 2006, founded a research group called Open Research, and over a decade had become one of those figures who exist in every large company: well-known internally, plugged into a dozen mailing lists, trusted by the rank and file in a way the executives were not. She had also, by 2017, grown deeply skeptical of the industry she worked in. With the researcher Kate Crawford she had co-founded the AI Now Institute at New York University, a research center devoted to the social consequences of the systems Silicon Valley was racing to deploy. Whittaker had been thinking about exactly the kind of question Maven raised, what happens when a probabilistic system trained on messy data is handed authority over consequential decisions, long before most of her colleagues knew the contract existed.
When the existence of Maven began to circulate inside Google in late 2017 and early 2018, first as rumor and then as fact, it landed in a workforce that had been told, repeatedly and in many forms, that it was special. Google’s founders had written, in the letter that accompanied the company’s stock-market debut in 2004, that Google was not a conventional company. Its informal motto, for years, had been “Don’t be evil.” Employees had been encouraged to see themselves as participants in the enterprise, to push back, to raise their hands at the Friday all-hands meetings and ask hard questions of the executives at the front of the room. For a long time this had been mostly a matter of culture and perks. Maven turned it into a matter of conscience.
Whittaker did what organizers do, which was to find the others. She posted to the company-wide lists, raised the contract at the Friday all-hands, and gathered the engineers who shared her unease into a conversation that grew week by week from a handful of people into a movement with a name and a demand. She had a gift for translating a technical objection into a moral one without losing either half, and she understood the company’s machinery from the inside: which lists were read by whom, how a rumor became a story, how a story became a problem the executives could not ignore. The petition that landed on Pichai’s desk in April did not appear from nowhere. It was the visible end of months of patient, unglamorous work by people who had decided the question was worth their jobs.
The internal arguments started on the mailing lists, where they always started, and then moved into the open. Engineers who had built image classifiers for Google Photos found themselves looking at the same techniques in a new light. The technical distance between sorting a user’s vacation pictures and identifying a vehicle in a drone feed was, they realized, almost nothing. The same convolutional networks, the same training loops, the same loss functions that had won ImageNet six years earlier were the engine in both cases. What changed was the label on the output, and the label on the output now read, in some indirect but undeniable way, target.
Defenders of the contract inside the company made an argument that was not unreasonable. Maven, as scoped, did not build a weapon. It flagged objects in video for human analysts. There was a person in the loop, always; the system did not fire anything, did not decide anything, did not so much as draw a box around an object without a human reviewing it. The Pentagon was going to buy this capability from someone. Better, the argument went, that it come from a company with Google’s values and Google’s caution than from a defense contractor who would build it without a second thought. And the United States military was not some abstract menace; it was the institution that, for better and worse, underwrote the security under which Google and everyone else got to do their work.
The objectors were unmoved, and their reason was about the next contract more than this one. The technology did not stop where the contract stopped. A system that could find a truck in a drone feed today was a system that could be wired to a targeting decision tomorrow, and the engineers building it would have no say in how it was used once it left their hands. They had seen, in a hundred quiet results, what these systems could do that no one had told them to do. The whole premise of deep learning was that you did not program the behavior; you trained it, and it generalized in ways you could not fully predict. To hand that capability to a military and trust that it would stay inside a carefully worded statement of work was, to the objectors, a category error about how the technology worked.
The fault line ran straight through the company’s own research leadership, and it produced the most personally bruising episode of the whole affair. Fei-Fei Li, who nearly a decade earlier had built ImageNet, the dataset that had made all of this possible, was now Google Cloud’s chief scientist for artificial intelligence. She had spent her career arguing that the field needed more humanity in it; she would later found a center at Stanford devoted to human-centered AI. And in the spring of 2018 her internal emails about how to handle the Maven contract became public, leaked to The New York Times, and they did not read the way her public posture would have predicted.
In one of them, Li counseled her colleagues on how to talk about the project. Avoid, she wrote, any mention or implication of AI. Weaponized AI, she went on, was probably one of the most sensitive topics in the field, and if the press got hold of the idea that Google was helping the Pentagon with AI surveillance it would do huge damage to the company. She suggested leaning on the language of cloud infrastructure instead. The advice was, in one reading, that of an executive managing a communications risk. In another reading, the one that spread across the internet within hours of the story’s publication, it was the person who had done more than almost anyone to bring computer vision into the world advising her company to hide from the public exactly what it was doing with it. Li said afterward that the emails had concerned a specific, narrow product and that she remained committed to ethical AI. The damage she had warned about arrived anyway, and some of it landed on her.
By April the dissent had outgrown the mailing lists. The letter to Pichai gave it a single demand and a single sentence, and the signatures piled up faster than anything in the company’s memory. But a letter could be received and filed. What could not be filed was people leaving, and people began to leave. Over the spring, roughly a dozen employees resigned in protest of the contract. It was a small number against a workforce of tens of thousands, but a loud one, because resignation in protest was a thing that happened at universities and newspapers, not at the company that topped every list of the best places in the world to work. Each departure was a small public refutation of the idea that the perks and the mission and the stock had made these people into the kind of employees who would build anything they were told to build. Outside the company, more than a thousand academics signed a letter of their own backing the dissenters.
The pressure that mattered, in the end, was not the resignations or even the four thousand signatures. It was that the dissent had become a story, and the story had a shape Google could not control. A company whose entire brand rested on being the benevolent face of technology was being described, in headlines, as a contractor for the machinery of lethal force. Whittaker and the other organizers had understood this from the start. They did not need to win a policy argument with the cloud division. They needed only to make the contract more expensive to keep than to cancel, and the currency that mattered here was reputational. The cost was to the company’s image of itself.
On the first of June, 2018, Google’s leadership told employees that the company would not seek to renew the Maven contract when it expired in 2019. The technology would be handed off; the relationship would wind down. It was a retreat dressed as a decision, and everyone involved understood it as such.
Six days later, on the seventh of June, Pichai published a document the dissent had effectively forced into existence. It was titled “AI at Google: Our Principles,” and it was the first time a company of Google’s scale had committed in writing to a set of rules about what it would and would not do with the technology. The principles said that Google would pursue AI only where the benefits substantially outweighed the risks, that it would not build systems that caused overall harm, that it would respect privacy and avoid unfair bias. And in a section that read like a direct answer to the letter, the document listed four applications Google would not pursue at all. Among them were weapons, or technologies whose principal purpose was to cause or directly facilitate injury to people, and surveillance violating internationally accepted norms.
The document was carefully written, and it contained a sentence the objectors noticed immediately. Google would continue, it said, to work with governments and the military in many areas: cybersecurity, training, recruitment, veterans’ healthcare. The door was not closed; it was narrowed. A company that had just promised never to build weapons had also reserved the right to keep selling the Pentagon nearly everything short of one. Whether the line between a weapon and a system that merely finds a target could survive contact with the next contract was precisely the question the principles did not answer, because it could not be answered in a document. It could only be answered, over and over, in rooms where the next deal was on the table.
The episode did not stay contained inside Google, and the same fault lines reappeared at the same company within months over a different project. By the autumn of 2018 a research scientist named Jack Poulson had resigned over Dragonfly, a secret effort to build a censored search engine for the Chinese market, complete with blacklists of banned terms and a mechanism that could tie searches to phone numbers. Poulson had asked his managers the same kind of question the Maven objectors had asked, what is this technology actually for and who gets to decide, and had not gotten an answer he could live with. The dissent that Maven had organized did not dissolve when Google agreed to let the contract lapse. It had become a standing feature of the company, a constituency that would not be managed away.
What the year had revealed was bigger than any one contract or any one company. The techniques that had been refined in the pursuit of harmless-seeming goals, sorting photographs and captioning images and winning a board game, were general. That was the whole point of them; that was why they had won. A system that learned to see did not care what it was looking at. The same network that distinguished a Labrador from a husky on ImageNet could be trained to distinguish a civilian vehicle from a military one, and the only thing standing between those two uses was a decision made by people, often people far removed from the original research, under pressures the researchers never saw.
This was the property the field would come to call dual use, borrowing the term from the worlds of nuclear physics and biological weapons, and it applied to nearly everything the field produced. Face recognition could unlock a phone or identify a protester. The same generative machinery that produced Goodfellow’s phantom faces could illustrate a children’s book or fabricate a video of a head of state saying things he never said, the deepfake capability that was, in 2018, just becoming good enough to alarm people. A system that could write fluent text could draft an email or flood a social network with propaganda in a voice indistinguishable from a human one. None of these capabilities had been built for harm. All of them could be turned to it, and the turning required no new science, only a new intention and a willing supplier.
For the people who had spent their lives building these systems, this was a new kind of burden, and not all of them welcomed it. For decades the field’s defining problem had been that the technology did not work. The researchers in the wilderness years had fought for the right to be taken seriously, had begged for funding, had watched their best ideas dismissed as toys. The whole arc of the story, from the perceptron through the long winters to AlexNet and AlphaGo, had been a fight to prove that the machines could do something. Now the machines could do a great many things. The open question had become what they would be used for, and that was not a question a benchmark could settle, or an algorithm, or a paper at a conference. It was a question about people, about institutions, about power, and the people who understood the technology best were, it turned out, no better equipped to answer it than anyone else. Some were worse, because they had spent their careers believing the work was neutral.
The Google revolt of 2018 was the moment that belief broke in public. It did not break everywhere at once, and it did not break cleanly. Plenty inside the field thought the protesters had been naive, that the military would get the technology regardless, that the only effect of the rebellion was to cede the work to companies with fewer scruples. Amazon and Microsoft kept their defense contracts and said so plainly, and the Pentagon’s appetite for the technology only grew. Google’s victory, such as it was, turned out to be narrow and possibly temporary. But something had shifted, and it could be measured in a way the field understood. For the first time, a critical mass of the engineers who actually built the systems had decided that there were things they would refuse to build, and had made the refusal stick. The decision about what artificial intelligence was for had escaped the laboratory and the boardroom and landed, however briefly, in the hands of the people who made it.
What none of them could do was make the systems they kept building work against a determined human being. The Maven objectors had drawn a line at what the company would create. The harder discovery, waiting in the same year on the same kind of platform, was that even the systems Google and Facebook fully intended to deploy could not reliably do the one job their executives kept promising the public they would do.