Part VI · Chapter 24

The Research Preview

ChatGPT launches as a low-stakes demo and becomes the fastest-growing consumer product in history within two months. → The day AI stopped being a research field and became a global phenomenon.

“ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.” — OpenAI, “Introducing ChatGPT,” November 30, 2022

The decision that reorganized the technology industry was made, in the way these things usually are, by people who did not think they were deciding much of anything. In the middle of November 2022, a small group of researchers at OpenAI’s office on Eighteenth Street in San Francisco’s Mission District concluded that the conversational model they had been polishing for months was good enough to put in front of strangers, and that the fastest way to learn whether it was actually any good was to let strangers use it. There was internal debate about whether to bother. Some argued the model was not ready, that it made things up too confidently, that the public would be unimpressed by a chatbot when chatbots had been a punchline for two decades. The counterargument, which won, was that none of that mattered very much because the stakes were low. It was a demo. They would call it a research preview, ship it quietly, watch how people poked at it, and use what they learned to build the thing they actually cared about, which was the next model. The whole effort came together in roughly two weeks. Internally some of them referred to it, for a while, as Chat with GPT-3.5, a name with all the romance of a directory path.

The model underneath was not new. GPT-3.5, the system being wrapped in the chat interface, had finished its training in early 2022 and had been available to developers through OpenAI’s API for months. What was new was the conversation. For two years the company had been refining reinforcement learning from human feedback, the technique described earlier in this book, by which a raw language model, which only knows how to predict the next plausible word, is taught to behave like an assistant. The raw model was a kind of savant with no manners: ask it a question and it might answer, or it might continue your question with three more questions, or it might produce a paragraph of fluent nonsense, because all it had ever been rewarded for was sounding like the internet. The work of the previous two years had been to civilize that instinct, using humans to show the model how a helpful assistant should respond and to rank its answers until it learned to prefer the ones people actually liked. OpenAI had published the recipe in January 2022 under the name InstructGPT. ChatGPT, the launch post would say plainly, was a sibling of that work.

The recipe had a cost that the launch post did not mention, and that the company would spend the next several weeks not discussing. To teach a model to refuse to describe a murder, or to produce instructions for abuse, you first have to teach it to recognize that material, which means human beings have to read it and label it. In January 2023, the TIME reporter Billy Perrigo published an investigation into where that labor had come from. OpenAI had contracted with an outsourcing firm called Sama, which employed workers in Nairobi to read and categorize some of the most violent and degrading text on the internet, descriptions of child sexual abuse, bestiality, torture, so that a classifier could be trained to keep it away from users. The workers were paid, depending on role and seniority, somewhere between about $1.32 and $2 an hour. The full set of contracts was worth on the order of $200,000. Sama ended the work early, in February 2022, with several employees describing lasting psychological harm. The phrase that sold the technology, human feedback, turned out to have a literal and unglamorous referent. The feedback was human, and these were the humans, and that was what their feedback had cost. It is the kind of detail that does not fit the launch narrative, which is exactly why it belongs in it.

ChatGPT went live on November 30, 2022. The launch post was modest to the point of self-deprecation. It described a system that could “answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests,” and it volunteered, in the same breath, that the thing it was announcing “sometimes writes plausible-sounding but incorrect or nonsensical answers.” There was no press tour, no keynote, no demo on a stage. There was a blog post and a web page where anyone could sign up for free. Sam Altman, OpenAI’s chief executive, posted a link. The expectation inside the building, by the team’s own later accounts, ranged from modest to nonexistent. “We didn’t want to oversell it as a big fundamental advance,” Liam Fedus, one of the scientists who worked on it, told MIT Technology Review a few months later, and the company, by the same account, “had few expectations” going in. One forecast that circulated afterward put the internal guess for the first week somewhere in the low tens of thousands of users. They had built a demo to learn something, and they were about to learn something much larger than they had asked.

What happened next is the part everyone remembers, and it is genuinely strange that it happened at all, because nothing about the product was, on paper, supposed to travel. There was no app. There was no social feed, no like button, no mechanism for one user’s activity to be seen by another. ChatGPT was a blank box on a website. And yet within hours, people who had typed something into the box could not stop showing other people what came back. They posted screenshots. A man asked it to write a biblical verse in the style of the King James Bible explaining how to remove a peanut butter sandwich from a VCR, and it obliged, in cadence, with the gravity of scripture. Software engineers fed it broken code and watched it explain the bug. Students discovered, to their delight and to the dawning alarm of the registrar’s office, that it would write a passable five-paragraph essay on the causes of the First World War in eleven seconds. Lawyers asked it to draft demand letters. Parents asked it to settle bedtime arguments. The thing the chat interface did, that the API underneath had never done for ordinary people, was remove the price of admission. You did not need to know what a token was, or what an API key was, or what a large language model was. You needed to be able to type a sentence and read the answer, and almost everyone alive could do that.

On December 5, 2022, five days after launch, Altman posted that ChatGPT had crossed a million users. He sounded a little stunned in the wording, and the company would later say it had taken about that long to reach the figure, though OpenAI never published a precise audited count and the round number should be read as the founder’s tweet that it is rather than a balance-sheet fact. Whatever the exact total, the curve was unmistakable. The servers buckled almost immediately. Anyone who used ChatGPT in those first weeks remembers the apology screen, the cheerful little message that the system was “at capacity right now,” sometimes accompanied by a limerick the model had been told to generate to soften the rejection. OpenAI had built a research preview and accidentally shipped one of the most demanded consumer services on the internet, and the infrastructure underneath, every query of which ran on expensive graphics processors leased from Microsoft’s data centers, could not keep up with the want. Altman would say, half-joking, that the cost of running it was eye-watering, that each conversation cost the company real money, that they were going to have to monetize somehow because the compute bill was not sustainable. OpenAI had the rare problem of a product that worked far better, and cost far more, than its makers had planned for.

By the end of December 2022, ChatGPT was drawing something like 57 million monthly users, a figure later assembled by analysts from web-traffic data. By late January 2023 it was pulling roughly 13 million unique visitors a day, more than double the December rate. Then came the number that fixed the moment in the historical record. On February 1, 2023, a UBS analyst named Lloyd Walmsley circulated a note, leaning on data from the analytics firm Similarweb, estimating that ChatGPT had reached roughly 100 million monthly active users in January, about two months after launch. In the same note Walmsley reached for a comparison, and the comparison is what made headlines around the world: nothing in the history of consumer internet software had grown this fast. TikTok, the previous record holder, had taken roughly nine months to reach 100 million users. Instagram had taken about two and a half years. ChatGPT had done it in two months. The phrase “fastest-growing consumer application in history” attached itself to the product and stayed there, repeated so often it stopped sounding like a claim and started sounding like a definition. It was an estimate built on third-party traffic modeling rather than an official OpenAI disclosure, a caveat worth keeping, but the order of magnitude was not seriously contested by anyone watching their own analytics.

Underneath the record was a question OpenAI had not quite expected to have to answer, which was who would pay for any of this. The company had a business model on the developer side, where firms paid by the token for access to the underlying models. It had no business model at all for a hundred million people typing into a free box. In early February 2023, it introduced ChatGPT Plus, a $20-a-month subscription that promised priority access during peak hours, faster responses, and first crack at new features. The pitch, baldly, was a way to skip the “at capacity” screen. It was the first time in the saga that the gap between research lab and consumer company became a line item. A subscription tier is the sort of thing organizations create when they have stopped thinking of themselves as a lab running an experiment and started thinking of themselves as a company with customers, and OpenAI crossed that line within ten weeks of a launch it had described as low-stakes.

There is a temptation to treat all of this as a story about a clever product decision, the insight that wrapping a model in a chat box would unlock it. That undersells how little of it was planned. The people who shipped ChatGPT have been admirably honest, after the fact, that they did not see it coming. The model was no better than the one developers had already been using; it was the same model, dressed for company. Chatbots were so old they were a joke, so the interface counted as nothing new. And there had been no campaign behind the launch. It was a Wednesday. What the team had, almost by accident, was the right thing offered to the right number of people with no barrier between them and it, at a moment when the underlying technology had quietly crossed a threshold of usefulness that no one outside a handful of labs had registered. For two years the scaling bet had been playing out in research papers and developer documentation, legible only to people who already knew to look. ChatGPT translated it into a sentence anyone could type, and the translation was the whole event.

The most pointed fact about the launch is that OpenAI was not the only lab that could have done it, and was arguably not even the best positioned to. Google had built dialogue models that impressed its own engineers years earlier; one of them, LaMDA, had been demoed onstage in 2021 and had so unsettled an engineer named Blake Lemoine that he publicly insisted it was sentient. DeepMind, Google’s London lab, had described a careful, citation-producing chatbot of its own, called Sparrow, in a research paper in September 2022, and had pointedly declined to put it in front of the public, on the grounds that a conversational model that could lie fluently was not something a responsible lab released into the wild. Both labs’ caution was sound on its own terms. A model that hallucinates is a liability, and the larger and more profitable your existing business, the more a confident, wrong chatbot threatens to embarrass it. The companies with the most to lose behaved accordingly, which is to say cautiously. OpenAI had almost nothing to lose, a research preview being by definition deniable, and it shipped. The lesson the rest of the industry took from late 2022 had nothing to do with the quality of OpenAI’s model. It was that OpenAI had been willing to find out what the public would do with one, and the willingness was the scarce thing.

It is worth being precise about what changed and what did not, because the next four months would blur the two badly. The technology did not advance on November 30. The same GPT-3.5 that had been answering API calls in October was answering chat queries in December. What changed was the audience, and the audience changed everything downstream of it. A research field that had spent fifteen years measured in benchmark percentages and conference acceptances suddenly had a hundred million people forming opinions about it, most of them with no prior interest in machine learning and no patience for the qualifications that researchers attached to every claim. Teachers were drafting policies about it. Reporters who had never written about AI were filing daily. Investors who had treated the field as a curiosity were rereading their term sheets. The conversation about artificial intelligence stopped being a conversation among the people building it and became a conversation among everyone, conducted in public, at volume, in real time. That is the actual hinge of late 2022, and it was an arrival rather than an advance. The thing had been in the lab. Now it was in the world, and the world does not run on the lab’s schedule.

For the company that built it, the new attention was double-edged in a way that would take months to fully resolve. OpenAI had been founded, in 2015, on a worry about the dangers of advanced AI and a promise to develop it cautiously and openly. Now it was running the most talked-about consumer product on earth, hemorrhaging money on compute, and learning in real time that the safest-seeming way to release a powerful system, quietly, as a humble preview, could produce a more violent public reaction than a thunderous launch ever would. The “research preview” framing had been sincere. It had also become, almost overnight, a fiction that no one could maintain, because a hundred million users do not behave like research subjects and a market does not treat a preview as a preview. Altman would spend the spring of 2023 trying to hold two postures at once, the proud chief executive of a runaway hit and the careful steward warning that the technology frightened even him.

That tension would have stayed an internal matter, a question of how OpenAI managed its own conscience, except that the rest of the industry was watching the same traffic numbers OpenAI was. A hundred million people in two months reads differently depending on where you sit. To a search company that has organized the world’s information for two decades, it is an alarm. To a software giant that has spent a decade looking for a way back into the conversation, it is an opening. The blank box on Eighteenth Street had done more than detonate a research field into a phenomenon. It had told every competitor with a balance sheet exactly how large the prize was, and roughly how little time they had to reach for it. Inside Google, where the warning lights had been blinking since December, someone had already used the words that would define the next phase of the story. They had called it a code red.