Part IX · Chapter 38

Open Versus Closed

Mistral carries Europe's flag, the policy fight over whether weights are too dangerous to release reaches Washington, and Chinese open models begin to lead. → Why "open" became the most contested word in AI.

“I personally think we have been on the wrong side of history here and need to figure out a different open source strategy.” — Sam Altman, Reddit AMA, January 31, 2025

On September 27, 2023, a startup that was four months old shipped a frontier-grade language model to the world by posting a link to a torrent.

There was no demo, no launch event, no carefully staged keynote. Mistral, a company founded that spring in Paris by three researchers who had walked out of the two most prestigious AI labs on earth, put a BitTorrent magnet link on X and let people work out the rest. The file at the other end of it was Mistral 7B, a seven-billion-parameter model released under the Apache 2.0 license, free for anyone to download, run, modify, or sell, no application, no API key, no permission asked. A blog post followed, but the magnet link was the message, and everyone who saw it understood the reference. Seven months earlier, Meta’s LLaMA weights had escaped onto exactly this kind of link after a leak the company never intended, and the whole industry had watched a controlled research release become an uncontrollable fact of the internet in about a week. Mistral was doing on purpose, as a brand, what had happened to Meta by accident. The leak was now the marketing.

Arthur Mensch, Mistral’s chief executive, had been at Google DeepMind. His co-founders Timothée Lacroix and Guillaume Lample came from Meta’s FAIR lab, where they had been authors on the original LLaMA paper. They had helped build the model that escaped, and they intended to build the next one in the open from the start, no fence to fail. The magnet link was a statement of identity, and the identity was simple: Mistral would be the company that did the one thing OpenAI would never do.

That a company so young could make such a gesture and have it land said something about how fast the ground had shifted. In the year since ChatGPT, “open” had become the most contested word in artificial intelligence, and the strangest thing about the fight was who had ended up on which side. The most aggressive patron of open AI turned out to be Meta, the company that had spent the prior decade as the public face of closed, extractive, surveillance-funded technology. The man who had hoarded the world’s social graph, Mark Zuckerberg, became for a few years the field’s loudest advocate for giving the crown jewels away. He had not started there. He had been pushed there, by the same leak that Mistral was now imitating, when his lawyers’ careful research release turned into a torrent and Zuckerberg drew the opposite lesson from the one a cautious lawyer would have drawn. The escaped weights had bred an ecosystem of unpaid developers who improved the model, debugged it, and ported it to run on a laptop, all for free. So Meta stopped fighting the openness and embraced it. The next model would be open on purpose.

To understand why, it helps to be precise about what people were actually arguing over, because “open versus closed” was never one fight. It was three, braided together so tightly that the participants often did not notice they were answering different questions.

The first fight was about safety. The camp that worried most about catastrophe held a simple, hard-to-refute point: a released weight file can never be recalled. A closed model behind an API can be patched, rate-limited, monitored, and shut off. An open one is loose forever, and whatever safety training it shipped with can be fine-tuned away in an afternoon by anyone with a few good GPUs. Geoffrey Hinton, by then the field’s most famous defector to the safety side, framed open-sourcing frontier models as roughly equivalent to selling fissile material in hardware stores. Yoshua Bengio said much the same in calmer language. The fear was uplift: that an open frontier model would hand a motivated amateur the missing competence to build a weapon, biological or cyber, that they could not have built alone.

The second fight had nothing to do with catastrophe. It was about power. Even if open models were perfectly safe, closing the frontier would hand a permanent advantage to three or four American companies, and lock everyone else into renting intelligence from them forever, on their terms, at their prices, subject to their content policies and their government’s export rules. Startups would be tenants. Universities would be tenants. Other countries would be tenants. This was the argument Zuckerberg leaned on hardest, and it was the one that traveled best internationally, because it reframed openness as sovereignty. If you were a European founder or a Chinese lab or an Indian ministry, the safety debate was an American luxury and the power debate was your actual problem.

The third fight was about a word. What did “open” even mean? Meta called LLaMA’s successors “open source.” The Open Source Initiative, the small nonprofit that had spent twenty-five years guarding that exact term for software, said flatly that they were not. The license restricted commercial use above a user threshold, carved out whole jurisdictions, and Meta disclosed essentially nothing about the training data. None of that resembled the freedoms the phrase had always promised. Most engineers sidestepped the quarrel with a more honest, less glamorous term: open weights. You could download the model and run it. Whether you were legally and philosophically “free” was a separate matter, and a contested one.

The contest mattered because the people fighting over the word were fighting over legitimacy. And the next two years would put each of these three fights through a stress test that nobody had scripted.

The first stress test of all three came from Paris, where the magnet link had already announced itself.

Mistral was a referendum on how badly Europe wanted an AI champion of its own, and the investors had answered before the company shipped anything. In June 2023, three months before the Mistral 7B torrent, Mistral raised a seed round of 105 million euros, roughly 113 million dollars, led by Lightspeed, at a valuation around 260 million dollars. It was one of the largest seed rounds in European history, raised on a slide deck and the founders’ resumes. The investors were not buying a product. They were buying the idea that the open frontier could have a European address.

The magnet link, when it came that September, was the company collecting on that bet in public. A four-month-old startup had branded itself as the anti-OpenAI by giving its weights to anyone who wanted them. In December 2023 Mistral repeated the gesture with Mixtral 8x7B, a sparse mixture-of-experts model (only a fraction of the model’s parameters run for any given token, a design Chapter 39 explains in full), again via magnet link, and closed a Series A of 415 million dollars, led by Andreessen Horowitz, at a roughly two-billion-dollar valuation. The torrent had become a corporate strategy worth a billion dollars a quarter. Mistral, named for the cold wind that runs down the Rhône valley, had made openness its entire pitch.

Underneath the theater, though, Mistral faced the same arithmetic everyone faced. Training frontier models costs money that giving them away does not return. By February 2024 Mistral had released a closed flagship, Mistral Large, available only through an API, and signed a distribution deal with Microsoft worth around fifteen million euros, which promptly drew EU antitrust attention and a chorus of accusations that the open champion had quietly gone closed. The company that had made magnet links a manifesto now had a paid tier it did not torrent. This was not hypocrisy so much as physics. The open-weights pose was expensive, and at some point someone had to pay for the GPUs.

While Mistral negotiated its way between the poses, Zuckerberg made his the loudest in the industry. On July 23, 2024, Meta released Llama 3.1, whose 405-billion-parameter version was the first openly downloadable model to stand credibly level with the best closed systems. The spec sheet and the training run are Chapter 37’s. The same day, Zuckerberg published his “Open Source AI Is the Path Forward” letter, with its Linux-beats-Unix argument and its swipe at rivals trying to “create God,” which Chapter 37 quotes in full. It was a good line, and it papered over the awkward fact that the same company would soon discover limits to its own evangelism.

For a year, the open camp won the arguments that could be won with words and policy. A week after the Llama 3.1 letter, on July 30, 2024, the U.S. Commerce Department’s NTIA delivered a report it had been ordered to produce on “dual-use foundation models with widely available weights,” which was Washington’s careful way of asking whether open frontier models should be restricted or banned outright. The report’s answer was, in effect, not yet. It found no demonstrated marginal risk severe enough to justify restriction, and recommended that the government build the capacity to monitor the technology rather than try to lock it down. The safety camp’s central fear, catastrophic misuse, had not yet produced a catastrophe anyone could point to, and the NTIA declined to legislate against a hypothetical.

Two months later the open camp won the bigger political battle, in California, over SB 1047, the fight that Chapter 30 tells in full. To the bill’s authors it was a modest guardrail aimed only at the giants. To the open-source community it was an existential threat, because liability that follows a model is fatal to a model you have released into the wild and can no longer control. Yann LeCun, Meta’s chief AI scientist and the most relentless anti-doom voice in the field, mobilized against it, framing the bill as a kill switch for open AI written by people who had confused science fiction with risk assessment. When Governor Newsom vetoed it on September 29, 2024, the open camp had beaten back regulation by arguing, persuasively, that regulation would crush the open alternative and entrench the closed incumbents it claimed to fear.

LeCun’s position underneath all of this was more radical than tactical. He did not merely think open weights were safe enough to release. He thought the entire panic was misdirected, because in his view the large language models everyone was frightened of were not on the road to dangerous intelligence at all. They had no understanding of the physical world, no persistent memory, no capacity to plan, and fearing imminent extinction from them was, in the deflation he had been repeating for years, like fearing overpopulation on Mars. The future, he argued, lay in models that learned how the world worked from video and interaction, not in scaling up text predictors. It was a genuine intellectual disagreement with Hinton and Bengio, his fellow Turing laureates, and it ran straight down the middle of the open-versus-closed divide, because if the models were not dangerous then the safety case for closing them collapsed.

So as 2024 ended, the scoreboard read like a rout. Open weights had a European champion raising billions, an American giant championing it on principle, a federal report blessing it, and a marquee regulatory defeat for its opponents. The settled wisdom held that open models would always trail the closed frontier by a year or so, useful and cheap and a permanent step behind, the generic to the brand name. Then, in the third week of January 2025, that wisdom died in a weekend.

The company that killed it was not supposed to exist.

DeepSeek had been founded in July 2023, in Hangzhou, as a side project of a quantitative hedge fund called High-Flyer, run by a man named Liang Wenfeng. Liang was not a Silicon Valley type. He was a quant who had made his money on automated trading and who had, somewhat eccentrically, used his fund’s profits to stockpile thousands of Nvidia GPUs before the U.S. export controls of 2022 cut China off from the best chips. High-Flyer had reportedly amassed something on the order of ten thousand A100s while they were still buyable, and Liang treated the resulting compute cluster less like a business asset than like a telescope, an instrument for doing fundamental research because he found it interesting. DeepSeek hired young, hired domestically, published its work, and pursued efficiency with the intensity of people who could not simply buy their way to scale because the best hardware was embargoed.

In December 2024, DeepSeek released V3, a mixture-of-experts model with 671 billion total parameters and 37 billion active at any moment. The number that detonated was in the paper: DeepSeek reported a training run that had cost about 5.6 million dollars in compute, on roughly two thousand of Nvidia’s H800 chips, the deliberately throttled, export-compliant version of the H100. Against the ten-figure budgets the American labs were rumored to be spending, 5.6 million dollars sounded like a rounding error, like someone had built a frontier model in a garage.

It is worth being careful with that number, because it became a weapon, and weapons are not accounting. The 5.6 million was the headline; the real program cost vastly more, and the full reckoning belongs to Chapter 39. In the panic that followed, almost no one made the distinction.

On January 20, 2025, DeepSeek released R1, a reasoning model built on top of V3 that produced long chains of internal deliberation before answering, in the style OpenAI had pioneered with o1 a few months earlier. R1 was competitive with o1 on hard math and coding benchmarks. And DeepSeek released it under an MIT license, about as permissive as licenses get, with the weights and a detailed paper, free for anyone on earth to download, run, modify, and sell. This was the thing the settled wisdom had said could not happen: a frontier-class reasoning model, openly licensed, matching the best closed system, shipped from China within months rather than years of the model it rivaled.

For a week the financial markets did not notice. Then the DeepSeek app climbed to number one on Apple’s U.S. App Store, ahead of ChatGPT, and the implication finally landed on Wall Street. If a Chinese lab under export controls could build a model this good for a fraction of the assumed cost and give it away, then the central bet of the entire American AI economy, that the path to dominance ran through buying ever more Nvidia chips, looked suddenly fragile. On Monday, January 27, 2025, the realization erased nearly 600 billion dollars from Nvidia’s market value, the largest one-day loss for a single company in the history of the U.S. stock market (the crash Chapter 40 narrates in full), and it happened because a hedge fund’s research hobby had posted a file to the internet.

The reactions arrived in the register of national emergency. Marc Andreessen, who had spent the prior year fighting AI regulation, called R1 “AI’s Sputnik moment,” invoking the 1957 panic when a Soviet satellite proved American technological supremacy was not a law of nature. President Trump, days into his second term, said DeepSeek “should be a wake-up call for our industries.” Alibaba, not wanting to cede the China narrative to a hedge fund, rushed out an upgraded model, Qwen2.5-Max, within a day or two, claiming wins over DeepSeek’s own V3. The open frontier, which everyone had assumed would be American if it existed at all, had just announced that it was Chinese, and the whole American apparatus reacted as if a strategic asset had been lost.

Four days after the crash, on January 31, 2025, Sam Altman sat for a Reddit AMA. OpenAI, the company that had put the word “open” in its name in 2015 and then closed up so thoroughly that even its model architectures became secrets, was now being lapped on openness by a Chinese startup. Asked about it, Altman conceded the point with unusual bluntness. OpenAI, he said, had been “on the wrong side of history” on open source and needed a different strategy. It was a remarkable admission from the man who, more than anyone, had defined the closed approach and reaped its rewards. The reversal he promised took six months. On August 5, 2025, OpenAI released gpt-oss-120b and gpt-oss-20b under the Apache 2.0 license, its first open-weight language models since GPT-2 in 2019. They were good, and they were also pointedly not OpenAI’s frontier, mid-tier models offered as a peace gesture rather than the crown jewels. OpenAI had rejoined the open camp on its own carefully limited terms.

DeepSeek had rearranged the board, but it had not made open weights safe or simple for the companies that championed them. Meta learned that the hard way in April 2025.

On Saturday, April 5, Meta released Llama 4, a new family built on a mixture-of-experts design, with a small model called Scout, a midsize one called Maverick, and a still-training behemoth literally named Behemoth. The headline specs were extraordinary, including a claimed ten-million-token context window. But the launch was swallowed by a controversy over a benchmark. Meta had submitted a version of Maverick to LMArena, the popular crowd-voted leaderboard where humans compare model outputs blind, and it had done very well. Then researchers noticed that the Maverick on the leaderboard was an “experimental” variant tuned for the kind of chatty, agreeable answers human raters reward, and that it was not the same as the weights Meta had actually released. The model topping the chart was not the model you could download.

It was a small deception with a large meaning, because the entire moral claim of open weights was that you could verify everything yourself, that there were no hidden tricks because the artifact was in your hands. Meta’s head of generative AI, Ahmad Al-Dahle, pushed back that the company “would never train on test sets” and blamed implementation bugs for the discrepancy between the benchmark and the release. But the episode dented the brand Meta had spent two years building. The largest model, Behemoth, was later delayed, and Meta added new license restrictions on multimodal use in the EU. The open champion was looking less like a movement and more like a marketing department.

By the summer, the doctrine itself was wavering. On July 30, 2025, Zuckerberg published his “Personal Superintelligence” memo, quoted in full in Chapter 37, which hedged on whether Meta’s most powerful models should be open at all. From the man who had called openness the path forward, the retreat was unmistakable. Around the same time Meta formed a new unit, Meta Superintelligence Labs, and put it under Alexandr Wang, the founder of the data-labeling company Scale AI, in a deal whose details belong to Chapter 41. Wang was not an open-source partisan. He was a symbol of Meta buying its way toward the closed frontier, and his arrival read, to people inside and outside the company, as the doctrine’s quiet funeral.

The clearest sign of the shift was the man it pushed out. Yann LeCun had been the intellectual conscience of Meta’s open program, the Turing laureate who insisted both that the models were not dangerous and that openness was the only honest way to build them. As Meta reorganized around superintelligence and caution, he left to start his own company around the world-models research he had championed all along, a departure Chapter 41 follows. The symbolism was hard to miss: the loudest voice for open AI inside the biggest patron of open AI was on his way out, just as that patron grew cautious about the thing he believed in.

LeCun was not the only true believer the era chewed up. Stability AI had been the open counterpoint on the image side, the company that in August 2022 released Stable Diffusion as freely downloadable weights, the open answer to OpenAI’s locked-down DALL-E 2. Its founder, Emad Mostaque, had raised money at a billion-dollar valuation on pure open-release evangelism. But the company never found a way to turn free weights into a durable business, and it got buried under copyright lawsuits from Getty Images and a class of artists. Mostaque resigned in March 2024, saying he intended to pursue decentralized AI. Three months later, in June 2024, the company shipped a flagship, Stable Diffusion 3, that was so badly mangled at rendering human anatomy that the community mocked it openly and some platforms considered banning it. The arc was a warning the whole open camp would eventually study: being first and being free was not, by itself, a way to survive.

What endured was quieter and more structural. Through all of it, the place where open models actually lived was Hugging Face, a company founded by three French engineers, Clément Delangue, Julien Chaumond, and Thomas Wolf, that had started as a chatbot and pivoted into becoming the neutral commons of machine learning. Its Hub hosted the weights, the code, and the datasets for nearly every open model anyone shipped, from EleutherAI’s grassroots early efforts through Mistral and Llama and Qwen and DeepSeek, well past a million models by 2024. Hugging Face raised a 235-million-dollar round at a 4.5-billion-dollar valuation in August 2023, backed by nearly every large company in the field at once, which was telling: Google, Amazon, Nvidia, AMD, Intel, IBM, Qualcomm, and Salesforce all wanted a stake in the Switzerland of open AI. Whatever the labs decided about open versus closed, the commons they uploaded to was a business worth billions.

And the definitional fight, the one over the word, got its formal verdict. On October 28, 2024, the Open Source Initiative published the Open Source AI Definition 1.0, the result of a long, contentious process led by its executive director, Stefano Maffulli. The definition required not only the freedoms to use, study, modify, and share a model, but enough information about its training data that someone could in principle recreate it. By that standard, Meta’s Llama, with its restrictive license and its undisclosed training corpus, did not qualify as open source, no matter what Meta called it. Maffulli had spent the prior year refusing to let the term be redefined by the company with the biggest marketing budget, and the 1.0 definition was his line in the sand. It changed nothing about what engineers downloaded. It changed everything about who got to keep the word.

By the spring of 2026, the map looked nothing like the one anyone had drawn in 2023. The openly available frontier was largely Chinese, carried by DeepSeek and by Alibaba’s Qwen, which had quietly become the most fine-tuned and derived model family on Hugging Face, the base that thousands of other models were built on top of. Mistral had survived by becoming a sovereignty play: in September 2025 it raised a 1.7-billion-euro round led by ASML, the Dutch company that makes the extreme-ultraviolet lithography machines without which no advanced chip can be manufactured, the single most strategically important company in Europe. ASML became Mistral’s largest shareholder, at a post-money valuation around 11.7 billion euros, and the message was explicit. Open weights were no longer a hacker’s preference or a researcher’s ethic. They were how a continent or a country avoided being a tenant of American AI.

That was the real arc, and it had inverted while everyone was arguing about safety. The question that had started as a moral one among researchers, whether intelligence should be a public good or a guarded asset, had become a question of national strategy, answered differently by every power according to its position. The United States, holding the closed frontier and the chips, drifted toward caution and control even as its own pioneers conceded they had been on the wrong side of history. Europe bought a champion to keep from being captured. China, locked out of the best hardware, made openness a weapon precisely because it could not win the closed game, and discovered that giving the frontier away was a way to set the terms for everyone else.

Nobody had decided this. There was no summit where the world agreed that the open frontier would be Chinese and the closed one American, no vote, no treaty. It happened the way the LLaMA leak had happened, through a thousand small choices by people optimizing for their own position, until one Monday in January a file on the internet wiped out more market value in a day than most countries produce in a year, and the argument that had been about ethics turned out, all along, to have been about power.