Part VIII · Chapter 35

The Trillion-Dollar Engine

Nvidia becomes the most valuable company on earth by selling the one thing every lab needs, and a Dutch-Taiwanese-Korean supply chain becomes the binding constraint on intelligence. → How compute became the scarce resource of the age.

“I would describe the dinner as me and Elon begging Jensen for GPUs. Please take our money. No, no, take more of it. You’re not taking enough of it, we need you to take more of it, please.” — Larry Ellison, Oracle financial analyst meeting, September 2024

In September 2024, at a financial analyst event that was supposed to be about Oracle’s cloud business, Larry Ellison told Wall Street a story about begging. Oracle’s co-founder was eighty years old and worth roughly two hundred billion dollars, and he wanted the analysts to understand what it now took to get the one component the entire artificial-intelligence industry was built on. He and Elon Musk, he said, had taken Jensen Huang to dinner at a Nobu in Palo Alto. The two of them, among the richest men alive, had spent the evening pleading with the chief executive of Nvidia to sell them more chips. Please take our money, Ellison recalled telling Huang. You’re not taking enough of it.

The scene was meant to be a little funny, the way a billionaire complaining about scarcity is always a little funny. But it described something real. By the time Ellison told that story, the binding constraint on the most consequential technology of the decade was not talent, and it was not ideas, and it was not even money in the ordinary sense. It was a single rectangle of silicon, about the size of a postage stamp, that only one company could supply at scale. Everyone who wanted to build a frontier model needed it. Almost no one could get enough of it. And the man who controlled the supply had become, by some measures, the most powerful person in the industry without writing a line of model code.

Huang had not arrived at that position by accident, and he had not arrived at it recently. The story of how Nvidia became the engine of the AI boom runs back nearly two decades, to a bet that for most of its life looked like a mistake.

Nvidia had been founded in 1993 to make graphics chips for video games, and for years that was what it was: a competent, mid-sized maker of the cards that rendered explosions and racetracks on gaming PCs. A graphics processor lived or died by the same parallel-arithmetic property that had let two gaming cards train AlexNet: it did thousands of identical calculations at once. Drawing a screen full of pixels was exactly that kind of problem. So, it would turn out, was multiplying the enormous matrices of numbers at the heart of a neural network.

In 2006 and 2007 Huang made a decision that puzzled his own investors. He spent heavily to build a software platform called CUDA that would let programmers use Nvidia’s graphics chips for general computation, not just graphics. There was, at the time, almost no demand for this. The customers were a scattering of physicists and oil-and-gas modelers and a handful of academics who wanted cheap parallel horsepower. Nvidia poured money into CUDA year after year while the payoff stayed theoretical. Huang later described it as a bet on a market that did not yet exist.

Inside Nvidia the project had its true believers, but it was a hard thing to defend to people who counted quarters. The company embedded CUDA support across its product line, which raised the cost of every chip it sold to gamers who would never use it, and it staffed a software effort whose customers were a rounding error. Wall Street wanted to know why a graphics company was acting like a supercomputing company. The answer Huang gave, over and over, was that the workloads of the future would be parallel, that the world would eventually need to do a great deal of the same arithmetic at once, and that when it did, the software would already be there. It was a decade-long act of faith, and the payoff lay so far in the future that most observers had stopped watching by the time it arrived.

The bet paid off in a way no one had quite predicted. When Geoffrey Hinton’s students trained their neural network on two gaming GPUs in a Toronto bedroom and crushed the ImageNet image-recognition contest in 2012, they did it on Nvidia hardware, using CUDA. They had not chosen Nvidia for any grand reason; the cards were cheap, available at a consumer electronics store, and CUDA let them program the thing without a doctorate in graphics. That was the whole point of the platform, and it was the moment Huang’s slow bet quietly became the foundation of a field. The deep-learning frameworks that followed, including PyTorch and TensorFlow, were written to target CUDA first. A generation of researchers learned to think in its terms. Nvidia’s libraries for the specific math of neural networks, cuDNN and the rest, became the assumed substrate. By the early 2020s this had hardened into the deepest moat in technology. A competitor could design a faster chip, and several tried, but a faster chip that could not run the software the entire field already used was a curiosity. AMD had its ROCm software stack, Google had built its own Tensor Processing Units, Amazon had Trainium, Microsoft had Maia, Meta had MTIA. Each was real, and each ran into the same wall: the world’s models, tools, and engineers had grown up speaking CUDA, and switching cost time that no one in a race could spare.

Then came the chip that turned the moat into a fortune. At Nvidia’s developer conference in March 2022, Huang announced the Hopper architecture and its flagship, the H100. It carried eighty billion transistors, was manufactured by TSMC on a custom process, and included a feature called the Transformer Engine, a piece of silicon designed specifically to accelerate the architecture that, since 2017, every large language model had been built on. It was a chip purpose-built for the exact workload that was about to consume the world.

Eight months later, ChatGPT arrived, and demand for the H100 stopped being an engineering question and became a brawl. Street prices for a single H100 ran to thirty or forty thousand dollars, and higher on the secondary market, with lead times stretching past half a year. The chip became a kind of currency. Startups raised money partly to buy GPUs; cloud providers advertised how many they had; a company’s seriousness could be measured in its allocation. Founders described phoning Nvidia not to negotiate price but to secure a place in line, and venture capitalists began treating a confirmed GPU allocation as a more meaningful asset than a term sheet. A startup with money but no chips could do nothing; a startup with chips could always raise more money. The ordinary logic of capitalism, in which a buyer with cash sets the terms, had been turned inside out by a shortage of one component. In August 2023 the analyst Dylan Patel of SemiAnalysis gave the divide its name. There were the GPU rich, the labs with tens of thousands of H100s who could train frontier models, and there were the GPU poor, everyone else. It was a brutal framing, and it stuck, because it was true. Intelligence, at the frontier, had become a function of how many of these chips you could assemble in one place and keep fed with electricity.

The market understood what was happening before most of the public did. On May 30, 2023, days after a blowout forecast sent its stock soaring, Nvidia briefly touched a one-trillion-dollar market capitalization, the first chipmaker ever to do so. It had taken Nvidia about twenty-four years as a public company to reach that line. It would take roughly a year to triple it.

The financial results that drove this were not normal. For the fiscal year that ended in late January 2024, Nvidia reported about $61 billion in revenue. In the single quarter that closed that fiscal year, total revenue rose 265 percent from the year before, and data-center revenue rose 409 percent. A year later, for the fiscal year ending in late January 2025, total revenue had more than doubled again to roughly $130 billion, with data-center sales of about $115 billion and gross margins above 75 percent. These are not the margins of a hardware company. They are the margins of a company selling something no one else can supply, to customers who cannot afford to wait.

The buyers obliged the framing with their order books. In January 2024 Mark Zuckerberg said Meta expected to have roughly 350,000 H100s by the end of that year, and the equivalent of about 600,000 H100s in total computing power once its other GPUs were counted. At street prices of thirty thousand dollars or so apiece, the H100s alone came to something on the order of ten billion dollars in silicon, from a single company, in a single year, to train models Meta intended to give away. Microsoft, Google, Amazon, and a lengthening line of well-funded labs were placing orders of similar ambition. Each of those orders flowed, at the bottom, to the same supplier.

In June 2024 Nvidia passed Apple to become the second-most-valuable company in the world, and for a moment on June 18 it edged past Microsoft to be the most valuable company on earth, with a market capitalization above three trillion dollars. The position would change hands several times over the following year as the three giants traded places, but the direction was clear. A company that made the picks and shovels of the AI rush had become more valuable than the companies doing the digging.

Huang had a line he repeated to customers who blanched at the prices, a piece of salesmanship that doubled as a thesis: the more you buy, the more you save. The claim was that a buyer who loaded up on Nvidia’s newest systems would get so much more computation per watt and per dollar that the larger purchase was actually the thrifty one. It was the kind of thing that sounded absurd until you ran the numbers on training a model, at which point a surprising number of chief financial officers nodded along and signed.

The next generation made the case literal. In March 2024 Huang unveiled Blackwell, named for the statistician David Blackwell. Its top chip, the B200, fused two pieces of silicon into one and carried over two hundred billion transistors. But the more important shift was that Nvidia had stopped selling chips and started selling rooms. The GB200 NVL72 packed seventy-two Blackwell GPUs and thirty-six of Nvidia’s own Grace CPUs into a single liquid-cooled rack, wired together so tightly that the rack behaved like one enormous processor. Huang claimed up to thirty times the inference performance of the H100 generation. The unit of sale had become the rack, and soon it would be the whole data center.

There was a strangeness to the economics that even Huang acknowledged on stage. A single Blackwell rack cost millions of dollars, drew enough electricity to power a small neighborhood, and had to be plumbed with liquid coolant because air could no longer carry away the heat. And yet the labs lined up to buy them by the thousand, because the alternative was to fall behind in a race where being a generation late on hardware meant training a worse model more slowly than a competitor. The chips depreciated fast, were obsolete within a few years, and were nonetheless the most coveted capital equipment on the planet. Buyers were spending fortunes on assets that would be scrap before the loans that financed them were repaid, and they were doing it willingly, because the model you could train this year was worth more than the money you saved by waiting.

Blackwell’s ramp became, by Nvidia’s own description, the fastest product launch in its history. In the single quarter ending in late January 2025, the new architecture generated about $11 billion in revenue. The market kept revaluing the company upward in step. On July 9, 2025, Nvidia became the first company ever to reach a four-trillion-dollar market capitalization. Three and a half months later, on October 29, 2025, it crossed a line no firm had ever touched: five trillion. Huang told an audience at the company’s conference in Washington that week that Nvidia held something on the order of half a trillion dollars in cumulative bookings for Blackwell and its coming successor, the Vera Rubin platform, through 2026. Two and a half years after first brushing a trillion, Nvidia was worth five.

What made this more than a story about one lucky chip company was the chain of dependence behind every H100 and every Blackwell rack. Nvidia did not, in fact, make its own chips. It designed them and handed the designs to others, and the others sat at choke points even narrower than Nvidia’s own.

Every advanced Nvidia processor was fabricated by Taiwan Semiconductor Manufacturing Company, on an island that China claimed as its own and that the United States had quietly decided it could not afford to lose. TSMC’s most advanced fabs, in turn, could not pattern their tiniest features without extreme-ultraviolet lithography machines, and those machines were made by exactly one company on the planet: ASML, in the Dutch town of Veldhoven. A single EUV system cost on the order of a couple hundred million dollars, weighed as much as a couple of buses, shipped in dozens of crates, and contained a light source that vaporized droplets of molten tin tens of thousands of times a second to produce light at a wavelength short enough to pattern transistors a few nanometers across. There was no second supplier. If ASML stopped, the leading edge of the entire industry stopped with it.

And a modern AI chip was useless without memory fast enough to feed it. The H100 and Blackwell relied on high-bandwidth memory, stacks of DRAM bonded directly beside the processor, and the supply of that memory was dominated by a small number of firms, with South Korea’s SK Hynix the leading provider of the most advanced grades through this period, alongside Samsung and the American firm Micron. So the true engine of the AI boom was not, in any simple sense, American. It was a Dutch machine patterning a Taiwanese wafer wrapped in Korean memory, designed in California, the whole assembly threaded through a handful of companies that could each, by going dark, halt the others. The most valuable supply chain on earth ran through perhaps four countries and could be drawn on the back of a napkin.

This concentration had become an instrument of statecraft as much as a commercial fact. Beginning in October 2022, the United States restricted the export of the most capable AI chips to China, reasoning that the same hardware training chatbots could train systems with military uses, and Nvidia took to designing deliberately throttled chips for the Chinese market. The full mechanism of those controls, and the loopholes that followed, is Chapter 39’s story. What matters here is the pressure they created: they sent Chinese labs hunting for hoarded supply and pushed them toward squeezing more out of less, a consequence that would arrive in full only later, when a hedge-fund-backed lab in Hangzhou showed how much a GPU-poor team could do with what it had managed to stockpile.

For all the swagger, there was a current of unease running beneath the boom, and Huang’s customers were the ones who felt it. The chips were extraordinary and the demand was real, but the prices and the dependence troubled even the people writing the checks. Ellison’s dinner-table story was funny because it inverted the usual order of things. Oracle had spent decades as the vendor that customers feared, the one whose license audits made enterprises sweat. Now Ellison was the supplicant, and the man across the table held the only thing he could not buy elsewhere at any price.

Nvidia, for its part, had begun to do something that drew more scrutiny than its margins. It was not content to sell chips to the labs; it began investing in them, taking stakes in the very companies whose purchases drove its revenue. Money flowed out from Nvidia and came back as orders for Nvidia hardware, a loop that looked to admirers like a confident bet on the future of its own customers and to skeptics like a way of manufacturing the demand it then reported. Whether that loop was a flywheel or a snake eating its tail was a question that would dominate the next phase of the story, as the figures stopped being measured in chips and started being measured in gigawatts and in commitments that dwarfed any company’s revenue.

That phase belonged to the financiers and the power companies. But the shape of it had been set on the silicon. A single American company, on the back of a software bet its own investors had once doubted, had made itself the toll booth of the age, and a chain of firms most people had never heard of had become the thing that the entire pursuit of machine intelligence now rested on. The scarcest resource in the world, scarcer than information and scarcer even than talent, had become the ability to do an enormous amount of a very specific kind of arithmetic. Only one company, riding a supply chain four countries long, could sell that at scale, and everyone else, including the richest men alive, was reduced to asking nicely.