Jensen Huang and Nvidia: From Graphics to the Engine of AI

Summary

Jensen Huang co-founded Nvidia in 1993 with a vision for specialized graphics processors, survived near-bankruptcy in the company’s early years, and built the GPU into the dominant computing platform of the twenty-first century. When deep learning researchers discovered that GPUs could train neural networks orders of magnitude faster than CPUs, Nvidia was ready — because Huang had made the bet, in 2006, to build CUDA, a general-purpose programming platform for GPU computing that almost no one wanted at the time. That bet turned a graphics company into the most valuable semiconductor firm in history and its founder into one of the most consequential engineers of his generation.

Taiwan, Kentucky, and the Road to Silicon Valley

Jen-Hsun Huang, known universally as Jensen, was born on February 17, 1963, in Tainan, Taiwan. When he was nine, his parents sent him and his brother to the United States to stay with relatives, so that the boys could learn English ahead of the family’s own move. Through a miscommunication, the relatives they were sent to live with in Kentucky were effectively strangers. For a year, Jensen and his brother attended a boarding school in Oneida, Kentucky, an institution that had historically served underprivileged rural students, before their parents arrived in the United States and the family settled in Portland, Oregon.

Huang has described the period matter-of-factly. The arrangement was unusual, but he does not appear to have been traumatized by it, and he has said he arrived at American secondary school already independent in a way his peers were not. He attended Aloha High School in Beaverton, Oregon, became the Oregon state junior champion in table tennis, and studied electrical engineering at Oregon State University. He then completed a master’s degree at Stanford.

He began his career at Advanced Micro Devices (AMD) and then moved to LSI Logic, where he worked on chip design. In early 1993, around his thirtieth birthday, he met with Chris Malachowsky and Curtis Priem, engineers he had known from the industry, at a Denny’s restaurant in San Jose, California. The three agreed to start a company focused on graphics. Huang’s account of the founding is specific: the Denny’s on Berryessa Road near Milpitas, a booth in the back. He has returned to it many times since.

NVIDIA: The Name and the Near-Death

The company was incorporated on April 5, 1993. The working name was NV, for “next version,” a somewhat sardonic insider joke about the perpetual promise of the next chip. When they needed a proper name, they searched for words containing the letters NV. NVIDIA, which echoes invidia, the Latin word for envy (an association Huang later noted approvingly), was not taken.

The first product, the NV1, shipped in 1995. It was a multimedia chip with a 3D graphics capability built around a quadratic texture mapping architecture — a technically unusual choice that was incompatible with the direction the industry was moving. Microsoft’s DirectX API standardized on triangular polygon rendering. The NV1 could not efficiently render triangles. The product failed.

The NV2 was cancelled. In 1996 Nvidia had approximately 100 employees, had burned through most of its funding, and faced the prospect of running out of cash before its next product shipped. The Japanese gaming company Sega had contracted Nvidia to produce a chip for the Sega Saturn successor; Nvidia accepted the contract partly to keep the lights on, delivered work that Sega found unsatisfactory, and used the Sega payment to fund an emergency pivot.

The pivot was the RIVA 128, a 3D graphics chip designed around the DirectX standard that shipped in August 1997. It was fast — significantly faster at 3D rendering than competing chips from 3dfx and S3 — and it sold. Within four months, Nvidia had shipped one million units. The company had survived by eighteen months of focused engineering under existential pressure, and the lesson — that speed of execution and willingness to abandon a failing architecture were more valuable than elegance — shaped Huang’s management philosophy permanently.

The GeForce and the GPU

The RIVA TNT (1998) and RIVA TNT2 (1999) continued Nvidia’s ascent in graphics, but the defining moment came with the GeForce 256 in August 1999. Nvidia called it the world’s first GPU — Graphics Processing Unit — coining a term that would eventually name an entire category of computing infrastructure.

The GeForce 256 was the first consumer chip to perform transform and lighting calculations in hardware rather than CPU-assisted software: geometry was transformed from 3D world space to 2D screen space, and lighting effects were computed, entirely on the graphics chip. The CPU was freed from these tasks. The result was dramatically richer 3D graphics at interactive frame rates, and the beginning of the architectural divergence between CPU and GPU that would define the next twenty-five years.
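The two jobs described above, transform and lighting, can be sketched in a few lines of arithmetic. The following is a minimal illustration rather than a description of the GeForce 256’s actual pipeline: the function names, the camera convention (camera at the origin looking down +z), and the choice of simple Lambertian diffuse lighting are all assumptions made for the example.

```python
import math


def project_vertex(vertex, focal_length, width, height):
    """Perspective-project a camera-space point (x, y, z) to pixel coordinates.

    Assumed convention for this sketch: camera at the origin looking down +z.
    """
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must be in front of the camera (z > 0)")
    # Perspective divide: points farther away land closer to the screen centre.
    ndc_x = focal_length * x / z
    ndc_y = focal_length * y / z
    # Map normalized coordinates in [-1, 1] to pixels, with y growing downward.
    pixel_x = (ndc_x + 1.0) * 0.5 * width
    pixel_y = (1.0 - ndc_y) * 0.5 * height
    return pixel_x, pixel_y


def normalize(vector):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


def diffuse_intensity(normal, direction_to_light):
    """Lambertian lighting: the clamped cosine of the angle between the
    surface normal and the direction from the surface to the light."""
    n = normalize(normal)
    l = normalize(direction_to_light)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


# Hypothetical scene: three vertices and one light on the camera's side.
vertices = [(0.0, 0.0, 2.0), (1.0, 1.0, 2.0), (1.0, 1.0, 4.0)]
screen_points = [project_vertex(v, 1.0, 640, 480) for v in vertices]
brightness = diffuse_intensity((0.0, 0.0, -1.0), (0.0, 0.0, -1.0))
```

Every vertex goes through the same small calculation independently of every other vertex, which is why this work could be moved off the CPU and onto dedicated parallel hardware.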

The partnership with Microsoft on DirectX compatibility, formalized through technical collaboration in the late 1990s, was strategically important: as DirectX became the standard API for Windows games, DirectX compatibility became a de facto requirement for graphics hardware, and Nvidia’s DirectX-certified chips met it. Microsoft and Nvidia were not natural allies, and their interests in the consumer space would later conflict, but the standards relationship created a mutual dependency that helped Nvidia dominate the gaming GPU market through the 2000s.

CUDA: The Bet Nobody Wanted

By 2006, Nvidia had a profitable business selling graphics cards to gamers. The GPU architecture had continued to evolve: more programmable, more flexible, able to run small programs (shaders) on many execution units in parallel. Researchers in scientific computing had noticed that this parallel architecture could accelerate numerical simulations, but accessing the hardware required programming through graphics APIs, an awkward workaround that limited adoption.

Huang made the decision to build CUDA (Compute Unified Device Architecture): a programming model and software layer that allowed developers to write general-purpose code for Nvidia GPUs in a C-like language, without any graphics abstraction. CUDA was announced in 2006 and released publicly in 2007.

The bet was expensive and the early returns were modest. CUDA’s initial users were a small community of scientific computing researchers: fluid dynamics simulations, molecular dynamics, financial modeling. The gaming market — Nvidia’s core business — did not care. CUDA required Nvidia to devote significant engineering resources to non-gaming software that made no immediate contribution to revenue.

What CUDA did was ensure that when neural network researchers needed a platform, one existed. The GPU’s massively parallel architecture — thousands of simple cores operating simultaneously — mapped naturally onto the matrix multiplications that dominated deep learning computation. Without CUDA, those researchers would have needed to write shader code, which was technically feasible but effectively prohibitive.

Info

The architectural fit between GPUs and neural networks is not coincidental but neither was it designed. Neural network training requires multiplying large matrices repeatedly — each layer’s weights are a matrix, and the forward pass is a matrix multiplication. GPUs were designed to handle exactly this kind of operation efficiently: large arrays of simple arithmetic in parallel. The same hardware that rendered thousands of triangles per frame could multiply thousands of weight matrices per training step.
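To make the fit concrete, here is a minimal sketch of one layer’s forward pass written as a plain matrix multiplication. It is illustrative only: the function names and the tiny example values are assumptions, and the loop runs sequentially on a CPU. The point is that each output cell is a dot product that depends on no other output cell, so a GPU can hand each one to a separate thread.

```python
def output_element(inputs, weights, row, col):
    """Dot product for a single output cell; independent of all other cells."""
    return sum(inputs[row][k] * weights[k][col] for k in range(len(weights)))


def dense_forward(inputs, weights):
    """Forward pass of one layer: multiply the input batch by the weight matrix."""
    rows, cols = len(inputs), len(weights[0])
    # Every (row, col) pair is an independent task. On a GPU each could be
    # computed by its own thread; here they simply run one after another.
    return [
        [output_element(inputs, weights, r, c) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical data: a batch of two inputs with two features each,
# and a layer that maps two features to three outputs.
batch = [[1.0, 2.0], [3.0, 4.0]]
layer_weights = [[0.5, -1.0, 0.0], [1.5, 2.0, 1.0]]
activations = dense_forward(batch, layer_weights)
# activations == [[3.5, 3.0, 2.0], [7.5, 5.0, 4.0]]
```

A real network repeats this operation for every layer and every training step, with matrices holding thousands of rows and columns instead of two or three, which is where the parallel hardware pays off.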

AlexNet and the Proof of Concept

In 2009, Andrew Ng and his students at Stanford began using CUDA-enabled GPUs to train neural networks, publishing results showing order-of-magnitude speedups over CPU training. The deep learning research community was still small, but this work began circulating through it.

In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto trained AlexNet on two NVIDIA GTX 580 graphics cards — consumer gaming hardware, not specialized research equipment — for about a week. The result, a top-5 error rate of 15.3% on the ImageNet benchmark against a second-place score of 26.2%, ended the debate about whether deep learning could work for real-world visual recognition. The specific mention of the GTX 580 in the paper was noticed inside Nvidia. The AI era had begun, and it was running on gaming cards.

See Fei-Fei Li and ImageNet for the dataset that made AlexNet possible, and Geoffrey Hinton and Deep Learning for the full context of the AlexNet result.

The Tesla Line and the Data Center Business

Following the AlexNet moment, Nvidia moved deliberately to capture the research and data center market it had inadvertently created. The Tesla line (named after Nikola Tesla, not the car company) had existed since 2007 as a professional scientific computing product. After 2012, it became a dedicated AI training platform.

The K40 (2013), K80 (2014), P100 (2016), and V100 (2017) GPUs became the standard infrastructure for AI training at Google, Amazon, Microsoft, and every major research institution. Each generation brought substantially more memory bandwidth and compute throughput; each generation sold at prices unimaginable in the consumer market — the V100 listed at approximately $10,000 per card.

The A100 (2020), built on the Ampere architecture, was designed specifically around AI workloads, with expanded hardware support for the mixed-precision arithmetic that modern neural networks use and a faster generation of the NVLink interconnect for multi-GPU training. The H100 (2022), based on the Hopper architecture, went further: tensor cores optimized for transformer computations, HBM3 memory, and networking designed for clusters of thousands of cards.
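The idea behind mixed-precision arithmetic can be shown with a small CPU-side simulation. This is a sketch under stated assumptions, not how the A100 is programmed: the helper names are hypothetical, half precision is emulated with Python’s `struct` module, and an ordinary Python float stands in for the wider accumulator. It shows why values are stored in a compact format while running sums are kept in a wider one.

```python
import struct


def to_half(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def sum_half_accumulator(values):
    """Keep both the inputs and the running total in half precision."""
    total = 0.0
    for v in values:
        total = to_half(total + to_half(v))
    return total


def sum_wide_accumulator(values):
    """Mixed precision: half-precision inputs, wider running total.

    A Python float (64-bit) stands in for the wider accumulator here.
    """
    total = 0.0
    for v in values:
        total += to_half(v)
    return total


ones = [1.0] * 4096
half_total = sum_half_accumulator(ones)  # stalls at 2048.0
wide_total = sum_wide_accumulator(ones)  # reaches 4096.0
```

Half precision has too few significand bits to represent 2049, so once the all-half sum reaches 2048, adding 1 rounds straight back to 2048 and the total stops growing. Keeping the accumulator in a wider format avoids that loss while the inputs still take half the memory.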

The H100 became the single most sought-after piece of hardware in the technology industry almost immediately after ChatGPT’s launch in November 2022 demonstrated to the general public what large language model training had been building toward.

The Leather Jacket and the $1 Trillion Company

Jensen Huang has worn the same style of black leather jacket at virtually every public appearance since approximately 2000. He has given no particular explanation for this. It has become a symbol of sorts — not of Silicon Valley’s casual clothing culture (which is hoodies and fleece vests) but of something more specific: a personal style maintained with enough consistency to become identity. When Nvidia’s market capitalization crossed $1 trillion in June 2023 — making it the ninth company in history to reach that valuation — photographs showed Huang on stage in the jacket, holding an H100.

The trajectory from that point accelerated in ways that had no precedent in semiconductor history. Demand for H100s outpaced supply by factors of ten or more in 2023; AI companies reported waiting six to twelve months for GPU deliveries. Microsoft, Google, Amazon, and Meta collectively spent tens of billions of dollars on Nvidia hardware in 2023 and 2024. Nvidia’s quarterly revenue grew from approximately $6 billion before ChatGPT to over $22 billion by early 2024.

In June 2024, Nvidia briefly overtook Microsoft and Apple to become the world’s most valuable company by market capitalization, at approximately $3.3 trillion. The moment was widely interpreted as a market statement about where economic value in computing had shifted: from software platforms to the infrastructure that made AI training possible.

Supply Chain and the Geopolitical Dimension

Nvidia’s rise created supply chain and geopolitical problems that Huang navigated with varying success. The company’s chips are designed in Santa Clara and manufactured by TSMC in Taiwan — the world’s most advanced foundry, which also produces chips for Apple, AMD, and others. See Morris Chang and TSMC for the broader context of foundry-model chip production.

The US government’s 2022 export controls on advanced AI chips to China, which specifically targeted the A100 and H100, put Nvidia in a difficult position. China had been a significant revenue source; the restrictions were designed to prevent Chinese AI development from leveraging US-designed chips. Nvidia’s response was to develop downgraded products (the A800 and H800) that met the letter of the export control specifications, until updated rules in 2023 closed that workaround.

The AI chip shortage after ChatGPT also intensified competitive pressure. AMD began competing more aggressively in the data center GPU market. Google developed its own TPU (Tensor Processing Unit) chips. Amazon, Microsoft, and Meta all launched proprietary AI accelerator programs. These efforts have made progress, but Nvidia’s combination of hardware performance, the CUDA software ecosystem, and the accumulated expertise of a decade of AI researchers writing CUDA code has proven difficult to replicate quickly.

Warning

Nvidia’s dominance rests not only on hardware but on software lock-in. CUDA code runs only on Nvidia hardware; the large body of research code, production frameworks, and optimized libraries written in CUDA represents years of accumulated work that would have to be rewritten to run on a competitor’s hardware. This is an intentional result of the original 2006 decision to build a proprietary programming platform rather than contribute to open standards.

The Meaning of the Bet

Jensen Huang has described the CUDA decision as the most important bet Nvidia ever made, and the claim is defensible. A company that had been saved from bankruptcy by fast execution on a graphics chip in 1997 spent years subsidizing a general-purpose computing platform that most GPU purchasers never used — until the platform became the foundational infrastructure of the most transformative technology development of the 2010s and 2020s.

The story is not simply one of prescient foresight. Huang did not predict deep learning in 2006; he built a platform that researchers found useful for multiple scientific applications, and the deep learning application proved to be the important one. The lesson is something more like: investing in general capability, before the specific applications are known, can pay off in ways that targeted investment cannot.

His company is now the critical chokepoint in the supply chain of artificial intelligence: every major AI system in the world, from OpenAI’s GPT models to DeepMind’s AlphaFold, runs on Nvidia hardware or on hardware designed to compete with it. That position was not inevitable. It was built by an immigrant from Taiwan who became a junior table tennis champion in Oregon and learned his defining engineering lesson from a graphics chip that almost destroyed his company.