
Andrej Karpathy and Modern AI

Summary

Andrej Karpathy is modern AI’s great explainer — and one of its most consequential practitioners. A Slovak-born immigrant kid who landed in Toronto at 15, he did his PhD under Fei-Fei Li at Stanford, co-created CS231n (the deep-learning course that trained a generation), became a founding member of OpenAI at 29, ran Tesla’s Autopilot AI for five years, returned to OpenAI, founded the AI-education startup Eureka Labs, and in 2026 joined Anthropic’s pretraining team. Along the way he wrote “Software 2.0,” personally served as the human baseline that ImageNet models famously surpassed, taught millions of people to build GPT from scratch on YouTube, and accidentally named an era by tweeting the phrase “vibe coding.”

From Bratislava to Hinton’s Classroom

Andrej Karpathy (born October 23, 1986, in Bratislava, then Czechoslovakia) emigrated with his family to Toronto at fifteen. At the University of Toronto (B.Sc. in computer science and physics, 2009) he took courses with Geoffrey Hinton — first contact with neural networks years before they were respectable. After a master’s at the University of British Columbia he joined Stanford, completing his PhD in 2015 under Fei-Fei Li on the intersection of computer vision and natural language: systems that generate sentence-length descriptions of images.

At Stanford he co-created and taught CS231n: Convolutional Neural Networks for Visual Recognition — the university’s first dedicated deep-learning course, launched in 2015 just as the ImageNet revolution made the subject the hottest in computer science. Enrollment roughly quintupled within two years, the lecture notes circulated worldwide, and CS231n became the de-facto on-ramp for the first post-AlexNet generation of practitioners.

His parallel career as a writer began in the same period. The May 2015 blog post “The Unreasonable Effectiveness of Recurrent Neural Networks” — with its char-rnn code generating fake Shakespeare, Wikipedia markup, and almost-compiling C — did more than perhaps any academic paper to make sequence models viscerally understandable (see Schmidhuber, Hochreiter, and LSTM for the architecture it showcased).

OpenAI, Founding Member

In December 2015, Karpathy became a founding research scientist at OpenAI (see Sam Altman and OpenAI), alongside Ilya Sutskever, Greg Brockman, and the rest of the original team. He worked on deep learning and generative models in the lab’s pre-GPT era — and left after a year and a half when Elon Musk recruited him for what looked like the hardest applied-vision problem in the world.

Tesla: Five Years of Autopilot

From June 2017 to July 2022, Karpathy was Tesla’s director of AI, leading the neural networks behind Autopilot and the Full Self-Driving program (see The Autonomous Vehicle Race and Elon Musk and Software-Defined Industry). His tenure defined Tesla’s contrarian technical identity: vision-only perception — culminating in the removal of radar from new cars in 2021 — multi-task “HydraNet” architectures serving dozens of predictions from shared backbones, and above all the data engine: a fleet of hundreds of thousands of customer cars used as a perpetual-motion machine for collecting and auto-labeling the rare cases where the network fails.

His most influential output of the Tesla years was conceptual: the November 2017 essay “Software 2.0”, arguing that neural networks are not a tool within programming but a new kind of programming — where the developer curates datasets and the optimizer “writes” the code, and where version control, testing, and IDEs all need reinventing for weights instead of source. The essay became a standard frame for understanding the ML-engineering discipline that followed.
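The contrast can be made concrete with a toy sketch that is not taken from the essay: the same Celsius-to-Fahrenheit rule, once written by hand and once recovered by an optimizer from curated examples. The function names, the example set, and the learning rate and step count are all illustrative assumptions.

```python
# Software 1.0: a human writes the rule explicitly.
def fahrenheit_v1(celsius):
    return 1.8 * celsius + 32.0


# Software 2.0 (toy version): a human curates examples, and plain
# gradient descent on mean squared error finds the parameters.
def fit_linear(examples, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in examples:
            err = (w * x + b) - y          # prediction minus target
            grad_w += 2.0 * err * x / n    # d(MSE)/dw
            grad_b += 2.0 * err / n        # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical dataset: inputs kept in [-1, 1] so that unnormalized
# gradient descent converges with this learning rate.
examples = [(k / 10.0, fahrenheit_v1(k / 10.0)) for k in range(-10, 11)]
w, b = fit_linear(examples)


def fahrenheit_v2(celsius):
    # The "program" is now the pair of learned numbers (w, b).
    return w * celsius + b
```

In this framing, changing the behavior of `fahrenheit_v2` means editing `examples` and refitting rather than editing source code, which is why the essay argues that tooling has to follow the data and the weights.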

The Educator at Scale

After leaving Tesla, Karpathy turned teaching into his primary output. The YouTube series “Neural Networks: Zero to Hero” (from 2022) builds everything from scratch in plain code: backpropagation via his 100-line autograd engine micrograd, language models via makemore, and a complete GPT in nanoGPT, including a 2-hour video in which he types a working GPT live. Millions watched; nanoGPT became the standard hackable baseline for LLM experimentation, later joined by llm.c, which implements GPT-2 training in pure C/CUDA.
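For readers who have not seen a scalar autograd engine, the idea can be sketched in a few dozen lines. This is a simplified illustration in the spirit of that approach, not a reproduction of micrograd's code: the attribute names are assumptions, and only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers which values it was computed from."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0                   # d(output)/d(this value)
        self._parents = parents
        self._backward = lambda: None     # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph topologically, then apply the chain rule
        # from the output back towards the inputs.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(3.0)
y = Value(-2.0)
z = x * y + x      # analytically: dz/dx = y + 1, dz/dy = x
z.backward()
```

After `z.backward()`, `x.grad` and `y.grad` hold the partial derivatives of `z`. Gradients are accumulated with `+=` because `x` feeds into `z` along two paths, and both contributions must be summed.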

He rejoined OpenAI from February 2023 to February 2024, then founded Eureka Labs (July 2024), a startup building an “AI-native” school whose first course, LLM101n, teaches students to build their own AI. In May 2026 he joined Anthropic’s pretraining team, to build a group using Claude to accelerate pretraining research itself — a closing of the loop he had been predicting for years: AI as the tool that builds AI.

Naming the Era: “Vibe Coding”

In February 2025, Karpathy tweeted about “a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists”, describing himself talking to an LLM assistant and barely reading the diffs. The half-joking coinage escaped immediately, became the generic term for LLM-driven development, and was named Collins Dictionary Word of the Year 2025. Like his earlier “hallucination-adjacent” vocabulary, it illustrates an unusual pattern: Karpathy’s offhand phrases tend to become industry vocabulary, because an enormous fraction of the industry learned the field from him.

Fun Fact: He Personally Was the Human Baseline

The famous claim that ImageNet models “surpassed human performance” in 2015 has a specific human attached: Karpathy was the human. In 2014, wanting a real number for “human-level” error on ImageNet classification, he built himself a labeling interface and ground through hundreds of images, concluding that a trained, motivated human (i.e., himself, after days of practice) achieves roughly 5.1% top-5 error, meaning the correct label was missing from his five guesses about one time in twenty. GoogLeNet stood at 6.7% at the time; he wrote that beating it had been “significantly harder than I first expected” and predicted his record would not stand long. Microsoft’s ResNet duly pushed machine error to 3.57% in 2015. Every “better than human” headline of that era traces back to one researcher patiently distinguishing 120 dog breeds.
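The metric behind these percentages is simple to state in code. The sketch below is a minimal illustration of top-5 error; the function name and the tiny hand-made example data are hypothetical and are not drawn from ImageNet or from Karpathy's experiment.

```python
def top5_error(ranked_guesses, true_labels):
    """Fraction of examples whose true label is not among the first five guesses."""
    if len(ranked_guesses) != len(true_labels):
        raise ValueError("need exactly one list of guesses per true label")
    misses = sum(
        1
        for guesses, truth in zip(ranked_guesses, true_labels)
        if truth not in guesses[:5]      # only the five best guesses count
    )
    return misses / len(true_labels)


# Hypothetical guesses, ordered from most to least confident.
guesses = [
    ["husky", "malamute", "wolf", "samoyed", "akita", "collie"],
    ["tabby", "lynx", "tiger", "ocelot", "leopard"],
    ["beagle", "basset", "foxhound", "harrier", "bloodhound", "whippet"],
    ["teapot", "kettle", "vase", "pitcher", "mug"],
]
labels = ["malamute", "tabby", "whippet", "cup"]

error = top5_error(guesses, labels)
```

Here the first two examples count as correct even though only one of them has the true label in first place; the third misses because the true label is ranked sixth, and the fourth because it never appears. The same rule applies whether the guesses come from a network or from a person working through a labeling interface.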

📚 Sources