The Autonomous Vehicle Race: The Longest Last Mile in Technological History

Summary

The story of self-driving cars is one of genuine, astonishing progress followed by a humbling collision with reality. In 2004, no robot vehicle could navigate a 142-mile desert trail. A decade later, the same technology had driven millions of miles on public roads. By 2015, executives at every major automotive and technology company were promising fully autonomous vehicles by 2020. By 2024, that deadline had quietly expired, and the hardest technical problem in consumer technology remained unsolved. This article traces how a DARPA competition launched a revolution — and why the last ten percent of the problem has proven harder than the first ninety.

The Desert That Humiliated Everyone: DARPA Grand Challenge 2004

On March 13, 2004, fifteen robot vehicles lined up at the start line in Barstow, California. The prize was one million dollars. The task was to navigate 142 miles of Mojave Desert terrain — dirt roads, dry lakebeds, mountain passes — without any human intervention. DARPA, the U.S. Defense Advanced Research Projects Agency, had organized the race as a shot of adrenaline for autonomous vehicle research: the military wanted vehicles that could supply frontline troops without risking drivers’ lives.

None of the vehicles finished.

The winner, if the word can be used, was Carnegie Mellon University’s Sandstorm, a modified Humvee that traveled 7.4 miles before driving into a berm, getting stuck, and spinning its wheels until its tires caught fire. To observers expecting a glimpse of the future, the field looked more like a demolition derby in slow motion. Vehicles drove off the course, got stuck in culverts, collided with their own support structures, or simply stopped moving and sat, confused, in the desert. DARPA director Tony Tether called the results “disappointing.” The Los Angeles Times called them “a fiasco.”

The fiasco mattered. The public failure revealed precisely which technical problems remained unsolved: sensor fusion, real-time terrain mapping, path planning under uncertainty, and the basic challenge of building systems that behaved reliably when conditions deviated from their design parameters. Every team returned home with a specific, concrete list of what had gone wrong. Failure, in this case, was data.

Stanley and the Desert Conquered: 2005

Eighteen months later, the field returned. DARPA had doubled the prize to two million dollars and designed a new course: 132 miles of Nevada desert, including narrow mountain roads and hairpin turns.

Five vehicles finished.

The winner was Stanley, a modified Volkswagen Touareg built by the Stanford Racing Team under the direction of Sebastian Thrun, a German-born computer scientist and AI researcher who had previously worked on mobile robots at Carnegie Mellon. Stanley’s central innovation was its approach to learning: rather than programming terrain rules by hand, the team trained a machine learning system on data from human drivers navigating similar terrain. Stanley did not reason about the desert the way a human planner would; it matched sensor readings to patterns it had learned. It finished the 132-mile course in 6 hours and 53 minutes.

The result sent a signal through the engineering community that would prove decisive: the combination of machine learning, improved LIDAR sensors, and GPS had crossed some threshold. Robot vehicles could navigate unstructured environments at sustained speed; Stanley's time works out to an average of roughly 19 miles per hour over the 132-mile course. The question was no longer whether autonomous vehicles were possible, but how quickly the technology could be extended to ordinary roads, ordinary traffic, and ordinary weather.

Google’s Bet: The Secret Project That Changed Everything

In 2007, DARPA ran a third challenge — the Urban Challenge — set in a simulated city, requiring vehicles to navigate traffic, obey traffic laws, and interact with other vehicles. Carnegie Mellon’s Boss won. Sebastian Thrun’s Stanford team finished second. Both teams had demonstrated something qualitatively different from desert navigation: autonomous behavior in a social environment, where the vehicle’s actions affected and were affected by the behavior of others.

Sebastian Thrun was recruited to Google the following year. In January 2009, Google quietly launched a self-driving car project, later known as the Google Self-Driving Car Project and eventually as Waymo. The project was not announced publicly for nearly two years. When it was, in October 2010, the cars had already driven over 140,000 miles on California roads without incident.

Warning

The SAE International standard defines six levels of driving automation, from Level 0 (no automation) to Level 5 (full automation under all conditions). The distinction between levels is not cosmetic. Level 2 means the car can steer and control speed simultaneously, but the human driver must remain attentive and ready to take over at any moment. Level 4 means the vehicle can handle all driving in a defined geographic area without human intervention. Level 5 means the vehicle can drive anywhere, in any conditions, that a human could drive. Much of the commercial and journalistic discourse around self-driving cars conflated these levels — treating Level 2 systems as if they were proto-Level-5 systems, and treating company marketing projections as engineering roadmaps.
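One way to keep the levels from blurring together is to write them down as data. The sketch below is a minimal illustration in Python, not an official SAE artifact: the class and function names are invented here, and the one-line notes on Levels 1 and 3 summarize the SAE standard rather than anything stated above.

```python
from enum import IntEnum


class SAELevel(IntEnum):
    """SAE driving-automation levels (names are illustrative)."""
    L0 = 0  # no automation
    L1 = 1  # driver assistance: steering or speed control, not both
    L2 = 2  # steering and speed together; human must supervise constantly
    L3 = 3  # system drives in limited conditions; human takes over on request
    L4 = 4  # no human intervention needed, but only inside a defined area
    L5 = 5  # drives anywhere, in any conditions a human could


def human_must_supervise(level: SAELevel) -> bool:
    """True when the human must stay attentive and ready at every moment."""
    return level <= SAELevel.L2


def unrestricted_self_driving(level: SAELevel) -> bool:
    """True only for a vehicle that can drive itself anywhere."""
    return level == SAELevel.L5


for level in SAELevel:
    print(level.name,
          "supervised" if human_must_supervise(level) else "unsupervised",
          "anywhere" if unrestricted_self_driving(level) else "restricted")
```

Seen this way, a Level 2 system and a Level 5 system differ on both questions at once, which is why treating one as an early version of the other is misleading.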

The Google cars used roof-mounted LIDAR arrays, radar, cameras, and GPS to build a real-time three-dimensional model of the world around them. They required detailed pre-surveyed maps of every road they traveled — maps that recorded not just the geometry of the lane markings but the positions of traffic lights, the timing of signals, the locations of crosswalks. This dependency on pre-surveyed maps would become one of the defining constraints of the technology: a system that required perfect prior knowledge of every road it would ever travel was a system that could not, by definition, drive everywhere.

Tesla Autopilot and the Marketing of Autonomy

While Google worked in secrecy on what it framed as a research project, Tesla took a different approach: it shipped.

In October 2014, Tesla began building the Model S with Autopilot Hardware 1.0: a forward-facing camera, ultrasonic sensors, and a forward-looking radar. In October 2015, a software update activated Autopilot: the vehicle could now steer itself within lane markings, maintain speed, and change lanes with driver confirmation on highways. It was, by the SAE definition, a Level 2 system.

Elon Musk described it differently. In October 2015, he told journalists that “the car can do almost everything.” In 2016, he predicted that Tesla would demonstrate “coast-to-coast autonomous driving” by 2017, and that by 2018, a Tesla owner would be able to summon their car from New York to Los Angeles. He described the human in the driver’s seat during Autopilot operation as “a vestigial requirement” that existed only due to regulatory lag, not technical necessity. From late 2016 onward, Tesla sold an additional option under the name “Full Self-Driving,” a label that described a capability the system did not have and would not have for years.

The consequences of this framing were not hypothetical.

The First Fatality: May 7, 2016

On the afternoon of May 7, 2016, Joshua Brown, a forty-year-old former Navy SEAL and technology enthusiast from Canton, Ohio, was driving his Tesla Model S on a divided highway near Williston, Florida. Autopilot was engaged. A tractor-trailer crossed the highway in front of him.

The Autopilot system did not brake.

The Tesla’s camera, blinded by the white side of the truck against a bright sky, failed to detect the vehicle. The radar system, tuned to filter out overhead road structures to prevent false braking, classified the elevated trailer as a bridge and ignored it. The car passed under the trailer at highway speed. Joshua Brown was killed.

The National Highway Traffic Safety Administration investigated and closed the case in January 2017 without identifying a safety defect: the system was, legally, a driver assistance technology that required human supervision, not an autonomous system. Musk continued using the phrase “Full Self-Driving.” Sales continued.

The incident established a pattern that would define the autonomous vehicle industry’s troubled relationship with public trust: a gap between what the technology could do, what it was marketed as doing, and what users actually did with it.

The Long Tail Problem

By 2018, autonomous vehicle developers had accumulated a new vocabulary for the problem they faced. They called it the long tail.

The phrase described a statistical reality: ninety-five percent of driving situations were routine. Highway cruising, urban intersections with clear signals, parking lots, merging on ramps — these could be handled, with sufficient training data and engineering effort, by a well-designed autonomous system. But the remaining five percent — unusual intersections, unexpected pedestrian behavior, construction zones, unusual weather, faded lane markings, ambiguous signals, the near-infinite combinatorial space of things that can go wrong on a road — did not shrink as systems improved. Each edge case that was solved exposed adjacent edge cases that had not yet been encountered.

The fundamental challenge was not that autonomous vehicles were bad at driving. It was that they were excellent at driving in the conditions they had been trained for, and unreliable in conditions they had not. Human drivers are also unreliable (road crashes kill approximately 1.35 million people per year globally), but they bring a capacity for generalization, improvisation, and social negotiation that 2010s machine learning systems fundamentally lacked.

Uber’s Advanced Technologies Group discovered this problem at fatal cost. On March 18, 2018, a self-driving Uber vehicle struck and killed Elaine Herzberg, a 49-year-old woman walking a bicycle across a road in Tempe, Arizona. The vehicle’s sensors detected her six seconds before impact; the software classified her, in sequence, as an unknown object, a vehicle, and a bicycle, cycling between classifications and never triggering a braking response because no classification reached the confidence threshold required. The safety driver was watching a video on her phone. Uber suspended its autonomous vehicle testing program. The program never recovered its former scale.
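The failure mode described here, a label that keeps changing so that no single label ever accumulates enough support to trigger a response, can be shown with a toy example. This is not a reconstruction of Uber's software. The function name, the label strings, and the rule that stands in for a confidence threshold (the same label must persist for several consecutive frames) are all assumptions made for illustration.

```python
def frames_until_brake(labels, stable_frames_required=3):
    """Return the index of the frame at which braking triggers, or None.

    Hypothetical rule: brake only once the same label has been reported
    for `stable_frames_required` consecutive frames.
    """
    streak = 0
    previous = None
    for i, label in enumerate(labels):
        # A changed label resets the streak; a repeated label extends it.
        streak = streak + 1 if label == previous else 1
        previous = label
        if streak >= stable_frames_required:
            return i
    return None


stable = ["bicycle"] * 6
cycling = ["unknown", "vehicle", "bicycle"] * 2

print(frames_until_brake(stable))   # 2
print(frames_until_brake(cycling))  # None
```

In the toy, the object is detected in every frame of both sequences. Only the stability of the label differs, and under this rule that alone decides whether the brake ever fires.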

Waymo One and the Limits of the Operational Domain

Google spun out its self-driving car project as an independent company, Waymo, in December 2016. In December 2018, Waymo launched Waymo One — the first commercial autonomous vehicle service in the United States, offering rides in Chandler, Arizona, a suburb of Phoenix selected for its wide roads, clear lane markings, mild weather, and known street geometry.

The service worked. At launch a trained safety driver still rode behind the wheel; rides with no one in the driver’s seat followed later. The cars navigated intersections, handled pedestrians, and operated reliably within their defined operational domain.

The operational domain was the key phrase. Waymo One in 2018 worked in Chandler, Arizona. It did not work in rain, in snow, in Boston, in San Francisco’s densest neighborhoods, or on roads that had not been pre-surveyed. Expansion was slow, careful, and expensive. By 2023, Waymo had expanded to San Francisco, Phoenix, and Los Angeles — real, complex urban environments, a genuine achievement — but the total geographic coverage remained a small fraction of the roadway network that the most optimistic 2016 projections had imagined covering by 2020.

Info

The Anthony Levandowski affair illustrated the commercial stakes of self-driving technology. Levandowski had been one of the key engineers behind Google’s early self-driving cars, having built the first LIDAR systems for the project. In 2016, he left Google to found a startup, Otto, which was almost immediately acquired by Uber. In 2017, Waymo sued Uber, alleging that Levandowski had stolen 14,000 confidential files before departing. Uber settled for $245 million in equity. Levandowski later pleaded guilty to one count of trade secret theft and was sentenced to eighteen months in prison; President Trump pardoned him in January 2021. The case was a reminder that in technology races defined by training data and engineering insight, intellectual property is the battlefield.

The Optimistic Projections and Their Collapse

Between 2015 and 2017, the gap between projected and actual capability was at its widest. A partial list of public predictions made during this period:

Elon Musk (2016): “In approximately two years, the car will be able to drive from your home to work without you touching it.”
Lyft (2016): “In five years, most Lyft rides will be in autonomous vehicles.”
Ford (2017): A fully autonomous vehicle with no steering wheel or pedals would be in commercial production by 2021.
General Motors (2018): Commercial autonomous ride-hailing service launching in 2019.
Waymo CEO John Krafcik (2018): Self-driving cars were “the biggest opportunity we’ve seen in transportation in 100 years.”

None of these projections proved accurate. Ford’s 2021 vehicle never materialized; the program was restructured multiple times. GM’s Cruise launched a commercial service in San Francisco in 2022, then halted it in October 2023 after one of its vehicles ran over a pedestrian who had already been struck by a human driver and, critically, dragged the pedestrian 20 feet before stopping. The California DMV suspended Cruise’s permits. GM wrote down billions of dollars invested in the program.

The projections had failed not because the engineers were dishonest — many of them believed what they said — but because they had underestimated the depth of the long tail. Progress in machine learning was rapid and real, but it was asymptotic: each doubling of training data and compute produced a smaller increment of capability improvement. The scenarios that remained dangerous were precisely the scenarios that, by definition, appeared least often in training data.
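The diminishing return from each doubling can be made concrete with a toy model. The sketch below assumes, purely for illustration, that driving situations fall into types whose frequencies follow a power law, and it computes the expected share of driving that belongs to a type never seen in training. The function name, the number of types, and the exponent are all assumptions; nothing here is fitted to real driving data.

```python
def unseen_share(num_samples, num_types=20_000, exponent=1.5):
    """Expected share of situations whose type never appeared in training.

    Type k (1 = most common) has probability proportional to k ** -exponent.
    A type with probability p is missing from n independent samples with
    probability (1 - p) ** n, so the expected unseen share is the sum of
    p * (1 - p) ** n over all types.
    """
    weights = [k ** -exponent for k in range(1, num_types + 1)]
    total = sum(weights)
    return sum((w / total) * (1.0 - w / total) ** num_samples for w in weights)


sizes = [1_000 * 2 ** i for i in range(8)]  # 1,000 doubling up to 128,000
shares = [unseen_share(n) for n in sizes]
gains = [before - after for before, after in zip(shares, shares[1:])]

for n, share in zip(sizes, shares):
    print(f"{n:>7} training samples: {share:.2%} of situations are a never-seen type")
print("improvement from each doubling:", [f"{g:.2%}" for g in gains])
```

Under these assumptions the unseen share keeps falling, but each doubling of training data removes less of it than the previous one did, and what remains is by construction made up of the rarest types.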

Dead End: The 2020 Autonomy That Never Arrived

The most consequential dead end in autonomous vehicle history was not a specific technology or a specific company: it was a shared, industry-wide set of projections that proved to be wrong by nearly a decade.

The failure mode was one that the history of technology should have predicted. Many ambitious technologies (nuclear power, the Concorde, virtual reality in the 1990s, the Segway) have been accompanied by a period of inflated expectations in which proponents extrapolate from early progress as if the remaining progress would be equally fast and linear. Autonomous vehicles were no different. The jump from zero vehicles completing the 2004 DARPA challenge to five vehicles completing the 2005 challenge was so dramatic that it seemed to suggest a development curve that would reach Level 5 autonomy within a decade.

The suggestion was wrong. What the 2004–2005 jump demonstrated was that the simplest version of the problem — navigating unstructured desert terrain without other vehicles, pedestrians, cyclists, or ambiguous social signals — had become tractable. The more complex version of the problem, which is the actual problem of driving in human society, proved to require not just better sensors and faster processors, but a quality of contextual judgment and generalization that the deep learning architectures of the 2010s did not provide.

By 2024, the honest assessment was roughly this: robotaxis operated by companies like Waymo worked reliably within tightly defined geographic and weather constraints, and were commercially viable in those contexts. The broader vision of a personal vehicle that could drive its owner anywhere, in any conditions, without human supervision — Level 5 autonomy — remained an engineering problem without a clear solution timeline.

This did not mean the research was wasted. The sensors, mapping techniques, and machine learning architectures developed for self-driving cars migrated into industrial robotics, logistics automation, advanced driver assistance systems, and autonomous systems in controlled environments such as mines, ports, and warehouses. The long road to Level 5 produced, along the way, a great deal of technology that found homes in Level 2 and Level 3 applications.

But the cars that would drive themselves anywhere, in any weather, by 2020, did not arrive on schedule. The desert that humiliated the first generation of autonomous vehicles turned out, on closer inspection, to be only the first of many deserts.