The Cloud Computing Era: When the Server Disappeared

Summary

This article traces the history of cloud computing — from the capacity-planning nightmare of the physical server era, through Amazon’s accidental invention of a new industry in 2006, to the containerization revolution and the serverless model. It is the story of how “someone else’s computer” became the dominant infrastructure of the modern world, why three companies came to control most of it, and why the private-cloud alternative — despite billions invested — largely failed to compete.

The Problem Cloud Solved: Buying for the Peak

Before cloud computing, every organization that ran software on the internet owned its servers. This created a structural problem with no clean solution.

Web traffic is not uniform. A news site receives ten times its normal traffic when a major story breaks. A retail site receives fifty times its normal load on the day after Thanksgiving. A startup might receive nothing for months, then be featured in a major publication and receive a million requests in an hour. In each case, the organization had to decide how much hardware to buy — and the answer to that question determined both cost and resilience.

Buying for average load meant the site went down under peak traffic. Buying for peak load meant paying for expensive hardware that sat idle 90% of the time. There was no right answer, only different shapes of wrong.

The dominant solution of the late 1990s and early 2000s was colocation: renting rack space in a shared data center, with managed power and network connectivity, and filling it with servers you owned or leased. Companies like Exodus Communications, Rackspace, and Savvis built enormous facilities and rented floor space. The hardware was still yours to buy, configure, maintain, and eventually discard. The only thing being shared was the building.

Virtualization offered a partial improvement. VMware, founded in 1998 by Diane Greene, Mendel Rosenblum, and others, developed software that allowed a single physical server to run multiple isolated virtual machines, each behaving like an independent computer with its own operating system and applications. Ten applications that had each occupied a dedicated server at 10% CPU utilization could now share one physical machine as ten virtual machines, dramatically improving hardware utilization.

VMware’s hypervisor technology became ubiquitous in enterprise data centers through the 2000s. It solved the utilization problem for organizations that owned their hardware. It did not solve the capacity elasticity problem: when traffic doubled, you still needed more physical hardware, and buying and installing physical hardware took weeks.

Amazon’s Accidental Industry

The origin story of Amazon Web Services is, by now, well documented, and it is still surprising.

Amazon had spent the early 2000s rebuilding its internal infrastructure. The company’s rapid growth in the late 1990s had produced a tangle of interconnected systems that could not scale cleanly. Starting around 2002, under the direction of Jeff Bezos, Amazon re-architected its infrastructure as a set of independent services communicating through well-defined APIs. Each team owned a service; other teams consumed it through the API; no team was allowed to access another team’s data directly.

The discipline was painful. It required decomposing years of accumulated complexity into clean interfaces. But the result was an internal infrastructure that Amazon could operate, scale, and modify piece by piece. And the pieces — storage, compute, databases, messaging queues — turned out to be useful not just to Amazon but to everyone who ran software.

In 2004, Andy Jassy — Bezos’s technical advisor — began developing the concept of selling Amazon’s infrastructure capabilities as external services. The insight was direct: Amazon had built, at scale and under commercial pressure, exactly the kind of elastic, programmable infrastructure that every startup and enterprise needed and could not afford to build themselves.

On March 14, 2006, Amazon launched Amazon Simple Storage Service (S3): object storage billed by the gigabyte, accessible via API, with no hardware to buy or manage. Five months later, in August 2006, Elastic Compute Cloud (EC2) launched in beta: virtual servers on demand, billed by the hour, available within minutes.

The pricing was striking. An EC2 instance cost $0.10 per hour — less than the cost of a cup of coffee for an hour of computing. A startup could run a web service with no upfront hardware investment, scale up when traffic arrived, and scale back down when it left. The capacity-planning problem did not disappear; it became someone else’s problem.
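The difference between buying for the peak and renting by the hour can be made concrete with a little arithmetic. The sketch below is illustrative only: the traffic profile, the per-server capacity, and the assumption that an owned server costs the same $0.10 per hour as the EC2 price quoted above are all invented for the example, not historical data.

```python
import math
from typing import Sequence

HOURLY_RATE = 0.10          # dollars per server-hour (the EC2 price quoted above)
REQUESTS_PER_SERVER = 1000  # hypothetical capacity of one server, in requests per hour


def servers_needed(requests_per_hour: int) -> int:
    """Smallest number of servers that can absorb the given hourly load."""
    return max(1, math.ceil(requests_per_hour / REQUESTS_PER_SERVER))


def fixed_fleet_cost(traffic: Sequence[int]) -> float:
    """Cost of a fleet sized for the busiest hour and kept running all the time."""
    peak_servers = servers_needed(max(traffic))
    return peak_servers * len(traffic) * HOURLY_RATE


def elastic_cost(traffic: Sequence[int]) -> float:
    """Cost when the fleet is resized every hour to match that hour's load."""
    return sum(servers_needed(load) for load in traffic) * HOURLY_RATE


# Hypothetical day: 23 quiet hours and one hour at fifty times the normal load.
day = [1000] * 23 + [50000]

print(f"fixed fleet: ${fixed_fleet_cost(day):.2f}")
print(f"elastic:     ${elastic_cost(day):.2f}")
```

With these made-up numbers the fixed fleet costs $120.00 for the day and the elastic one $7.30. The gap comes entirely from the 23 hours in which 49 of the 50 peak-sized servers have nothing to do.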

“Utility Computing” — An Old Idea Whose Time Had Come

The idea that computing could be sold like electricity (metered, on demand, from a shared grid) dates back to the 1960s. John McCarthy suggested it at MIT in 1961; the term “utility computing” circulated through the 1990s. What made AWS different was not the concept but the execution: a commercial organization with the scale to make the economics work, an API-driven interface that made automation practical, and a credit-card billing model that allowed individual developers to use it without a corporate procurement process. The vision was old; the product was new.

Google’s Paper Trail and the Big Data Ecosystem

Amazon was not the only company building at scale. Google, by the early 2000s, was running one of the largest computing infrastructures ever assembled — thousands of cheap commodity servers, distributed across data centers, handling billions of search queries. Its engineers had solved problems that the academic literature had not yet addressed, and in 2003 and 2004, Google published two papers that reshaped the industry.

“The Google File System” (Ghemawat, Gobioff, and Leung, SOSP 2003) described a distributed file system designed to store petabytes of data across thousands of commodity machines, tolerating hardware failures as routine rather than exceptional events. “MapReduce” (Dean and Ghemawat, OSDI 2004) described a programming model for processing large datasets in parallel across a cluster — splitting work into a “map” phase that processed data in parallel and a “reduce” phase that aggregated results.
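The map and reduce phases are easiest to see in the classic word-count example. What follows is a single-process sketch of the programming model, not Google's implementation: the function names are hypothetical, and the grouping step that a real framework performs across the network between the two phases is reduced here to a dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Collect all emitted values under their key, between the two phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Aggregate all values seen for one key."""
    return key, sum(values)


def word_count(documents: List[str]) -> Dict[str, int]:
    # On a cluster, each document would be mapped on a different machine;
    # here the map calls simply run one after another in a single process.
    emitted = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in group_by_key(emitted).items())


counts = word_count(["the cloud is someone else's computer", "the server disappeared"])
print(counts["the"])  # prints 2
```

Because each map call sees only its own document and each reduce call sees only one key, both phases can be spread across many machines without the functions themselves changing.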

Google did not release the software. It published the ideas. Within two years, Doug Cutting and Mike Cafarella had used the papers to build Hadoop, an open-source implementation of both the distributed file system (HDFS) and MapReduce, with Yahoo as its principal early backer. Hadoop became the foundation of the “big data” industry of the late 2000s and early 2010s, enabling organizations to store and process datasets too large for any single database. The open-source infrastructure that made Hadoop possible is part of the story told in The Open Source Revolution.

Google’s own cloud services — what became Google Cloud Platform — arrived late. Google App Engine launched in 2008; comprehensive infrastructure services comparable to AWS did not exist until 2011–2012. Google had the technology; it lacked the institutional appetite to sell infrastructure to competitors, a reluctance that cost it years of market share.

Microsoft Azure, announced in 2008 and generally available in 2010, was the more significant entry. Microsoft’s enterprise relationships, existing data center infrastructure, and the integration of Azure with Windows Server and SQL Server gave it a credible path to enterprise customers for whom AWS felt like a startup risk. Azure grew to become the second-largest cloud platform, with particular strength in hybrid deployments — organizations running some workloads on-premises and some in the cloud.

By 2015, the cloud market had consolidated into three dominant platforms: AWS, Azure, and Google Cloud. The infrastructure of the modern internet was running on hardware owned by three companies.

The Container Revolution

Virtualization had separated software from the physical hardware. A new abstraction layer — containers — took the next step: separating software from the operating system.

A virtual machine runs a complete operating system — its own kernel, its own system libraries, its own processes. This isolation is thorough but expensive: each VM consumes hundreds of megabytes just to boot, and running fifty VMs on a server means running fifty operating system instances.

A container shares the host operating system’s kernel but isolates its own filesystem, processes, and network stack. It starts in seconds rather than minutes and adds megabytes rather than gigabytes of overhead, so a single server can run thousands of containers.

Linux had supported the underlying mechanisms — cgroups and namespaces — since the late 2000s. What was missing was a usable interface.

Solomon Hykes, a French-American programmer, demonstrated Docker at PyCon in March 2013. Docker packaged the Linux container primitives into a simple command-line tool and an image format that bundled an application with all its dependencies. A developer could build a container image on their laptop, push it to a registry, and run an identical container in production — eliminating the “works on my machine” class of deployment failures.

Docker’s adoption was rapid. By 2014, every major cloud provider supported Docker containers. But running containers at scale — scheduling them across thousands of servers, managing failures, routing traffic — required new infrastructure.

Kubernetes was Google’s answer. Google had been running containers internally since the mid-2000s, under a system called Borg. In 2014, Google released Kubernetes, an open-source container orchestration system derived from Borg’s ideas, and in 2015 it donated the project to the newly formed Cloud Native Computing Foundation (CNCF). The donation was strategic: Google could not win the cloud market with proprietary infrastructure, but an open standard for container orchestration that ran equally well on AWS, Azure, and Google Cloud would commoditize the competition and play to Google’s engineering reputation.

Kubernetes became the standard. By 2020, it was running in the majority of large-scale production deployments. The containerization model it enabled — applications packaged as immutable images, deployed declaratively, scaled automatically — changed how software was built as profoundly as the shift to object-oriented programming had in the 1990s.
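“Deployed declaratively” means the operator states the desired end state and a control loop keeps nudging the observed state toward it. The toy below illustrates that idea only. It is not Kubernetes code and uses none of its API; the `Cluster` class, its methods, and the image name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical stand-in for the observed state of a cluster."""
    running: List[str] = field(default_factory=list)
    next_id: int = 0

    def start(self, image: str) -> None:
        # Launch one more instance of the image under a fresh name.
        self.running.append(f"{image}-{self.next_id}")
        self.next_id += 1

    def stop(self) -> None:
        # Shut down the most recently started instance.
        self.running.pop()


def reconcile(cluster: Cluster, image: str, desired_replicas: int) -> None:
    """Start or stop instances until the observed count matches the declared one."""
    while len(cluster.running) < desired_replicas:
        cluster.start(image)
    while len(cluster.running) > desired_replicas:
        cluster.stop()


cluster = Cluster()
reconcile(cluster, "web:1.0", 3)   # declare: three replicas of this image
cluster.running.pop(0)             # simulate one instance crashing
reconcile(cluster, "web:1.0", 3)   # the loop restores the declared state
print(len(cluster.running))        # prints 3
```

The caller never says “start one more instance after a crash”; it repeats the same declaration, and the loop works out what has to change.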

Serverless: The Logical Extreme

If virtualization abstracted hardware and containers abstracted operating systems, serverless computing abstracted servers entirely.

AWS Lambda, launched in November 2014, allowed developers to deploy individual functions (pieces of code) without provisioning any server, virtual or otherwise. You uploaded a function; AWS ran it in response to events (an HTTP request, a file upload, a database change) and billed you per invocation and for execution time, metered in 100-millisecond increments at launch and later in single milliseconds. If no one called your function, you paid nothing. If a million people called it simultaneously, AWS scaled automatically.
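In practice, the unit of deployment is a single function with a fixed entry-point shape. The sketch below follows the `(event, context)` signature that Lambda's Python runtime uses for handlers. The `name` key in the event and the response layout are assumptions for illustration, since real event payloads depend on the service that triggers the function.

```python
import json


def handler(event, context):
    """Entry point in the (event, context) shape that the platform invokes."""
    # The "name" key is an assumption for this example; real events differ
    # depending on what triggered the function.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }


# Local call for illustration only; in production the platform makes this call.
response = handler({"name": "cloud"}, None)
print(response["body"])
```

Nothing in the function mentions a port, a process, or a machine, which is the sense in which the server has disappeared from the developer's view.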

The name “serverless” was immediately controversial — there were obviously servers involved, just not visible to the developer. But the name captured the model accurately from the developer’s perspective: you wrote functions, you defined when they ran, and infrastructure ceased to be your concern.

Serverless proved powerful for event-driven workloads — image processing, API backends, scheduled jobs — and less suitable for long-running or stateful applications. It also intensified the lock-in question: a function written for AWS Lambda did not trivially run on Azure Functions or Google Cloud Functions, even though all three offered nominally similar services.

Dead End: The Private Cloud

The success of AWS triggered a parallel effort: if public cloud was powerful, could enterprises build their own, on hardware they controlled? The private cloud promised AWS-like elasticity and automation without the data sovereignty concerns, vendor lock-in, or perceived security risks of running on Amazon’s infrastructure.

OpenStack — an open-source cloud platform launched by NASA and Rackspace in 2010, backed by a coalition of hundreds of companies — was the most prominent attempt to build a private-cloud standard. It provided the APIs, the software-defined networking, the storage management: everything needed to build an AWS equivalent in your own data center.

The Operational Complexity Trap

Private cloud failed not because the technology was wrong but because the economics did not work for most organizations. Operating a private cloud required a dedicated team of specialists to install, configure, upgrade, and maintain the platform — the same people Amazon had already hired and amortized across millions of customers. For all but the largest enterprises, the cost of that operational expertise exceeded the cost savings from not paying AWS margins.

OpenStack specifically suffered from a fragmentation problem: each vendor’s distribution was slightly different, documentation was incomplete, upgrades were difficult, and the organizations deploying it often lacked the engineering depth to operate it reliably. By the late 2010s, many organizations that had built private clouds began migrating workloads to public cloud anyway. The private cloud had solved the data sovereignty question; it had not solved the operational burden. Broadcom’s acquisition of VMware, announced in 2022 at $61 billion and completed in 2023, and the price increases that followed further accelerated the migration.

The “hybrid cloud” model — some workloads on-premises, some on public cloud — emerged as the practical compromise for large enterprises. Pure private cloud, as an alternative to public cloud, largely ceased to be a credible strategy.

Legacy: The Infrastructure Question

By the early 2020s, the majority of internet services ran on AWS, Azure, or Google Cloud. The cloud had made it possible for a two-person startup to deploy globally reliable infrastructure in an afternoon. It had also concentrated control of that infrastructure in three American companies.

The economic consequences of the shift from capital expenditure (CapEx — buying servers) to operational expenditure (OpEx — renting compute) were enormous. Startups no longer needed millions in venture capital to buy hardware before writing a line of code. The marginal cost of deploying software approached zero. The barriers to building internet services fell; the barriers to building cloud infrastructure rose.

The sovereignty debate over who should control the infrastructure of a nation’s digital economy became politically significant. Germany and France launched GAIA-X, a framework for European cloud infrastructure, in 2019. France, Germany, and other countries mandated that certain categories of government data not leave European data centers. Whether these efforts would reduce dependence on American cloud providers, or merely create a compliance layer over AWS and Azure European regions, remained an open question.

The question of cloud lock-in — whether moving to the cloud meant trading one form of vendor dependence for another — was never fully resolved. Moving off AWS was possible; it was expensive and time-consuming in ways that the initial migration had not been.

For the open-source software that runs inside the cloud — Linux, Kubernetes, PostgreSQL — see The Open Source Revolution. For the networking infrastructure beneath it, see The Connected World. For the databases the cloud hosts at scale, see The Database Revolution.