The Container Revolution

Summary

In March 2013, Solomon Hykes demonstrated Docker in a lightning talk at PyCon. Within two years, Docker had become one of the most rapidly adopted developer tools in computing history, transforming how software was built, shipped, and run. In June 2014, Google open-sourced Kubernetes, a container orchestration system based on its experience running its own containerized workloads at planetary scale. By 2020, Kubernetes had become the operating system for cloud-native software: the substrate on which microservices were deployed, scaled, and managed. The container revolution did not change what software does. It changed where software runs, how reliably it runs there, and who can deploy it.

The Problem Containers Solved

Before containers, deploying software was a process of negotiating with environments. A program written and tested on a developer’s laptop needed to be installed on a production server that might have different versions of system libraries, different configurations, different operating system versions. “It works on my machine” was not a joke but a description of a genuine and pervasive operational problem.

Virtual machines partially solved this by packaging the entire operating system along with the application. A VM included a kernel, system libraries, and the application itself — a complete software environment that would behave identically on any host that could run a hypervisor. VMs were portable but heavy: a VM image typically included gigabytes of OS overhead, and spinning up a new VM took minutes.

Containers used a different mechanism. Rather than virtualizing the hardware (as VMs do), containers shared the host operating system kernel while isolating the application’s view of the file system, process space, network interfaces, and user IDs. This isolation was achieved through Linux kernel features: namespaces (introduced in the Linux kernel in 2002–2008) and cgroups (control groups, contributed by Google to the Linux kernel in 2007).
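On a Linux host, the namespaces a process belongs to are visible as symbolic links under /proc/<pid>/ns. The Python sketch below reads those links for the current process. It assumes a Linux /proc filesystem and returns an empty mapping where none is available; the function name namespace_ids is illustrative only.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_type: identifier} for a process, read from /proc.

    Assumes a Linux /proc filesystem; returns {} when it is unavailable
    or the process cannot be inspected.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            ids[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        return {}
    return ids


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"{name:20} {ident}")
```

Two processes inside the same container report the same identifiers. A process in a different container reports different identifiers for each namespace type that was isolated, while both still run on the one host kernel.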

A container included the application and its dependencies (libraries, runtime, configuration) but shared the host’s kernel. Containers were lighter than VMs: a container image might be tens of megabytes rather than gigabytes, and container startup time was typically measured in seconds or less rather than minutes. A single physical host could run hundreds of containers where it might run only tens of VMs.

The underlying kernel features for containers had been available in Linux since the mid-2000s. LXC (Linux Containers), introduced in 2008, exposed these features as a user-space API. What was missing was a tool that made containers easy to create, share, and use.

Docker: The Developer Experience

Solomon Hykes founded dotCloud in 2008 as a platform-as-a-service (PaaS) company. Like other PaaS companies of the era, dotCloud ran customer applications in isolated environments and had built internal tooling to create and manage those environments. That internal tooling became Docker.

Hykes’s 2013 PyCon demonstration was five minutes long. He showed how Docker made it trivial to package an application into a container, run that container locally, and push it to a registry from which anyone with Docker installed could pull and run the identical container. The demonstration was technically simple; the implications were profound.

Docker solved three problems simultaneously:

  1. Build: A Dockerfile was a text file describing how to construct a container image — starting from a base image, adding files, installing packages, setting environment variables. Any developer could read a Dockerfile and understand exactly what the container contained.

  2. Ship: Docker images could be pushed to a registry (initially Docker Hub) and pulled by any Docker installation anywhere. The container that ran on a developer’s laptop would run identically on a production server.

  3. Run: Running a container required a single command: docker run. The complexity of configuring environments, managing dependencies, and ensuring isolation was encapsulated in the container image.
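The three steps above can be sketched in Python as a function that renders Dockerfile text and a function that lists the matching docker CLI invocations. This is an illustration under assumptions, not a recommended build setup: the application layout (an app.py installed with pip), the image name, and both function names are hypothetical. The commands are returned as argument lists and are not executed.

```python
import json


def render_dockerfile(base_image, packages, env, command):
    """Render Dockerfile text: base image, files, packages, env vars, command."""
    lines = [f"FROM {base_image}", "WORKDIR /app", "COPY . /app"]
    if packages:
        # Assumes a Python application whose dependencies install with pip.
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    for key, value in env.items():
        lines.append(f"ENV {key}={value}")
    # Exec form: CMD takes a JSON array of arguments.
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"


def build_ship_run(image):
    """Return the build, push, and run invocations as argument lists."""
    return [
        ["docker", "build", "-t", image, "."],  # Build: image from ./Dockerfile
        ["docker", "push", image],              # Ship: upload to the registry
        ["docker", "run", "--rm", image],       # Run: start a container
    ]


if __name__ == "__main__":
    print(render_dockerfile("python:3.12-slim", ["flask"],
                            {"PORT": "8080"}, ["python", "app.py"]))
    for argv in build_ship_run("registry.example.com/demo/hello:1.0"):
        print(" ".join(argv))
```

Because the Dockerfile is plain text, it can be reviewed and versioned alongside the application source, which is what made the build step legible to any developer.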

Docker’s GitHub repository became one of the fastest-growing open source projects in history. The number of Docker Hub image pulls grew from zero to 100 million within the first year of public availability. Within two years, Docker was the standard way to package and distribute software in the enterprise, competing with and often displacing traditional deployment methods.

Docker Was Not New Technology

Docker’s underlying technology — Linux namespaces and cgroups — had existed for years. Google had been running containerized workloads internally (using their internal system, Borg) since approximately 2004. What Docker contributed was not new kernel technology but a developer-friendly interface, a content-addressable image format, and a public registry. The innovation was a UX innovation: making existing technology easy enough to use that it became ubiquitous.
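Content addressing means that a blob is stored under a digest of its own bytes, so identical content always resolves to the same key and can be verified on retrieval. The following is a minimal Python sketch of that idea using SHA-256. The BlobStore class is hypothetical and is not Docker's actual storage code.

```python
import hashlib


def digest_of(data: bytes) -> str:
    """Return a digest string of the form 'sha256:<hex>' for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class BlobStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = digest_of(data)
        # Identical content maps to the same key, so it is stored only once.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on read to detect corruption or tampering.
        if digest_of(data) != key:
            raise ValueError("digest mismatch")
        return data

    def __len__(self):
        return len(self._blobs)


if __name__ == "__main__":
    store = BlobStore()
    first = store.put(b"base layer contents")
    second = store.put(b"base layer contents")
    print(first == second, len(store))
```

Two images that share a base layer would reference the same digest in such a store, so the shared layer is stored and transferred once.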

Kubernetes: Orchestrating at Scale

Running a single container was easy with Docker. Running thousands of containers across hundreds of servers — handling failures, load balancing, rolling updates, resource allocation — was an entirely different problem.

Google had been solving this problem internally since approximately 2004 with Borg. Borg managed tens of thousands of containers across Google’s data centers, handling scheduling, failure recovery, resource management, and monitoring. It was among the most sophisticated cluster management systems of its time.

In 2013, several Google engineers — including Joe Beda, Brendan Burns, and Craig McLuckie — began working on an open-source version of Borg’s concepts. They called it Kubernetes (Greek for helmsman or pilot; the same root as Norbert Wiener’s “cybernetics”). Google open-sourced Kubernetes in June 2014.

Kubernetes introduced a declarative model for running containers at scale: rather than issuing imperative commands (“run this container on that server”), operators described the desired state of the system (“I want 10 replicas of this container, distributed across availability zones, with these resource limits”), and Kubernetes continuously reconciled actual state with desired state. If a container died, Kubernetes restarted it. If a server failed, Kubernetes rescheduled the containers to other servers. If traffic increased and an autoscaler was configured, Kubernetes added container replicas.
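The reconciliation idea can be illustrated with a small Python sketch: compare the desired replica count with what is running and compute the actions that close the gap. This is a toy model under stated assumptions, not Kubernetes controller code. The function names and the "web-N" naming scheme are hypothetical.

```python
def reconcile(desired_replicas, running):
    """Return (to_start, to_stop) actions that move `running` toward the desired count.

    `running` is a list of the names of replicas currently alive.
    """
    diff = desired_replicas - len(running)
    if diff > 0:
        return diff, []
    # At or above the target: stop any surplus (here simply the last ones listed).
    return 0, running[desired_replicas:]


def control_loop_step(desired_replicas, running, next_id):
    """Apply one reconciliation pass; return the new running list and id counter."""
    to_start, to_stop = reconcile(desired_replicas, running)
    running = [name for name in running if name not in to_stop]
    for _ in range(to_start):
        running.append(f"web-{next_id}")
        next_id += 1
    return running, next_id


if __name__ == "__main__":
    running, next_id = control_loop_step(3, [], 0)
    print(running)
    running.remove("web-1")  # simulate a replica dying
    running, next_id = control_loop_step(3, running, next_id)
    print(running)
```

The operator never says "start a replacement". The loop observes that two replicas are running where three are wanted and starts one, which is the difference between declaring state and issuing commands.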

The core Kubernetes abstractions:

  • Pod: The smallest deployable unit — one or more containers that share a network namespace and storage.
  • Deployment: A desired state for a set of Pods, including replica count and update strategy.
  • Service: A stable network endpoint that routes traffic to a set of Pods, independent of which Pods are currently running.
  • Namespace: A virtual cluster within a Kubernetes cluster, used for isolation between teams or environments.
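To show how these abstractions relate, the sketch below writes a Deployment and a Service as Python dictionaries shaped like Kubernetes manifests, then models how a Service's label selector picks out Pods. The manifest field names follow the apps/v1 Deployment and v1 Service schemas. The resource names, image, namespace, and the make_pods and endpoints helpers are hypothetical, and real Pod names carry generated suffixes rather than a simple index.

```python
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web", "namespace": "demo"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "example/web:1.0"}]},
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web", "namespace": "demo"},
    "spec": {"selector": {"app": "web"}, "ports": [{"port": 80, "targetPort": 8080}]},
}


def make_pods(deployment):
    """Create one Pod record per replica, copying labels from the Pod template."""
    meta = deployment["metadata"]
    spec = deployment["spec"]
    return [
        {
            "metadata": {
                "name": f"{meta['name']}-{i}",
                "namespace": meta["namespace"],
                "labels": dict(spec["template"]["metadata"]["labels"]),
            }
        }
        for i in range(spec["replicas"])
    ]


def endpoints(service, pods):
    """Names of Pods in the Service's namespace whose labels match its selector."""
    selector = service["spec"]["selector"]
    namespace = service["metadata"]["namespace"]
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod["metadata"]["namespace"] == namespace
        and all(pod["metadata"]["labels"].get(k) == v for k, v in selector.items())
    ]


if __name__ == "__main__":
    print(endpoints(service, make_pods(deployment)))
```

Because the Service matches on labels rather than on Pod names, a replaced Pod is picked up automatically as long as it carries the same labels, which is what keeps the endpoint stable while individual Pods come and go.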

In 2015, the Cloud Native Computing Foundation (CNCF) was established under the Linux Foundation to host Kubernetes and related projects. Google donated Kubernetes to the CNCF, ensuring that no single company controlled the project. The cloud-native ecosystem around the CNCF grew to include hundreds of projects: service meshes (Istio, Linkerd), monitoring (Prometheus, Grafana), packaging (Helm), and service discovery (Consul).

Cloud-Native: The New Software Architecture

The container revolution coincided with and reinforced the microservices architectural pattern: decomposing applications into small, independently deployable services that communicate over network APIs. Microservices and containers are not technically coupled — you can run microservices without containers and containers without microservices — but they emerged together and reinforced each other.

Containers made microservices operationally tractable. Each microservice could be packaged as a container with its own dependencies, deployed independently, scaled independently, and updated without affecting other services. The operational isolation that containers provided aligned naturally with the architectural isolation that microservices required.

Cloud providers adopted Kubernetes rapidly. Google announced its managed service, now called Google Kubernetes Engine (GKE), in late 2014 and made it generally available in 2015. AWS Elastic Kubernetes Service (EKS) and Azure Kubernetes Service (AKS) both reached general availability in 2018. Kubernetes became the default deployment target for cloud-native applications across all three major clouds.

The CNCF’s 2020 survey found that 92% of respondents used containers in production and 83% used Kubernetes in production. By mid-2023, Kubernetes was running in the majority of large enterprise technology organizations and was the de facto standard for deploying cloud-native applications.

The Cost: Operational Complexity

The container revolution improved software deployment reliability and reproducibility. It also significantly increased operational complexity for teams that hadn’t previously needed to think about distributed systems.

Running Kubernetes in production requires understanding networking abstractions (CNI plugins, Service types, Ingress controllers), storage abstractions (PersistentVolumes, StorageClasses), security models (RBAC, Pod Security Standards, network policies), observability (metrics, logs, traces, alerts), and upgrade procedures that can affect running workloads. A team that previously deployed a monolithic application on a few VMs might spend months learning Kubernetes to deploy the same application as microservices in containers.

The platform engineering discipline emerged in response: specialized teams building internal developer platforms (IDPs) that abstracted Kubernetes complexity from application developers, providing simplified deployment interfaces while managing the underlying infrastructure. The tooling ecosystem around this — Backstage (Spotify’s internal developer platform, open-sourced in 2020), Crossplane, Argo CD — created an entire category of infrastructure software.
