Michael Stonebraker and Postgres
Summary
Michael Stonebraker has spent fifty years turning database research papers into running systems and running systems into companies — nine of them and counting. At Berkeley in the 1970s he built Ingres, the first widely available proof that Edgar Codd’s relational model could actually be implemented efficiently; its freely distributed source code trained the engineers who built Sybase and, through it, Microsoft SQL Server. His follow-up, Postgres, pioneered the extensible object-relational database and lives on as PostgreSQL, arguably the most beloved database in the world. At MIT after 2001 he declared war on the very orthodoxy he had helped create — “one size fits all” is over — and spun out specialized engines for columns (Vertica), streams (StreamBase), and main memory (VoltDB). The 2014 Turing Award called his contributions fundamental to “the concepts and practices underlying modern database systems”; he remains the field’s most productive contrarian.
Reading Codd in Berkeley
Michael Stonebraker (born October 11, 1943 in Newburyport, Massachusetts) took an electrical engineering degree at Princeton (1965) and a Ph.D. at Michigan (1971) on a topic he later cheerfully dismissed: applied mathematics, nothing to do with data. Arriving at UC Berkeley as an assistant professor in 1971, he needed a research direction. He found it in Edgar Codd’s papers on the relational model: data as tables, queried declaratively, with the system, not the programmer, figuring out access paths. IBM, Codd’s employer, had researchers building System R to test the idea but was otherwise in no hurry to commercialize it (its hierarchical IMS paid the bills). In 1973, Stonebraker and his colleague Eugene Wong decided Berkeley should build the proof instead.
Ingres: The Relational Model, Running
Ingres (Interactive Graphics and Retrieval System) was developed through the mid-1970s on Unix and DEC minicomputers — an academic project that became one of the first demonstrations that a relational database could be practical and efficient, not just elegant. Its query language QUEL tracked Codd’s relational calculus closely; its implementation worked out machinery the whole industry would need: B-tree storage, query modification for views and integrity, and primitives for access control.
As important as the system was the distribution model: Berkeley shipped Ingres source tapes to anyone for a nominal fee, years before “open source” was a phrase. Perhaps a thousand sites ran it, and its code became the common schooling of the young database industry — Sybase (co-founded by Ingres alumnus Robert Epstein) descended from that lineage, and Sybase’s engine, licensed to Microsoft, became SQL Server. Stonebraker himself co-founded Relational Technology, Inc. (later Ingres Corporation) in 1980 to commercialize the system, entering the market just as Larry Ellison’s Oracle was proving that aggressive selling beats elegant engineering (see The Database Wars).
Postgres: The Database You Can Extend
Back at Berkeley, Stonebraker started over. Postgres — post-Ingres, begun in 1986 — attacked the relational model’s rigidity: classic systems knew integers, floats, and strings, and nothing else. Postgres let users define their own data types, functions, and operators inside the database — geometric types with spatial indexes, time series, later JSON and vectors — plus rules and an unusual no-overwrite storage design. This object-relational blueprint was commercialized as Illustra (its plug-in extensions marketed as “DataBlades”), which Informix bought in 1996, taking Stonebraker along as CTO; the big vendors then copied extensibility into their own engines.
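The extensibility idea can be illustrated with a toy model. The sketch below is not Postgres code or its API: `Catalog`, `create_operator`, `Point`, and the `<->` symbol are hypothetical names chosen for illustration. It only shows the general principle that an engine can look up operators in a catalog keyed by type, so that a user-registered type and operator become usable in a filter without changing the engine itself.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Point:
    """A user-defined type the toy engine knows nothing about in advance."""
    x: float
    y: float


class Catalog:
    """Hypothetical operator catalog keyed by (symbol, left type, right type)."""

    def __init__(self) -> None:
        self._operators: Dict[Tuple[str, type, type], Callable[[Any, Any], Any]] = {}

    def create_operator(self, symbol: str, left: type, right: type,
                        func: Callable[[Any, Any], Any]) -> None:
        # Registering an operator is just adding a catalog entry.
        self._operators[(symbol, left, right)] = func

    def apply(self, symbol: str, a: Any, b: Any) -> Any:
        # The engine resolves the operator from the argument types at call time.
        key = (symbol, type(a), type(b))
        if key not in self._operators:
            raise TypeError(
                f"no operator {symbol!r} for {type(a).__name__}, {type(b).__name__}"
            )
        return self._operators[key](a, b)


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2)


catalog = Catalog()
# A built-in operator, as a classic engine would ship it.
catalog.create_operator("+", int, int, lambda a, b: a + b)
# A user-registered operator on a user-defined type.
catalog.create_operator("<->", Point, Point, distance)

# A filter that uses the new operator, analogous to a WHERE clause.
rows = [("cafe", Point(3.0, 4.0)), ("depot", Point(6.0, 8.0))]
origin = Point(0.0, 0.0)
nearby = [name for name, loc in rows if catalog.apply("<->", loc, origin) < 5.5]
```

In this toy, the filter keeps only the row whose point lies within 5.5 units of the origin. A real object-relational engine would also need the new type to take part in storage, indexing, and query planning, which this sketch leaves out.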
The deeper legacy was accidental. In 1994–95, Berkeley graduate students Andrew Yu and Jolly Chen replaced Postgres’s QUEL-derived query language with SQL, released it as Postgres95, and handed it to the internet. Renamed PostgreSQL in 1996 and developed ever since by an independent volunteer community, it became the open-source database of record — the default choice of 2020s startups, the engine under countless cloud services, and perennial winner of developer-survey affection (see The Database Revolution). Stonebraker has called Postgres his most enduring achievement, with the caveat that its survival owed much to people he never supervised.
MIT and the War on “One Size Fits All”
Moving to MIT’s CSAIL in 2001, Stonebraker turned on the orthodoxy his own systems had established. The classic row-store architecture, he argued in an influential 2005 paper with Uğur Çetintemel, was “an idea whose time has come and gone”: designed for 1970s business transactions, it was being outperformed by an order of magnitude or more in specialized arenas such as data warehousing and stream processing. He made the case the way he always had, by building the specialists and selling them. C-Store became Vertica (column storage for analytics; acquired by HP in 2011). Aurora/Borealis became StreamBase (continuous queries over event streams; acquired by TIBCO). H-Store became VoltDB (lock-light, main-memory transaction processing). SciDB targeted scientific arrays, Tamr targeted machine-assisted data integration, and DBOS proposed an operating system built atop a database. The argument stuck: today’s landscape of purpose-built engines (analytic column stores, streaming systems, time-series and vector databases; see The Big Data Revolution) is the world his heresy predicted.
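The row-store versus column-store distinction behind this argument can be sketched in a few lines. The example below is a minimal illustration, not the design of C-Store or Vertica: the sales table, its field names, and all function names are assumptions made up for the sketch. It shows why an analytic aggregate favors a column layout (only one attribute is read) while fetching a whole record favors a row layout (one contiguous tuple).

```python
from typing import Dict, List, Tuple

# Hypothetical sales table: (order_id, region, amount).
Row = Tuple[int, str, float]
COLUMN_NAMES = ("order_id", "region", "amount")

ROWS: List[Row] = [
    (1, "east", 10.0),
    (2, "west", 25.0),
    (3, "east", 5.5),
]


def to_columns(rows: List[Row]) -> Dict[str, list]:
    """Pivot row tuples into one list per attribute (assumes rows is non-empty)."""
    return {name: list(values) for name, values in zip(COLUMN_NAMES, zip(*rows))}


def total_amount_row_store(rows: List[Row]) -> float:
    # Every whole record is visited even though only one field is needed.
    return sum(row[2] for row in rows)


def total_amount_column_store(columns: Dict[str, list]) -> float:
    # Only the 'amount' column is touched; other attributes are never read.
    return sum(columns["amount"])


def fetch_order_row_store(rows: List[Row], index: int) -> Row:
    # A single record is one contiguous tuple.
    return rows[index]


def fetch_order_column_store(columns: Dict[str, list], index: int) -> Row:
    # Reassembling one record needs a lookup in every column.
    return tuple(columns[name][index] for name in COLUMN_NAMES)
```

Both layouts return the same answers; what differs is how much data each query has to touch. On disk-resident tables with many attributes, that difference is one source of the performance gaps the paper describes.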
He received the 2014 Turing Award for “fundamental contributions to the concepts and practices underlying modern database systems.”
⚠️ Dead End: QUEL — The Better Language That Lost
Ingres and early Postgres spoke QUEL, and on technical merits many — Stonebraker loudly among them — considered it superior to SQL: closer to Codd’s calculus, more composable, free of SQL’s irregular corner cases (C. J. Date’s famous critiques of SQL largely read as advertisements for QUEL). It didn’t matter. IBM’s System R spoke SQL, IBM’s gravity made SQL the presumptive standard, Oracle rode it to market, and ANSI standardization (1986) finished the job: every vendor, Ingres included, had to bolt on SQL, and QUEL withered into a footnote (see Fun Fact: SQL was SEQUEL). Postgres’s own student-built switch from POSTQUEL to SQL was precisely what unlocked its popularity. The lesson is the database industry’s standing reminder that standards beat elegance: the network effects of a query language outweigh its design quality — a verdict Stonebraker accepted in practice and has grumbled about for forty years.
Fun Fact: The Land Sharks Are on the Squawk Box
Stonebraker’s Turing lecture, published in CACM under the title “The Land Sharks Are on the Squawk Box,” interleaves the history of his systems with an account of riding a bicycle across America — his metaphor for the startup experience, complete with mountain passes (venture capitalists, the “land sharks”) and headwinds (large incumbents’ marketing departments). It is likely the only Turing lecture structured around a bike trip.
📚 Sources
- Wikipedia: Michael Stonebraker · Ingres · PostgreSQL
- ACM A.M. Turing Award: Michael Stonebraker (2014) — citation and profile
- Stonebraker & Held: INGRES — A Relational Data Base System (1975)
- Stonebraker & Rowe: The Design of POSTGRES (SIGMOD 1986, PDF)
- PostgreSQL documentation: A Brief History of PostgreSQL
- Stonebraker & Çetintemel: “One Size Fits All” — An Idea Whose Time Has Come and Gone (ICDE 2005, PDF)
- Stonebraker: The Land Sharks Are on the Squawk Box (CACM, 2016)
- MIT CSAIL: Michael Stonebraker wins Turing Award (2015)