John McCarthy and LISP

Summary

John McCarthy coined the term “artificial intelligence” in 1955, in the proposal for the 1956 Dartmouth workshop, and spent the next fifty years trying to build it. He invented LISP in 1958, a language so structurally different from FORTRAN that it effectively defined a second paradigm of programming that persists to the present day. He pioneered time-sharing, the idea that multiple users could share a single computer simultaneously. He developed the logic-based approach to AI reasoning that dominated the field for three decades. And as statistical and neural methods began to outperform symbolic AI on practical tasks, he remained unconvinced: he had always believed that genuine machine intelligence required formal reasoning, not statistical pattern matching.

Boston, Caltech, and a Mind That Would Not Stop

John McCarthy was born in Boston on September 4, 1927, to working-class immigrant parents: his father, an Irish immigrant, was a union organizer, and his mother, a Lithuanian Jewish immigrant, was an activist in various progressive causes. The family moved to Los Angeles when McCarthy was a child, partly for the warmer climate (his mother had tuberculosis, a diagnosis that often meant slow death in cold, damp cities). Growing up in Los Angeles in the Depression years, McCarthy borrowed calculus textbooks from Caltech students and taught himself advanced mathematics before he was old enough to attend university.

He enrolled at Caltech at sixteen, majoring in mathematics, and graduated in 1948. He went to Princeton for graduate work and completed a PhD in mathematics in 1951, working on differential equations. But by then his intellectual center of gravity had shifted. At Princeton he had encountered John von Neumann’s work on self-reproducing automata and met Claude Shannon, who was developing information theory. He was increasingly gripped by a single question: Could machines think?

The question was not merely philosophical. McCarthy believed it was a technical question, amenable to technical answers, and he intended to find them.

The Dartmouth Conference and the Naming of a Field

In the summer of 1956, McCarthy organized a two-month workshop at Dartmouth College in Hanover, New Hampshire. The co-organizers were Claude Shannon, Marvin Minsky (then at Harvard), and Nathaniel Rochester of IBM. The proposal for the workshop, written in 1955 primarily by McCarthy, stated the conjecture that defined the research program:

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

McCarthy called the proposed research “artificial intelligence” — a deliberate choice that separated it from Norbert Wiener’s cybernetics, from von Neumann’s automata theory, and from the various psychological and philosophical traditions that had previously claimed the question of machine thought. The term was chosen for its clarity and ambition: intelligence, not simulation of intelligence; artificial, meaning made by human art, not supernatural or biological.

The Dartmouth Conference itself was loosely organized and only partially attended. Of the ten people invited, most came and went, and the intense collaborative synthesis McCarthy had hoped for did not materialize. But the conference is nonetheless regarded as the founding event of artificial intelligence as a research discipline, because it was there that the field got its name, its institutional identity, and its first sense of shared purpose. Allen Newell and Herbert Simon demonstrated the Logic Theorist, the first AI program to prove mathematical theorems. McCarthy worked on his own programs. Minsky talked about neural nets. The field was launched, however messily.

LISP: A Language for Thinking Machines

McCarthy joined MIT in 1958, and the problem he faced immediately was practical: the existing programming languages were wrong for AI research. FORTRAN, the dominant language of the era, was built for numerical computation — arrays of numbers, arithmetic operations, do-loops. AI programs needed something different. They needed to manipulate symbols: words, logical expressions, program code, rules of inference. They needed recursive data structures: trees, graphs, lists of lists. They needed the ability to represent and manipulate their own reasoning processes.

McCarthy designed LISP (List Processing) to address these needs. The core data structure was the list — a sequence built from pairs, where each pair contained a value and a pointer to the next pair. Lists could contain other lists, so arbitrary tree structures were natural. And in McCarthy’s most consequential design decision, programs themselves were represented as lists. A LISP function was a list. A LISP program was a list. This property — code and data sharing the same representation, known as homoiconicity — made LISP uniquely suited to programs that reasoned about programs: AI systems that reflected on their own processes, or that manipulated logical expressions that happened to also be programs.
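The pair-based list structure described above can be sketched in a few lines. This is an illustrative model only, written in Python instead of LISP: pairs are represented as 2-tuples, `None` stands in for the empty list, and the helper names `from_python` and `to_python` are assumptions introduced for the example.

```python
from typing import Any, Tuple

def cons(head: Any, tail: Any) -> Tuple[Any, Any]:
    """Build one pair: a value plus a reference to the rest of the list."""
    return (head, tail)

def car(pair: Tuple[Any, Any]) -> Any:
    """Return the value stored in a pair."""
    return pair[0]

def cdr(pair: Tuple[Any, Any]) -> Any:
    """Return the rest of the list that follows a pair."""
    return pair[1]

def from_python(items: list) -> Any:
    """Convert a (possibly nested) Python list into a chain of pairs."""
    result = None  # None plays the role of the empty list
    for item in reversed(items):
        if isinstance(item, list):
            item = from_python(item)
        result = cons(item, result)
    return result

def to_python(pair: Any) -> list:
    """Convert a chain of pairs back into a nested Python list."""
    out = []
    while pair is not None:
        head = car(pair)
        nested = isinstance(head, tuple) or head is None
        out.append(to_python(head) if nested else head)
        pair = cdr(pair)
    return out

# A program written as a list: (plus 1 (times 2 3)).
# Because it is ordinary list data, other code can inspect or rewrite it.
program = from_python(["plus", 1, ["times", 2, 3]])
operator = car(program)                     # the symbol "plus"
inner_call = car(cdr(cdr(program)))         # the nested list (times 2 3)
```

The point of the last three lines is that the "program" is inspected with the same `car` and `cdr` operations used on any other list, which is the property the text calls homoiconicity.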

LISP introduced several features that appeared in mainstream programming languages only decades later. First-class functions: functions could be passed as arguments, returned as values, and stored in data structures, with no distinction between “functions” and “data.” Garbage collection: the programmer did not need to manage memory manually; the runtime tracked which memory was still reachable and reclaimed the rest automatically. Dynamic typing: the type of a value was determined at runtime, not at compile time, allowing programs to work with heterogeneous data naturally. An interactive read-eval-print loop: you could type an expression, have it evaluated immediately, and see the result. This was a fundamentally different relationship with the computer than the batch-processing model, in which you submitted a deck of punched cards, waited hours, and collected the output.
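Two of these features, first-class functions and dynamic typing, are easy to illustrate in a modern language that inherited them. The following is a minimal sketch in Python, not LISP; the names `compose`, `describe`, and `toolbox` are hypothetical and exist only for this example.

```python
from typing import Any, Callable

def compose(f: Callable[[Any], Any], g: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Return a new function that applies g first, then f."""
    return lambda x: f(g(x))

def increment(n: int) -> int:
    return n + 1

def double(n: int) -> int:
    return n * 2

def describe(value: Any) -> str:
    """Decide what a value is at runtime, not at compile time."""
    if callable(value):
        return "function"
    if isinstance(value, list):
        return "list of " + str(len(value))
    return type(value).__name__

# Functions stored in a data structure next to ordinary data.
toolbox = [increment, double, 42, "symbol", [1, 2]]

# A function built from other functions and returned as a value.
increment_then_double = compose(double, increment)
```

Here `increment_then_double(3)` evaluates to 8, and `describe` can walk the mixed `toolbox` list without any declared element type.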

In a remarkable 1960 paper in Communications of the ACM, McCarthy described a minimal LISP interpreter that could be written in LISP itself: a description of the language in the language. The paper was admired for its theoretical elegance; it also demonstrated that LISP’s semantics were clean enough to admit a complete self-definition, something no other language of the era could match. Alan Kay later called this self-definition the “Maxwell’s equations of software.”

Info

The LISP evaluator, in its minimal form, consisted of seven primitive operators (atom, car, cdr, cons, eq, quote, and cond) plus a notation for defining and applying functions (lambda, with label for recursion). Everything else, including arithmetic, string handling, I/O, and even an object system, could be built from these primitives. This radical minimalism was philosophically significant: it showed that computation was not fundamentally about numbers or arrays, but about symbolic manipulation, and that a tiny foundation could support an enormous superstructure.
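To give a feel for how small such an evaluator can be, here is a sketch in Python in the spirit of the one described above. It is not McCarthy's original definition: expressions are modeled as nested Python lists, symbols as strings, the empty list doubles as "false", and named functions are looked up in an environment dictionary instead of using `label`. All of those choices are assumptions made for the example.

```python
from typing import Any, Dict

Env = Dict[str, Any]
T, NIL = "t", []   # truth, and the empty list standing in for falsehood

def is_atom(x: Any) -> bool:
    """Symbols (strings) and the empty list count as atoms."""
    return isinstance(x, str) or x == []

def evaluate(expr: Any, env: Env) -> Any:
    if isinstance(expr, str):              # a symbol: look up its value
        return env[expr]
    op, *args = expr
    if op == "quote":                      # (quote x): return x unevaluated
        return args[0]
    if op == "atom":
        return T if is_atom(evaluate(args[0], env)) else NIL
    if op == "eq":                         # equality is defined on atoms only
        a, b = evaluate(args[0], env), evaluate(args[1], env)
        return T if is_atom(a) and a == b else NIL
    if op == "car":
        return evaluate(args[0], env)[0]
    if op == "cdr":
        return evaluate(args[0], env)[1:]
    if op == "cons":
        return [evaluate(args[0], env)] + evaluate(args[1], env)
    if op == "cond":                       # take the first clause whose test holds
        for test, branch in args:
            if evaluate(test, env) != NIL:
                return evaluate(branch, env)
        return NIL
    if isinstance(op, list) and op[0] == "lambda":
        _, params, body = op               # ((lambda (params) body) args...)
        values = [evaluate(a, env) for a in args]
        return evaluate(body, {**env, **dict(zip(params, values))})
    # Otherwise op names a lambda expression stored in the environment.
    return evaluate([env[op], *args], env)

# A recursive function built only from the primitives: the last element of a list.
definitions: Env = {
    "last": ["lambda", ["l"],
             ["cond",
              [["eq", ["cdr", "l"], ["quote", []]], ["car", "l"]],
              [["quote", "t"], ["last", ["cdr", "l"]]]]],
}
answer = evaluate(["last", ["quote", ["a", "b", "c"]]], definitions)
```

With these definitions, `answer` is the symbol `"c"`. Note that there is no arithmetic anywhere in the evaluator; anything numeric would have to be built on top of the symbolic operations, which is the point the passage makes.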

Time-Sharing: The Computer as Utility

In 1959, McCarthy wrote a memo to Philip Morse, the director of MIT’s Computation Center, proposing an idea that seemed almost obvious once stated but had little precedent: computers should be shared among many users simultaneously, with each user experiencing the machine as if it were dedicated to their work. He called it time-sharing: the computer would allocate fractions of its attention to each user in rapid rotation, fast enough that the human at the terminal perceived immediate response.
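The "rapid rotation" idea can be sketched as a round-robin loop. This is a simplified illustration, not a model of any historical system: the jobs here are Python generators that give up the machine voluntarily at each `yield`, whereas a real time-sharing system interrupts jobs with a hardware timer. The names `user_job` and `round_robin` are hypothetical.

```python
from collections import deque
from typing import Generator, Iterable, List

def user_job(name: str, steps: int) -> Generator[str, None, None]:
    """A hypothetical user's job; each yield marks the end of one time slice."""
    for i in range(1, steps + 1):
        yield f"{name}: step {i}/{steps}"

def round_robin(jobs: Iterable[Generator[str, None, None]]) -> List[str]:
    """Give each job one slice in turn until every job has finished."""
    queue = deque(jobs)
    log: List[str] = []
    while queue:
        job = queue.popleft()
        try:
            log.append(next(job))   # run this job for one slice
            queue.append(job)       # then send it to the back of the line
        except StopIteration:
            pass                    # job finished; drop it from the rotation
    return log

log = round_robin([user_job("alice", 2), user_job("bob", 3)])
```

The resulting log interleaves the two users' steps, so neither has to wait for the other's whole job to finish before seeing progress.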

The economic argument was compelling. Computers in 1959 cost hundreds of thousands of dollars; for much of the day they sat idle, waiting for a human to type the next command. Time-sharing would amortize that cost across many simultaneous users. But the deeper implication was transformative: it changed computing from a batch industrial process (submit job, wait, collect output) into an interactive, personal, exploratory activity.

McCarthy argued in 1961 that computing utilities might eventually become as widespread as telephone utilities: a home connected to a central computing system might draw on a shared computer the way it drew on the electrical grid. He was anticipating utility and cloud computing decades before they became commercial reality. His immediate proposal led to the MIT Compatible Time-Sharing System (CTSS), one of the first working time-sharing systems, first demonstrated in 1961, and eventually to the entire tradition of interactive computing that now encompasses every personal computer, smartphone, and web application.

Stanford and the Ambitions of Symbolic AI

In 1962, McCarthy left MIT for Stanford, where he founded the Stanford Artificial Intelligence Laboratory (SAIL). SAIL became one of the two main centers of AI research in the United States (the other being MIT’s AI Lab, led by Marvin Minsky), and under McCarthy it pursued a distinctive approach: logic-based reasoning, formal representation of knowledge, explicit symbol manipulation.

McCarthy’s theoretical contributions in this period were significant. His situation calculus (1963) provided a logical framework for reasoning about a changing world: situations (world states), actions that transformed situations, and fluents (properties that varied across situations). This framework became the standard tool for AI planning systems. His Advice Taker proposal (1959) envisioned a program that could be given new knowledge in the form of logical statements and immediately apply that knowledge to new situations — a vision of general reasoning that anticipated later work on knowledge representation by decades.

He also identified the frame problem, named in a 1969 paper with Patrick Hayes: when an action changes some things in the world, how does a reasoning system efficiently represent all the things that did not change? If you move a cup, its position changes, but its color, the positions of other objects, and the laws of physics remain the same. Naively enumerating all non-changes is impractical. Finding compact, implicit representations of persistence was surprisingly hard and generated decades of philosophical and technical work.
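The vocabulary of situations, actions, and fluents, and the persistence question behind the frame problem, can be made concrete with a small sketch. This is not McCarthy's logical formalism: it borrows the add-list and delete-list convention of later STRIPS-style planners, in which everything an action does not mention is assumed to persist. The `move` action and the fluent names are hypothetical.

```python
from typing import FrozenSet, Set, Tuple

Fluent = Tuple[str, ...]         # e.g. ("at", "cup", "table")
Situation = FrozenSet[Fluent]    # a world state: the set of fluents that hold
Action = Tuple[Set[Fluent], Set[Fluent], Set[Fluent]]

def move(obj: str, src: str, dst: str) -> Action:
    """A hypothetical action as (preconditions, deletions, additions)."""
    return ({("at", obj, src)}, {("at", obj, src)}, {("at", obj, dst)})

def result(action: Action, situation: Situation) -> Situation:
    """Return the situation that follows from doing `action` in `situation`."""
    preconditions, deletions, additions = action
    if not preconditions <= situation:
        raise ValueError("preconditions do not hold in this situation")
    # Everything not explicitly deleted carries over unchanged. This single
    # line replaces the many "nothing else changed" statements that a purely
    # logical encoding would need to spell out.
    return frozenset((situation - deletions) | additions)

s0: Situation = frozenset({
    ("at", "cup", "table"),
    ("color", "cup", "red"),
    ("at", "book", "shelf"),
})
s1 = result(move("cup", "table", "sink"), s0)
```

After the move, `s1` records the cup at the sink while its color and the book's position carry over untouched. The convenience comes from building persistence into the representation, which sidesteps the problem for simple cases but does not solve it for actions with indirect or context-dependent effects.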

McCarthy’s deeper commitment was to formal logic as the correct framework for intelligence. He believed that any truly general intelligent system would need to reason explicitly, represent propositions symbolically, and manipulate those propositions by formal inference rules. Neural networks — which learned patterns from data without explicit symbolic representations — struck him as interesting but insufficient, capable of narrow performance but not of genuine reasoning.

The AI Winters and Their Aftermath

Symbolic AI, as McCarthy and others practiced it, produced impressive results in narrow domains: chess programs, theorem provers, expert systems for diagnosis and decision support. But it repeatedly failed to scale to the complexity of the real world. Systems that were brilliant within their domain broke completely at the boundary; systems that worked with clean, formal knowledge representations failed when knowledge was uncertain, contextual, or tacit.

The first AI winter came in the mid-1970s, when both the US and UK governments, disappointed by progress, drastically cut AI funding. The second came in the late 1980s, triggered by the collapse of the expert systems market and the failure of large-scale logic-based systems to deliver on their commercial promises. Each winter was followed by a thaw, as new techniques (connectionism in the late 1980s, statistical methods in the 1990s, deep learning in the 2010s) produced genuine advances.

McCarthy received the Turing Award in 1971 for his contributions to the field of artificial intelligence. He was not, in 1971, near the end of his work; he remained at Stanford for four more decades, producing papers on formal logic, common sense reasoning, and the philosophical foundations of AI. He acknowledged the successes of statistical and neural approaches while maintaining his core conviction: that genuine machine intelligence, capable of open-ended reasoning across domains, required symbolic representations and explicit inference, not pattern recognition from data.

“The time machine is probably simpler than the space ship,” he said in a characteristic aphorism — meaning that truly difficult problems in AI were further away than they appeared, not because the problems were impossible but because the shortcuts taken for early progress would not scale.

He died at Stanford on October 24, 2011, at eighty-four. The deep learning revolution that has since transformed AI — producing systems that can write text, recognize images, play Go, and fold proteins — is built almost entirely on the statistical, subsymbolic methods he spent much of his career questioning. Whether those methods constitute genuine intelligence, or sophisticated pattern matching that lacks the explicit reasoning he believed essential, remains contested.

Dead End: The Lisp Machine

In the late 1970s, MIT and other AI labs developed Lisp Machines — specialized hardware designed to run LISP programs efficiently. LISP’s requirements — automatic garbage collection, tagged data types, support for recursive list processing — were expensive to implement in software on general-purpose hardware. Dedicated hardware could provide these operations at the instruction level.

The commercial Lisp machine manufacturers — Symbolics and LMI (Lisp Machines, Inc.) — sold specialized workstations to AI labs and corporations in the early 1980s at prices of $70,000 to $100,000 per machine. The machines were technically excellent. They were also commercially fragile.

Warning

The 1987 AI winter destroyed the customer base for Lisp Machines just as general-purpose workstations were improving rapidly enough to close the performance gap. Sun SPARC and Apollo workstations could run Common LISP compilers that performed comparably to specialized LISP hardware at a fraction of the cost. The fundamental economics of Moore’s Law applied to general-purpose hardware more forcefully than to specialized hardware: a SPARC workstation that cost $10,000 in 1988 was roughly as fast as a Symbolics machine that cost $70,000. Symbolics filed for bankruptcy protection in 1993. The Lisp Machine became the canonical example of a specialized hardware platform being overwhelmed by the improving general-purpose alternative, a lesson that has recurred throughout computing history.

The broader development of LISP and functional programming is explored in The Rise of Functional Programming. The AI tradition McCarthy founded is traced in The Rise of Artificial Intelligence.