Larry Wall and Perl

Summary

Larry Wall created Perl in 1987 for a specific, boring task: processing log files and generating reports across a network of Unix machines he managed at Unisys. He was a linguist who had ended up in computing, and Perl reflected this: it treated text as its primary data type, borrowed syntax from multiple sources without apology, and was designed around the idea that programmers, like speakers of natural languages, should have multiple ways to express the same thought. Perl became the dominant language for system administration, text processing, CGI web scripting, and bioinformatics. It was also famously difficult to read. Its motto, “There Is More Than One Way To Do It,” was a design philosophy that Perl’s users loved and that Perl’s critics considered the explanation for why so much Perl code was illegible.

A Linguist Among Programmers

Larry Wall was born on September 27, 1954, in Los Angeles, California. He studied at Seattle Pacific University, graduating in 1976 with a self-designed degree in natural and artificial languages that combined linguistics, computer science, and related fields. He pursued graduate work in linguistics at UC Berkeley and UCLA, intending to find an unwritten language, create a writing system for it, and translate the Bible into it. Those plans were abandoned for health reasons, and he ended up working as a programmer.

His linguistics background was not incidental to Perl’s design. Wall thought about programming languages the way a linguist thought about natural languages: as evolving, pragmatic systems that accumulated forms over time, where redundancy was a feature rather than a bug, and where the “correct” form was whatever communicated intent most clearly in context.

He worked at NASA’s JPL and at the System Development Corporation (acquired by Burroughs, which merged with Sperry to form Unisys in 1986), writing tools for software teams. In 1985, he wrote patch, the program that applies diff output to source files, which became a standard Unix tool. In 1987, he was managing a network of Unix machines and needed to process large log files and generate reports. The available tools (awk, sed, and shell scripting) could not handle the complexity of the task, so he wrote a new one.

Perl: The Swiss Army Chainsaw

Perl 1.0 was released on December 18, 1987. The name was a backronym: Practical Extraction and Report Language, though Wall also suggested “Pathologically Eclectic Rubbish Lister” as an alternative.

Perl’s design borrowed liberally from multiple sources without trying to reconcile them:

AWK: record-based processing, associative arrays, field splitting
Sed: regular expression substitution syntax
Shell: string interpolation, quoting conventions, backtick command execution
C: syntax, operators, control structures
LISP: list processing, list context vs. scalar context

The result was a language that could express text processing tasks concisely — sometimes in single lines that compressed enormous functionality — but that required deep familiarity to read:

# Perl — concise text processing, but dense
while (<>) {
    chomp;
    s/foo/bar/g;      # regex substitution
    print if /baz/;   # print if line matches
}

# One-liner: print each distinct line once, in first-seen order (no sorting)
perl -ne 'print unless $seen{$_}++'

# TIMTOWTDI: multiple ways to iterate a list
for my $item (@list) { ... }    # foreach-style loop with a named variable
foreach (@list) { ... }          # same loop, element in the implicit $_
map { ... } @list;               # functional map

Context was a central concept: an expression had different values depending on whether it appeared in a scalar context (expecting a single value) or a list context (expecting multiple values). An array @arr in scalar context returned its count; in list context, its elements. This made Perl concise but unpredictable for programmers accustomed to languages where expressions had one value.
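The context rule is easier to see in a few lines of code. The following is a minimal sketch, assuming a made-up array and a hypothetical helper sub named context_name; it only illustrates how the same expression yields different values depending on what the surrounding code expects.

```perl
use strict;
use warnings;

# Illustrative data, not taken from any real program.
my @arr = ('alpha', 'beta', 'gamma');

# List context: assigning to an array copies the elements.
my @copy = @arr;

# Scalar context: assigning to a scalar yields the element count.
my $count = @arr;

# Parentheses on the left make this a list assignment,
# so $first receives the first element rather than the count.
my ($first) = @arr;

# scalar() forces scalar context inside a list:
# the result is the count followed by the elements.
my @mixed = (scalar(@arr), @arr);

# wantarray lets a sub inspect the context it was called in
# (context_name is a hypothetical helper for this sketch).
sub context_name { return wantarray ? 'list' : 'scalar' }
my @in_list   = context_name();
my $in_scalar = context_name();

print "count=$count first=$first mixed=@mixed\n";
print "called in: $in_list[0], then $in_scalar\n";
```

The difference between `my $count = @arr` and `my ($first) = @arr` is a single pair of parentheses, which is the kind of detail that trips up readers coming from languages without context.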

Regular expressions were first-class syntax in Perl — not a library to call but built-in operators. Perl’s regex implementation added features beyond POSIX regular expressions: lookahead, lookbehind, non-greedy quantifiers, named captures. The power this provided for text processing made Perl the default language for bioinformatics, log analysis, and early web scripting.
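As a rough illustration of the features listed above, the sketch below applies each one to a single log-style line. The line and its field names are invented for the example; named captures require Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical log line, invented for illustration.
my $line = 'ts=1987-12-18 user=<lwall> msg=<first release> price=42USD';

# Named captures: matched groups are available by name in %+.
my ($year, $month);
if ($line =~ /ts=(?<year>\d{4})-(?<month>\d{2})/) {
    ($year, $month) = ($+{year}, $+{month});
}

# Greedy vs. non-greedy: .* runs to the last '>', .*? stops at the first.
my ($greedy) = $line =~ /<(.*)>/;
my ($lazy)   = $line =~ /<(.*?)>/;

# Lookahead: match digits only when followed by 'USD',
# without making 'USD' part of the match.
my ($amount) = $line =~ /(\d+)(?=USD)/;

# Lookbehind: match word characters only when preceded by 'user=<'.
my ($user) = $line =~ /(?<=user=<)(\w+)/;

print "year=$year month=$month user=$user amount=$amount\n";
print "greedy=[$greedy]\n";
print "lazy=[$lazy]\n";
```

Note that the match operator `=~` and the pattern delimiters are part of the language itself; no module is loaded to use them.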

The CGI Era and Bioinformatics

In the mid-1990s, the web’s Common Gateway Interface (CGI) allowed web servers to execute scripts and return dynamic HTML. Perl became the dominant language for CGI scripts — every form submission, every dynamic web page, every server-side web application before PHP gained traction was likely Perl. The CGI.pm module (Lincoln Stein, 1995) made Perl/CGI development straightforward.

Bioinformatics — the computational analysis of biological data — adopted Perl as its primary tool. Genome sequences were large text files; Perl’s text processing and regular expressions were perfectly suited for parsing and analyzing them. The Human Genome Project generated data that bioinformaticians processed primarily in Perl. BioPerl (1995) became the standard library for biological sequence analysis.

Perl 5 (1994) added the building blocks for object-oriented programming (references, blessed references, and method calls), which made larger programs easier to structure. The CPAN (Comprehensive Perl Archive Network, online since October 1995) became a repository of reusable modules covering virtually every task a programmer might need: web scraping, database access, email processing, XML parsing. CPAN’s scope and quality were a major factor in Perl’s dominance through the 1990s.
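Perl 5's object model is small enough to show in full. The sketch below uses a hypothetical Counter class, invented for this example, to show how a plain hash reference becomes an object once it is blessed into a package, and how method calls pass that object as the first argument.

```perl
use strict;
use warnings;

# Hypothetical class, invented for illustration.
package Counter;

# Constructor: build a hash reference, then bless it into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# A method is an ordinary sub; the object arrives as the first argument.
sub increment {
    my ($self) = @_;
    $self->{count}++;
    return $self;    # returning the object allows chained calls
}

# Accessor for the stored count.
sub count {
    my ($self) = @_;
    return $self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment->increment;

# ref() reports the package the reference was blessed into.
print ref($counter), " holds ", $counter->count, "\n";
```

There is no class keyword here: a class is a package, an object is a blessed reference, and the arrow operator looks the method up in that package.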

The Decline and Perl 6

The late 1990s brought competitors. PHP, designed specifically for web development, was easier to embed in HTML than Perl CGI. Python, with its cleaner syntax and explicit style, attracted programmers who valued readability. Ruby, released in 1995, offered similar expressive power to Perl with a cleaner object model.

Wall announced Perl 6 in 2000 as a ground-up redesign of the language. The project was enormously ambitious: a complete rewrite that would fix Perl’s accumulated inconsistencies while preserving its expressive power. It took fifteen years. Perl 6 was released in December 2015 and renamed Raku in 2019 to avoid confusion with Perl 5. It was a different language from Perl 5, a sibling rather than a successor, and never achieved significant adoption.

Perl 5 continued as a maintained language, with a development community committed to compatibility and incremental improvement. It remained dominant in bioinformatics and system administration tasks where its text processing capabilities had no equivalent. But the web scripting market, the CGI market, and the general scripting market had moved to PHP, Python, and Ruby.

Dead End: TIMTOWTDI at Scale

Perl’s motto — “There Is More Than One Way To Do It” — was a genuine design philosophy and a genuine maintenance problem.

Readability Costs

A Perl program written by one programmer was often illegible to another, because each had chosen different idioms from the many available. Code review, pair programming, and collective code ownership — the practices that made large software teams productive — were harder in Perl than in languages with a preferred way to express common patterns. The famous “write-only” reputation of Perl code was an exaggeration of a real phenomenon: Perl rewarded expertise with expressive power, but penalized unfamiliarity with opacity.

Python’s explicit counter-principle, “There should be one obvious way to do it,” is often read as a direct response to Perl’s approach. The industry’s movement toward Python for large-scale scripting and data science was partly a vote for readability as a first-order language value.

Perl’s place in the evolution of scripting languages is covered in The Evolution of Language.