MD5 Cracked in Under One Second: The Checksum Used as a Lock
Summary
MD5, a cryptographic hash function designed by Ron Rivest in 1991, is fast enough that a typical user password hashed with it can be recovered in under one second on modern GPU hardware, and in under one millisecond for common passwords using precomputed lookup tables (rainbow tables). MD5 was not designed for password storage; it was designed for integrity verification of files and messages. Using it for passwords was a category error that millions of websites made through the 2000s and 2010s. LinkedIn’s 2012 breach ultimately exposed about 117 million password hashes, and most were cracked within days because they were stored as unsalted SHA-1 hashes. SHA-1, like MD5, is a fast general-purpose hash that was never meant for password storage.
What MD5 Was Designed For
Ron Rivest published MD5 (Message Digest Algorithm 5) in 1991 as a replacement for MD4, which had been found to have weaknesses. MD5 takes an input of any length and produces a fixed 128-bit (16-byte) hash — a compact fingerprint of the input. The same input always produces the same hash; any change to the input produces a completely different hash.
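The three properties described above (fixed-size output, determinism, and sensitivity to small changes) can be checked directly with Python's standard `hashlib` module. This is a minimal sketch; the helper name `md5_hex` and the sample inputs are chosen here for illustration only.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()

short = md5_hex(b"hello")
long_ = md5_hex(b"hello" * 100_000)   # 500,000 bytes of input
changed = md5_hex(b"hellp")           # last character altered

# Fixed output size: 128 bits = 16 bytes = 32 hex characters,
# regardless of how long the input is.
print(len(short), len(long_))         # 32 32

# Deterministic: the same input always gives the same digest.
print(short == md5_hex(b"hello"))     # True

# A one-character change yields an unrelated-looking digest.
print(short)                          # 5d41402abc4b2a76b9719d911017c592
print(changed)
```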
Intended applications:
- File integrity: Verify that a downloaded file has not been corrupted or tampered with. The publisher provides an MD5 hash; you compute it after download; if they match, the file is intact.
- Digital signatures: Hash a document, then sign the hash cryptographically. Signing the hash is much faster than signing the full document.
- Checksums: Detect accidental data corruption in storage or transmission.
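The file-integrity use case in the list above might look like the following sketch. The function names `file_md5` and `matches_published` are hypothetical, and the demo writes a throwaway temporary file in place of a real download.

```python
import hashlib
import os
import tempfile

def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through MD5 so large files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the computed digest with the one the publisher lists."""
    return file_md5(path) == published_hex.strip().lower()

# Demo: a temporary file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

intact = matches_published(demo_path, "5d41402abc4b2a76b9719d911017c592")
corrupted = matches_published(demo_path, "0" * 32)
os.remove(demo_path)

print(intact, corrupted)  # True False
```

Because practical MD5 collisions are known, a check like this is only dependable today against accidental corruption, not against a deliberate attacker.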
MD5 was not designed to resist brute-force attacks against its inputs. For file integrity purposes, computing a hash quickly is desirable — the faster the hash, the faster you can verify large files. For password storage, computing a hash quickly is catastrophic — the faster the hash, the more password guesses an attacker can make per second.
The Password Storage Catastrophe
Web developers in the 1990s-2000s needed to store user passwords. Storing them in plaintext was obviously wrong (any database breach would expose all passwords). Hashing them seemed right. MD5 was the most familiar hash function. The result: millions of websites stored passwords as MD5 hashes.
A modern GPU can compute approximately 10 billion MD5 hashes per second. If an attacker has a database of MD5 hashes and wants to find the original passwords:
- Testing every 8-character lowercase password: ~21 seconds (26^8 ≈ 2.1 × 10^11 candidates)
- Testing a precomputed rainbow table of common passwords: milliseconds
- Testing the entire RockYou wordlist (about 14 million unique passwords, drawn from a breach of 32 million accounts): effectively instant
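The time estimates follow from dividing the number of candidates by the hash rate. The sketch below assumes the figure of 10 billion MD5 hashes per second quoted above; the function and variable names are illustrative.

```python
GPU_RATE = 10_000_000_000  # MD5 hashes per second (figure quoted in the text)

def exhaustive_seconds(alphabet_size: int, length: int,
                       rate: float = GPU_RATE) -> float:
    """Seconds needed to try every string of a given length and alphabet."""
    return alphabet_size ** length / rate

# Every 8-character lowercase password: 26**8 = 208,827,064,576 candidates.
lowercase_8 = exhaustive_seconds(26, 8)
print(f"{lowercase_8:.1f} s")              # 20.9 s

# A wordlist pass: 32 million entries is the size of the RockYou leak;
# the deduplicated wordlist is smaller, so this is an upper bound.
wordlist_entries = 32_000_000
wordlist_seconds = wordlist_entries / GPU_RATE
print(f"{wordlist_seconds * 1000:.1f} ms")  # 3.2 ms
```

The worst case is reached only if the password happens to be the last candidate tried; on average an exhaustive search succeeds in about half that time.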
The LinkedIn breach (2012) exposed 117 million unsalted SHA-1 password hashes — the same category error as MD5, since SHA-1 is also a fast hash built for integrity rather than password storage. Most were cracked within days by security researchers using GPU clusters and public wordlists.
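Why unsalted hashes fall so quickly can be seen in a toy example. The usernames, passwords, and wordlist below are invented for illustration, and the code uses a plain lookup dictionary rather than a true rainbow table, which trades lookup time for storage using hash chains. MD5 is used here for consistency with the rest of the article; the same logic applies to unsalted SHA-1.

```python
import hashlib

def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# Hypothetical leaked table: username -> unsalted MD5 hash.
leaked = {
    "alice": md5_hex("password"),
    "bob": md5_hex("letmein"),
    "carol": md5_hex("password"),      # same password as alice, same hash
    "dave": md5_hex("x9$Tq!7vLp#2"),   # not in the wordlist
}

# Hypothetical attacker wordlist of common passwords.
wordlist = ["123456", "password", "qwerty", "letmein"]

# Each candidate is hashed once; without salts, that single hash
# can be matched against every account in the table.
lookup = {md5_hex(word): word for word in wordlist}

cracked = {user: lookup[h] for user, h in leaked.items() if h in lookup}
print(cracked)
# {'alice': 'password', 'bob': 'letmein', 'carol': 'password'}
```

A per-user random salt would force the attacker to redo the hashing work for each account, and would stop identical passwords from producing identical hashes.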
The Correct Approach
Password hashing requires specialized algorithms designed to be deliberately slow:
- bcrypt (1999): Configurable cost factor that can be raised as hardware gets faster.
- scrypt (2009): Memory-hard; resists GPU attacks.
- Argon2 (2015): Winner of the Password Hashing Competition; resistant to both GPU and ASIC attacks.
These algorithms make every hash deliberately expensive: bcrypt repeats its key setup thousands of times, while scrypt and Argon2 also require a large amount of memory per hash. Each password verification takes milliseconds, which is imperceptible to users but reduces attacker throughput from billions of guesses per second to thousands. Security guidance on password storage has recommended against MD5 since at least 2004.
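As a rough sketch of what this looks like in practice, Python's standard library exposes scrypt through `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). The parameter values and the function names `hash_password` and `verify_password` are assumptions made for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration; tune them for real hardware.
SCRYPT_N = 2**14   # CPU/memory cost (about 16 MiB with r=8)
SCRYPT_R = 8       # block size
SCRYPT_P = 1       # parallelism

def _derive(password: str, salt: bytes) -> bytes:
    """Run scrypt over the password with the given salt."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); store both alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    return salt, _derive(password, salt)

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key and compare in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))  # True
print(verify_password("wrong guess", salt, key))                   # False
```

Production systems typically rely on a maintained library for bcrypt or Argon2 and store the algorithm parameters next to each hash, so the cost can be raised later without invalidating existing records.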
📚 Sources
- Rivest, Ron: “The MD5 Message-Digest Algorithm” — RFC 1321, April 1992
- Provos, Niels & Mazières, David: “A Future-Adaptable Password Scheme” — USENIX 1999 (bcrypt)
- Stevens, Marc et al.: “The First Collision for Full SHA-1” — CWI Amsterdam, 2017 (context for hash function security)
- “2012 LinkedIn hack” — Wikipedia (unsalted SHA-1 hashes, 117 million accounts)