How Samuel Morse Hacked English Letter Frequency
E is one dot. Q is dash-dash-dot-dash. The shortest signals were handed to the letters you'd type the most.
When Samuel Morse and Alfred Vail were finalizing their telegraph code in the early 1840s, they walked into the office of a Morristown, New Jersey newspaper and counted the letters in the typesetters' cases. Printers stocked more E's than any other letter; Vail noticed they kept fewer Z's than almost anything. He used those counts to assign code lengths.
The letters that come up most often in English got the shortest signals. E is a single dot. T is a single dash. A is dot-dash. The rare ones got long, awkward sequences: Q is dash-dash-dot-dash, J is dot-dash-dash-dash. A trained operator could send common words at impressive speed because the common letters were cheap to key.
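To make the length pattern concrete, here is a minimal sketch in Python. It assumes the modern International Morse alphabet for the 26 letters (the version that matches the Q and J examples above); the names `MORSE`, `encode`, and `by_length` are illustrative, not part of any standard library.

```python
# International Morse code for A-Z (dots and dashes only; timing is not modeled).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def encode(text):
    """Encode letters as Morse, one space between letters.

    Characters outside A-Z (spaces, digits, punctuation) are skipped.
    """
    return " ".join(MORSE[ch] for ch in text.upper() if ch in MORSE)


# Letters ordered from shortest signal to longest, alphabetical within a length.
by_length = sorted(MORSE, key=lambda ch: (len(MORSE[ch]), ch))

if __name__ == "__main__":
    print(encode("the"))
    print(by_length[:2], by_length[-3:])
```

Sorting by signal length puts E and T at the front of `by_length`, with four-symbol letters such as Q and J at the far end.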
This is, almost a century early, the principle Claude Shannon would formalize for variable-length compression codes: give frequent symbols short representations and rare symbols long ones, and the average message gets shorter. ZIP files and JPEGs run on the same idea. Morse and Vail got there first by counting type in a print shop.
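One way to see the same principle in its modern form is Huffman coding, a later algorithm of the kind used inside ZIP and JPEG compression. The sketch below is an illustration rather than anything Morse or Vail did: it builds Huffman code lengths from letter counts in a short sample sentence. The function name `huffman_code_lengths` and the `SAMPLE` text are assumptions chosen for the example.

```python
import heapq
from collections import Counter

# Illustrative sample text; any English sentence would do.
SAMPLE = "THE LETTERS THAT COME UP MOST OFTEN IN ENGLISH GOT THE SHORTEST SIGNALS"


def huffman_code_lengths(text):
    """Return {letter: code length in bits} for a Huffman code built from text."""
    counts = Counter(ch for ch in text.upper() if ch.isalpha())
    if not counts:
        return {}
    if len(counts) == 1:
        return {sym: 1 for sym in counts}
    # Heap entries: (total count, unique tie-breaker, {letter: depth so far}).
    heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups; every letter in them gets one bit deeper.
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]


if __name__ == "__main__":
    lengths = huffman_code_lengths(SAMPLE)
    for letter in sorted(lengths, key=lambda s: (lengths[s], s)):
        print(letter, lengths[letter])
```

In this sample, T and E are the most common letters and end up with the shortest codes, while letters that appear once, such as U or P, get the longest. That is the same trade Morse and Vail made by hand.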
The code we still call Morse today isn't quite Morse's original. The 1840s American version had irregular gaps inside some letters. An 1865 international standard cleaned it up, and that revised version, sometimes called Continental Morse, is what amateur radio operators use today and what NATO navies were still teaching sailors into the 21st century.
Every message reduced to two symbols and a system of timing — a digital code, decades before the word "digital" meant anything in engineering.