How to Lie with Statistics
The truly genuine problem...does not consist of proving something false but in proving that the authentic object is authentic.
You know the old story about storks delivering babies? It's true. I can prove it with statistics. Take a look at the estimated population of storks in each country, and then at the number of babies born each year. Across Europe, there's a remarkably strong relationship. More storks, more babies; fewer storks, fewer babies.
The pattern is easily strong enough to pass a traditional hurdle for publication in an academic journal. In fact, a scientific paper has been published with the title "Storks Deliver Babies (p = 0.008)." Without getting too technical, all those zeros tell us that this is not a coincidence.
Perhaps you have already guessed the trick. Large European countries such as Germany, Poland, and Turkey are home to many babies and many storks. Small countries such as Albania and Denmark have few babies and few storks. While there's a clear pattern in the data, that pattern does not mean that storks cause babies to appear.
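The trick can be reproduced with made-up numbers. The following is a minimal sketch, not the data from the published paper: it assumes hypothetical countries whose size drives both stork and baby counts, while the per-unit stork and birth rates are drawn independently of each other. The names `pearson` and `simulate` and every number in it are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def simulate(n_countries=500, seed=0):
    """Synthetic countries: size is the hidden common cause (assumption)."""
    rng = random.Random(seed)
    storks, births, stork_rates, birth_rates = [], [], [], []
    for _ in range(n_countries):
        size = rng.uniform(1, 100)          # hypothetical country size
        stork_rate = rng.uniform(0.8, 1.2)  # storks per unit of size
        birth_rate = rng.uniform(0.8, 1.2)  # births per unit of size, drawn independently
        storks.append(size * stork_rate)
        births.append(size * birth_rate)
        stork_rates.append(stork_rate)
        birth_rates.append(birth_rate)
    # Correlation of raw counts, then of the size-adjusted rates.
    return pearson(storks, births), pearson(stork_rates, birth_rates)


raw_r, rate_r = simulate()
print(f"raw counts:         r = {raw_r:.2f}")
print(f"size-adjusted rates: r = {rate_r:.2f}")
```

In this toy setup the raw counts correlate strongly because both scale with country size, while the size-adjusted rates show almost no relationship. Storks and babies move together without either one causing the other.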
You can "prove" anything with statistics, it seems—even that storks deliver babies.
You'd certainly have gotten that impression from reading How to Lie with Statistics. Published in 1954 by a little-known American freelance journalist named Darrell Huff, this wisecracking, cynical little book immediately received a rave review from the New York Times and went on to become perhaps the most popular book on statistics ever published, selling well over a million copies.
The book deserves the popularity and the praise. It's a marvel of statistical communication. It also made Darrell Huff a nerd legend. Ben Goldacre, an epidemiologist and bestselling author of Bad Science, has written admiringly of how "The Huff" had written a "ripper." The American writer Charles Wheelan describes his book Naked Statistics as "an homage" to Huff's "classic." The respected journal Statistical Science organized a Huff retrospective fifty years after the book's publication.
I used to feel the same way. As a teenager, I loved reading How to Lie with Statistics. Bright, sharp, and illustrated throughout with playful cartoons, the book gave me a peek behind the curtain of statistical manipulation, showing me how the swindling was done so that I would not be fooled again.
Huff is full of examples. He begins by pondering how much money Yale graduates make. According to a 1950 survey, the class of 1924 had an average income of close to $500,000 a year in today's terms. That is just plausible enough to believe—this is Yale, after all—but half a million dollars a year is a lot of money. Is that really the average?
No. Huff explains that this "improbably salubrious" figure comes from self-reported data, which means we can expect people to exaggerate their income for the sake of vanity. Furthermore, the survey is only of people who bothered to respond—and only those alumni Yale could find. And who are easily found? The rich and famous. "Who are the little lost sheep down in the Yale rolls as address unknown?" asks Huff. Yale will keep track of the millionaire alumni, but some of the also-ran graduates might easily have slipped through the net. All this means that the survey will present a grossly inflated view.
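Huff's two objections, that richer alumni are easier to find and that respondents flatter themselves, can be sketched with a toy simulation. This is not the Yale survey: the income distribution, the response probabilities, and the exaggeration factor are all assumptions chosen for illustration, and `simulate_survey` is a hypothetical helper.

```python
import random


def simulate_survey(n_alumni=10_000, seed=1):
    """Toy alumni survey with selection and self-report bias (all numbers assumed)."""
    rng = random.Random(seed)
    # Hypothetical true incomes, sorted from poorest to richest.
    incomes = sorted(rng.lognormvariate(11, 1) for _ in range(n_alumni))
    true_mean = sum(incomes) / n_alumni

    reported = []
    for rank, income in enumerate(incomes):
        # Assumption: the chance of being found and replying rises with
        # income rank, from 10% for the poorest to 90% for the richest.
        p_respond = 0.1 + 0.8 * rank / (n_alumni - 1)
        if rng.random() < p_respond:
            # Assumption: respondents overstate income by up to 30%.
            reported.append(income * rng.uniform(1.0, 1.3))

    survey_mean = sum(reported) / len(reported)
    return true_mean, survey_mean, len(reported)


true_mean, survey_mean, n_responses = simulate_survey()
print(f"true mean income:   {true_mean:,.0f}")
print(f"survey mean income: {survey_mean:,.0f} (from {n_responses} responses)")
```

Under these assumptions the survey average lands well above the true average, even though no single answer is wildly off. The inflation comes from who answers as much as from what they say.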
Huff briskly moves on through a vast range of statistical crimes, from toothpaste advertisements based on cherry-picked research to maps that change their meaning depending on how you color them in. As Huff wrote, "The crooks already know these tricks; honest men must learn them in self-defense."
If you read How to Lie with Statistics, you will come away more skeptical about the ways numbers can deceive you. It's a clever and instructive book.
But I've spent more than a decade trying to communicate statistical ideas and fact-check numerical claims, and over the years I've become more and more uneasy about How to Lie with Statistics and what that little book represents. What does it say about statistics, and about us, that the most successful book on the subject is, from cover to cover, a warning about misinformation?