Unpopular Opinion: A simple guide to statistics in the age of deception

Tim Harford in The Financial Times

Image result for statistics

“The best financial advice for most people would fit on an index card.” That’s the gist of an offhand comment in 2013 by Harold Pollack, a professor at the University of Chicago. Pollack’s bluff was duly called, and he quickly rushed off to find an index card and scribble some bullet points — with respectable results.

When I heard about Pollack’s notion — he elaborated upon it in a 2016 book — I asked myself: would this work for statistics, too? There are some obvious parallels. In each case, common sense goes a surprisingly long way; in each case, dizzying numbers and impenetrable jargon loom; in each case, there are stubborn technical details that matter; and, in each case, there are people with a sharp incentive to lead us astray.

The case for everyday practical numeracy has never been more urgent. Statistical claims fill our newspapers and social media feeds, unfiltered by expert judgment and often designed as a political weapon. We do not necessarily trust the experts — or more precisely, we may have our own distinctive view of who counts as an expert and who does not.

Nor are we passive consumers of statistical propaganda; we are the medium through which the propaganda spreads. We are arbiters of what others will see: what we retweet, like or share online determines whether a claim goes viral or vanishes. If we fall for lies, we become unwittingly complicit in deceiving others. On the bright side, we have more tools than ever to help weigh up what we see before we share it — if we are able and willing to use them.

In the hope that someone might use it, I set out to write my own postcard-sized citizens’ guide to statistics. Here’s what I learnt.

Professor Pollack’s index card includes advice such as “Save 20 per cent of your money” and “Pay your credit card in full every month”. The author Michael Pollan offers dietary advice in even pithier form: “Eat Food. Not Too Much. Mostly Plants.” Quite so, but I still want a cheeseburger.

However good the advice Pollack and Pollan offer, it’s not always easy to take. The problem is not necessarily ignorance. Few people think that Coca-Cola is a healthy drink, or believe that credit cards let you borrow cheaply. Yet many of us fall into some form of temptation or other. That is partly because slick marketers are focused on selling us high-fructose corn syrup and easy credit. And it is partly because we are human beings with human frailties.

With this in mind, my statistical postcard begins with advice about emotion rather than logic. When you encounter a new statistical claim, observe your feelings. Yes, it sounds like a line from Star Wars, but we rarely believe anything because we’re compelled to do so by pure deduction or irrefutable evidence. We have feelings about many of the claims we might read — anything from “inequality is rising” to “chocolate prevents dementia”. If we don’t notice and pay attention to those feelings, we’re off to a shaky start.

What sort of feelings? Defensiveness. Triumphalism. Righteous anger. Evangelical fervour. Or, when it comes to chocolate and dementia, relief. It’s fine to have an emotional response to a chart or shocking statistic — but we should not ignore that emotion, or be led astray by it.

There are certain claims that we rush to tell the world, others that we use to rally like-minded people, still others we refuse to believe. Our belief or disbelief in these claims is part of who we feel we are. “We all process information consistent with our tribe,” says Dan Kahan, professor of law and psychology at Yale University.

In 2005, Charles Taber and Milton Lodge, political scientists at Stony Brook University, New York, conducted experiments in which subjects were invited to study arguments around hot political issues. Subjects showed a clear confirmation bias: they sought out testimony from like-minded organisations. For example, subjects who opposed gun control would tend to start by reading the views of the National Rifle Association. Subjects also showed a disconfirmation bias: when the researchers presented them with certain arguments and invited comment, the subjects would quickly accept arguments with which they agreed, but devote considerable effort to disparage opposing arguments.

Expertise is no defence against this emotional reaction; in fact, Taber and Lodge found that better-informed experimental subjects showed stronger biases. The more they knew, the more cognitive weapons they could aim at their opponents. “So convenient a thing it is to be a reasonable creature,” commented Benjamin Franklin, “since it enables one to find or make a reason for everything one has a mind to do.”

This is why it’s important to face up to our feelings before we even begin to process a statistical claim. If we don’t at least acknowledge that we may be bringing some emotional baggage along with us, we have little chance of discerning what’s true. As the physicist Richard Feynman once commented, “You must not fool yourself — and you are the easiest person to fool.”

The second crucial piece of advice is to understand the claim. That seems obvious. But all too often we leap to disbelieve or believe (and repeat) a claim without pausing to ask whether we really understand what the claim is. To quote Douglas Adams’s philosophical supercomputer, Deep Thought, “Once you know what the question actually is, you’ll know what the answer means.”

For example, take the widely accepted claim that “inequality is rising”. It seems uncontroversial, and urgent. But what does it mean? Racial inequality? Gender inequality? Inequality of opportunity, of consumption, of education attainment, of wealth? Within countries or across the globe?

Even given a narrower claim, “inequality of income before taxes is rising” (and you should be asking yourself, since when?), there are several different ways to measure this. One approach is to compare the income of people at the 90th percentile and the 10th percentile, but that tells us nothing about the super-rich, nor the ordinary people in the middle. An alternative is to examine the income share of the top 1 per cent — but this approach has the opposite weakness, telling us nothing about how the poorest fare relative to the majority.

There is no single right answer — nor should we assume that all the measures tell a similar story. In fact, there are many true statements that one can make about inequality. It may be worth figuring out which one is being made before retweeting it.

Perhaps it is not surprising that a concept such as inequality turns out to have hidden depths. But the same holds true of more tangible subjects, such as “a nurse”. Are midwives nurses? Health visitors? Should two nurses working half-time count as one nurse? Claims over the staffing of the UK’s National Health Service have turned on such details.

All this can seem like pedantry — or worse, a cynical attempt to muddy the waters and suggest that you can prove anything with statistics. But there is little point in trying to evaluate whether a claim is true if one is unclear what the claim even means.

Imagine a study showing that kids who play violent video games are more likely to be violent in reality. Rebecca Goldin, a mathematician and director of the statistical literacy project STATS, points out that we should ask questions about concepts such as “play”, “violent video games” and “violent in reality”. Is Space Invaders a violent game? It involves shooting things, after all. And are we measuring a response to a questionnaire after 20 minutes’ play in a laboratory, or murderous tendencies in people who play 30 hours a week? “Many studies won’t measure violence,” says Goldin. “They’ll measure something else such as aggressive behaviour.” Just like “inequality” or “nurse”, these seemingly common sense words hide a lot of wiggle room.

Two particular obstacles to our understanding are worth exploring in a little more detail. One is the question of causation. “Taller children have a higher reading age,” goes the headline. This may summarise the results of a careful study about nutrition and cognition. Or it may simply reflect the obvious point that eight-year-olds read better than four-year-olds — and are taller. Causation is philosophically and technically a knotty business but, for the casual consumer of statistics, the question is not so complicated: just ask whether a causal claim is being made, and whether it might be justified.

Returning to this study about violence and video games, we should ask: is this a causal relationship, tested in experimental conditions? Or is this a broad correlation, perhaps because the kind of thing that leads kids to violence also leads kids to violent video games? Without clarity on this point, we don’t really have anything but an empty headline.

We should never forget, either, that all statistics are a summary of a more complicated truth. For example, what’s happening to wages? With tens of millions of wage packets being paid every month, we can only ever summarise — but which summary? The average wage can be skewed by a small number of fat cats. The median wage tells us about the centre of the distribution but ignores everything else.

Or we might look at the median increase in wages, which isn’t the same thing as the increase in the median wage — not at all. In a situation where the lowest and highest wages are increasing while the middle sags, it’s quite possible for the median pay rise to be healthy while median pay falls.

Sir Andrew Dilnot, former chair of the UK Statistics Authority, warns that an average can never convey the whole of a complex story. “It’s like trying to see what’s in a room by peering through the keyhole,” he tells me.

In short, “you need to ask yourself what’s being left out,” says Mona Chalabi, data editor for The Guardian US. That applies to the obvious tricks, such as a vertical axis that’s been truncated to make small changes look big. But it also applies to the less obvious stuff — for example, why does a graph comparing the wages of African-Americans with those of white people not also include data on Hispanic or Asian-Americans? There is no shame in leaving something out. No chart, table or tweet can contain everything. But what is missing can matter.

Channel the spirit of film noir: get the backstory. Of all the statistical claims in the world, this particular stat fatale appeared in your newspaper or social media feed, dressed to impress. Why? Where did it come from? Why are you seeing it?

Sometimes the answer is little short of a conspiracy: a PR company wanted to sell ice cream, so paid a penny-ante academic to put together the “equation for the perfect summer afternoon”, pushed out a press release on a quiet news day, and won attention in a media environment hungry for clicks. Or a political donor slung a couple of million dollars at an ideologically sympathetic think-tank in the hope of manufacturing some talking points.

Just as often, the answer is innocent but unedifying: publication bias. A study confirming what we already knew — smoking causes cancer — is unlikely to make news. But a study with a surprising result — maybe smoking doesn’t cause cancer after all — is worth a headline. The new study may have been rigorously conducted but is probably wrong: one must weigh it up against decades of contrary evidence.

Publication bias is a big problem in academia. The surprising results get published, the follow-up studies finding no effect tend to appear in lesser journals if they appear at all. It is an even bigger problem in the media — and perhaps bigger yet in social media. Increasingly, we see a statistical claim because people like us thought it was worth a Like on Facebook.

David Spiegelhalter, president of the Royal Statistical Society, proposes what he calls the “Groucho principle”. Groucho Marx famously resigned from a club — if they’d accept him as a member, he reasoned, it couldn’t be much of a club. Spiegelhalter feels the same about many statistical claims that reach the headlines or the social media feed. He explains, “If it’s surprising or counter-intuitive enough to have been drawn to my attention, it is probably wrong.”

OK. You’ve noted your own emotions, checked the backstory and understood the claim being made. Now you need to put things in perspective. A few months ago, a horrified citizen asked me on Twitter whether it could be true that in the UK, seven million disposable coffee cups were thrown away every day.

I didn’t have an answer. (A quick internet search reveals countless repetitions of the claim, but no obvious source.) But I did have an alternative question: is that a big number? The population of the UK is 65 million. If one person in 10 used a disposable cup each day, that would do the job.

Many numbers mean little until we can compare them with a more familiar quantity. It is much more informative to know how many coffee cups a typical person discards than to know how many are thrown away by an entire country. And more useful still to know whether the cups are recycled (usually not, alas) or what proportion of the country’s waste stream is disposable coffee cups (not much, is my guess, but I may be wrong).

So we should ask: how big is the number compared with other things I might intuitively understand? How big is it compared with last year, or five years ago, or 30? It’s worth a look at the historical trend, if the data are available.

Finally, beware “statistical significance”. There are various technical objections to the term, some of which are important. But the simplest point to appreciate is that a number can be “statistically significant” while being of no practical importance. Particularly in the age of big data, it’s possible for an effect to clear this technical hurdle of statistical significance while being tiny.

One study was able to demonstrate that unborn children exposed to a heatwave while in the womb went on to earn less as adults. The finding was statistically significant. But the impact was trivial: $30 in lost income per year. Just because a finding is statistically robust does not mean it matters; the word “significance” obscures that.

In an age of computer-generated images of data clouds, some of the most charming data visualisations are hand-drawn doodles by the likes of Mona Chalabi and the cartoonist Randall Munroe. But there is more to these pictures than charm: Chalabi uses the wobble of her pen to remind us that most statistics have a margin of error. A computer plot can confer the illusion of precision on what may be a highly uncertain situation.

“It is better to be vaguely right than exactly wrong,” wrote Carveth Read in Logic (1898), and excessive precision can lead people astray. On the eve of the US presidential election in 2016, the political forecasting website FiveThirtyEight gave Donald Trump a 28.6 per cent chance of winning. In some ways that is impressive, because other forecasting models gave Trump barely any chance at all. But how could anyone justify the decimal point on such a forecast? No wonder many people missed the basic message, which was that Trump had a decent shot. “One in four” would have been a much more intuitive guide to the vagaries of forecasting.

Exaggerated precision has another cost: it makes numbers needlessly cumbersome to remember and to handle. So, embrace imprecision. The budget of the NHS in the UK is about £10bn a month. The national income of the United States is about $20tn a year. One can be much more precise about these things, but carrying the approximate numbers around in my head lets me judge pretty quickly when — say — a £50m spending boost or a $20bn tax cut is noteworthy, or a rounding error.

My favourite rule of thumb is that since there are 65 million people in the UK and people tend to live a bit longer than 65, the size of a typical cohort — everyone retiring or leaving school in a given year — will be nearly a million people. Yes, it’s a rough estimate — but vaguely right is often good enough.

Be curious. Curiosity is bad for cats, but good for stats. Curiosity is a cardinal virtue because it encourages us to work a little harder to understand what we are being told, and to enjoy the surprises along the way.

This is partly because almost any statistical statement raises questions: who claims this? Why? What does this number mean? What’s missing? We have to be willing — in the words of UK statistical regulator Ed Humpherson — to “go another click”. If a statistic is worth sharing, isn’t it worth understanding first? The digital age is full of informational snares — but it also makes it easier to look a little deeper before our minds snap shut on an answer.

While curiosity gives us the motivation to ask another question or go another click, it gives us something else, too: a willingness to change our minds. For many of the statistical claims that matter, we have already reached a conclusion. We already know what our tribe of right-thinking people believe about Brexit, gun control, vaccinations, climate change, inequality or nationalisation — and so it is natural to interpret any statistical claim as either a banner to wave, or a threat to avoid.

Curiosity can put us into a better frame of mind to engage with statistical surprises. If we treat them as mysteries to be resolved, we are more likely to spot statistical foul play, but we are also more open-minded when faced with rigorous new evidence.

In research with Asheley Landrum, Katie Carpenter, Laura Helft and Kathleen Hall Jamieson, Dan Kahan has discovered that people who are intrinsically curious about science — they exist across the political spectrum — tend to be less polarised in their response to questions about politically sensitive topics. We need to treat surprises as a mystery rather than a threat.

Isaac Asimov is thought to have said, “The most exciting phrase in science isn’t ‘Eureka!’, but ‘That’s funny…’” The quip points to an important truth: if we treat the open question as more interesting than the neat answer, we’re on the road to becoming wiser.

In the end, my postcard has 50-ish words and six commandments. Simple enough, I hope, for someone who is willing to make an honest effort to evaluate — even briefly — the statistical claims that appear in front of them. That willingness, I fear, is what is most in question.

“Hey, Bill, Bill, am I gonna check every statistic?” said Donald Trump, then presidential candidate, when challenged by Bill O’Reilly about a grotesque lie that he had retweeted about African-Americans and homicides. And Trump had a point — sort of. He should, of course, have got someone to check a statistic before lending his megaphone to a false and racist claim. We all know by now that he simply does not care.

But Trump’s excuse will have struck a chord with many, even those who are aghast at his contempt for accuracy (and much else). He recognised that we are all human. We don’t check everything; we can’t. Even if we had all the technical expertise in the world, there is no way that we would have the time.

My aim is more modest. I want to encourage us all to make the effort a little more often: to be open-minded rather than defensive; to ask simple questions about what things mean, where they come from and whether they would matter if they were true. And, above all, to show enough curiosity about the world to want to know the answers to some of these questions — not to win arguments, but because the world is a fascinating place.

Unpopular Opinion

Search This Blog

Thursday, 8 February 2018

A simple guide to statistics in the age of deception

No comments:

Post a Comment