What would your reaction have been if you had read the following sentence in a 1995 newspaper article: “Every year since 1950, the number of American children gunned down has doubled”? You probably would have been shocked, but would you have spent much time thinking about the reliability of such an impressive “statistic”? For if you had believed it, as one often does when reading such headlines, you would have been making a big mistake. Suppose that in 1950 only one child in America was shot dead; doubling that number 45 times would yield some 35 trillion children gunned down in 1995.
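The compound-doubling arithmetic behind that absurd figure is easy to check; a minimal sketch:

```python
# Check the arithmetic of the misquoted "statistic": one child shot dead
# in 1950, with the number doubling every year through 1995 (45 doublings).
children_1950 = 1
doublings = 1995 - 1950                      # 45 years of doubling
children_1995 = children_1950 * 2 ** doublings

print(f"{children_1995:,}")                  # 35,184,372,088,832 — about 35 trillion
```

Forty-five doublings of even a single case produce a number thousands of times larger than the world's population, which is exactly why the misquoted claim collapses on inspection.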
Fortunately, this is an implausible figure. This widely quoted example is recalled by Joel Best in his most interesting book, Damned Lies and Statistics, a title that itself was inspired by a remark that some attribute to Disraeli, others to Mark Twain. The example is even more pertinent insofar as it is based on a syntactic mistake. In fact, the correct wording in the original source was “The number of American children killed each year by guns has doubled since 1950”, and it was the misquote that disseminated the “false statistic”.
The example points to several other lessons, too. For a start, data (i.e. a numeric value) without appropriate metadata (i.e. information about the meaning of the data) convey no meaningful information. Also, where explanatory metadata is provided, the media and other users frequently pay far less attention to it than to the data itself. This means a “trained” brain may indeed be necessary to assess the reliability of a statistical value.
Unfortunately, this creates a general attitude of leaving to media and other experts the role of selecting the “correct statistical information”, with the public left to judge the credibility of the source, assuming it has the tools to do so. In other words, the layman’s capacity to evaluate the correctness of a statistic is quite limited. Nor can we expect every citizen to become a professional statistician, able to interpret correctly the growing amount of data quoted by the media or available on the Internet.
Therefore, we normally have to rely on somebody else to gather and “read” statistical information for us. And for that, we tend to trust public sources. Yet, there are cases in which we think we are able to measure economic or social phenomena much better than statisticians.
Let’s take the case of inflation. Day after day we play our role of consumers. In doing so, we deal with a large range of prices and we normally compare these prices both between different stores and over time. Does this daily activity give us the capacity to evaluate inflation trends correctly? Most of us would answer, “Absolutely!”. And when official figures (normally based on thousands of observed prices for a large basket of goods and services) are released by the national statistical office, we consumers can then judge whether they are correct, excessive or undershooting, right? No doubt the popular media will exploit public sentiment and at least one headline will trumpet the quip about lies.
Statisticians will say that consumers were mistaken, of course. But are they right? It is often a matter of perception. Consider what happened in 2002 when 12 countries adopted the euro. Several consumers’ associations queried the “true” inflation rate, arguing that the impact of the new currency was to push prices upward. The euro was introduced in January, when the official increase in consumer prices since the same month of 2001 was already around 2.5%. During the spring, newspapers and other media began highlighting that some items (mainly food) were showing large price increases, due to the “rounding up” effect or to the speculative behaviour of retailers. In some countries “private” surveys – many of which were keener on scoring political points than on explaining the underlying truth – were hastily launched (sometimes only covering food items), and they showed that the “true” inflation rate was much higher than the official one calculated by Eurostat, which stood at 1.8% in June and 2.1% in September.
Curiously, very seldom were the methods of these private evaluations scrutinised by the media, and certainly not to the extent that official figures have been. The main reason for these divergences is that consumers buy certain products daily or weekly, like food and clothes, and other products very rarely, if at all, like televisions, computers and cars. Therefore, if there are very different trends in prices for various groups of products, it can be difficult to evaluate the overall trend. And if households with different income levels buy different baskets of goods and services, and if the distribution of price movements is heterogeneous (with large increases on “basic” items and price falls on hi-fis, computers, etc.), then the inflation rate as perceived by various groups of households will be very different.
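The gap between measured and perceived inflation can be illustrated with a toy expenditure-weighted index. The weights and price changes below are invented purely for illustration; real consumer price indices weight thousands of observed prices in a similar spirit:

```python
# Toy illustration (invented weights and price changes): official inflation
# averages price changes over the whole basket, weighted by expenditure
# shares, while "perceived" inflation tends to track only the frequently
# bought items, such as food.
basket = {
    # item: (expenditure weight, year-on-year price change)
    "food":        (0.20,  0.06),   # bought daily, large increase
    "clothing":    (0.10,  0.03),
    "rent":        (0.30,  0.02),
    "electronics": (0.15, -0.08),   # bought rarely, prices falling
    "services":    (0.25,  0.02),
}

# Weighted average over the whole basket (weights sum to 1).
overall = sum(weight * change for weight, change in basket.values())

# A consumer fixating on frequent purchases "sees" only the food increase.
perceived = basket["food"][1]

print(f"basket-wide index:       {overall:.1%}")    # 1.4%
print(f"'perceived' (food only): {perceived:.1%}")  # 6.0%
```

With these made-up numbers the full-basket rate is 1.4% while the food-only impression is 6.0%, showing how both figures can be arithmetically correct yet tell very different stories.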
In addition, consumers often have trouble evaluating the effect on prices of the quality of goods and services, in particular products with a high technological content, like cellular phones, for which prices have dropped sharply since the start of the 1990s. Unlike consumers, national statistical offices spend resources adjusting price movements for these quality changes. Perhaps it is technical adjustments like these that lay statisticians open to accusations of “massaging” the inflation figures for political ends. Frankly, such trickery is unlikely, given the large number of people involved in compiling consumer price statistics in each country.
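A crude sketch of what quality adjustment means in principle (the figures are invented, and real statistical offices use far more sophisticated hedonic and matched-model methods than this simple ratio):

```python
# Toy quality adjustment (invented numbers): a laptop's shelf price is
# unchanged year over year, but the new model delivers twice the
# performance. Dividing the new price by a quality index shows that the
# effective price per unit of quality has fallen, even though the
# sticker price has not moved.
old_price, new_price = 1000.0, 1000.0
quality_ratio = 2.0   # new model is assumed twice as capable

shelf_change = (new_price - old_price) / old_price
adjusted_change = (new_price / quality_ratio - old_price) / old_price

print(f"shelf-price change:      {shelf_change:.0%}")     # 0%
print(f"quality-adjusted change: {adjusted_change:.0%}")  # -50%
```

This is why official price indices can register falling prices for technology products whose shelf prices look flat: the consumer is getting more for the same money.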
And anyway, there is a very long tradition of strong political independence in almost all European statistical offices. Competition from other (often private) sources of information poses another problem for official statisticians. First, it makes the job of identifying reliable sources that much harder. This is particularly true of social statistics, and of overall measures of global well-being. Policymakers and the general public often like to use comparative indicators to measure whether their situation is improving or worsening, or better than in other countries. Thus, newspapers and journals are full of reports about indicators on well-being, competitiveness, development, etc., where several time series are combined to derive a single measure and performances are ranked in ascending or descending order. But social indicators are really too complex for this kind of reductionist treatment. At the very least, caveats should be made clear, which is too often not the case.
Such rankings of indicators may be appealing, but they are subjective, reflecting only the compiler’s view of the relative importance of the various aspects taken into account. Statisticians, on the other hand, have to respond to user needs by finding correct and transparent ways to measure the factors that contribute to the well-being of our societies, using a variety of tools, including reliable and explainable indicators.
So, while there are cases where statistics and common sense can diverge, unravelling the truth and explaining it accurately is exactly the role of official statistics. Making an effort to communicate results is very much part of the package, too. Only by bridging the gap between the complexity of statistical methods and users’ need for clear and understandable results can official statistics be seen by public opinion as the most trustworthy source of information.
But if statisticians ignore these pressures, then people will continue to base their judgements on unreliable figures or indicators. In doing so, they will become even more convinced that all statistics, whatever their source, are beneath even the damnedest of lies.
Best, J. (2001), Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists, University of California Press.
©OECD Observer No 237, May 2003