The Problems with Statistics

by

Richard F. Taflinger

This is Part Three of a Four Part series on finding support for papers and speeches.

Statistics

Things to Consider about Statistics

Who did the study?
What are the statistics measuring?
Who was asked?
How were they asked?
Compared with what?

Another form of evidence is statistics. Statistics are a favorite form of evidence for many writers and speakers. They provide actual numbers in support of ideas and conclusions. If you can show that 75% of high school seniors cannot find Washington State on a map of North America, then it is strong evidence for your contention that high school seniors are not being taught the geography of the United States. Such evidence is not only difficult to refute, it's often accepted as the final word in what's true or not true.

Statistics are a prime source of proof that what you say is true. Statistics are based on studies: searches for connections between facts that at first appear unrelated. If you remember your math classes, you will recall the concept of sets and subsets. Statistics are, in large measure, concerned with that concept. They are basically telling you the proportion a subset represents in a set. To clarify this idea, look at political polls. Candidate A receives 46% approval, Candidate B receives 43% approval. Thus, the subset "responses favoring Candidate A" is 46% of the whole set, "People asked about Candidates A and B."
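The set-and-subset idea can be made concrete in a few lines of Python. This is only a minimal sketch: the respondent counts are invented so that they reproduce the 46% and 43% figures in the poll example, and the function name `subset_share` is an assumption made for illustration.

```python
def subset_share(subset_count, set_count):
    """Return the percentage of the whole set that the subset represents."""
    if set_count <= 0:
        raise ValueError("the whole set must contain at least one member")
    return 100.0 * subset_count / set_count

# Hypothetical poll: 1,000 people asked about Candidates A and B.
people_asked = 1000
favor_a = 460
favor_b = 430
neither = people_asked - favor_a - favor_b  # everyone not in either subset

share_a = subset_share(favor_a, people_asked)
share_b = subset_share(favor_b, people_asked)
share_neither = subset_share(neither, people_asked)

print(f"Candidate A: {share_a:.0f}% of people asked")
print(f"Candidate B: {share_b:.0f}% of people asked")
print(f"Neither:     {share_neither:.0f}% of people asked")
```

Note that the percentages describe only the set "people asked". Whether they say anything about a larger population depends on how those people were chosen.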

Another example, from real life. During the great cholera plague in London in 1831, William Chadwick, with his assistant William Farr, drew together data on who was getting the disease and where in London they were getting it. They were looking for some common factor that would point to the source of the disease. Their statistics led them to the conclusion that the polluted waters of the Thames River were the source, and that a particular pump supplying water to certain neighborhoods was a prime source of infection. With these data they were able to make recommendations which did much to reduce the incidence of cholera in London.

Statistics also use samples to obtain results, rather than doing actual "head counts". Nielsen ratings on how many of what kind of people watch a particular TV program are not determined by the Nielsen company asking all 300 million people in the United States what they are watching every few minutes. What they use is a sample of the population (called the Nielsen families) that, demographically, represents the 300 million people. Nielsen selects these families very carefully since each one represents the viewing habits and desires of some 60,000 people. Nonetheless the statistics generated by the Nielsen measurements are used to make programming decisions and set advertising rates and budgets, things that represent billions of dollars. Thus the selection of the sample, whether for Nielsen's ratings or for the incidence of AIDS in the US population, is of paramount importance in the validity of the statistics thus generated.
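How a small sample can stand in for a large population can be sketched with a simulation. Everything below is hypothetical: the population of 100,000 households, the 20% viewing share, and the function name `estimate_share` are assumptions chosen for illustration, not a description of how any ratings company actually works. The only figures taken from the text are the 300 million people and the roughly 60,000 people per sampled family, which together imply a sample of about 5,000 families.

```python
import random

def estimate_share(population, sample_size, rng):
    """Estimate the share of viewers from a simple random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

rng = random.Random(1996)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 1 means the household is watching the program.
true_share = 0.20
population = [1] * 20_000 + [0] * 80_000

sample_size = 5_000
estimate = estimate_share(population, sample_size, rng)

# Arithmetic from the text: 300 million people spread over 5,000 families.
people_per_family = 300_000_000 // sample_size

print(f"True share:      {true_share:.1%}")
print(f"Sample estimate: {estimate:.1%}")
print(f"Each sampled family stands for {people_per_family:,} people")
```

The estimate lands close to the true share only because every household had an equal chance of being picked. A sample drawn from one neighborhood or one kind of household would carry no such guarantee.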

The above is, of course, a simplistic view of an extremely complicated discipline. It is, nonetheless, the essence of statistics.

Statistics are invaluable as evidence in support of conclusions. If you can either find or generate statistics that show the truth of your conclusions, there are few who could refute your ideas.

There are, of course, problems with using statistics as evidence. Let me remind you of a famous saying: "There are three ways to not tell the truth: lies, damned lies, and statistics." If one believes in the truth of statistics (and there are many who do), then how does one explain that the same Presidential candidate can be 20 points ahead and 5 points behind his opponent in the polls at the same time? After all, both polls are "statistics". What you must do, if you wish to use statistics as evidence, is ask yourself some questions: who did the study that came up with the statistics, what exactly are the statistics measuring, who was asked, how were they asked, and compared with what?

Who Did the Study?

Let us examine first "who did the study." We live in a world of statistics: you can find numbers in support of just about any idea. The problem arises when you find statistics that support every way of viewing an idea. You can find statistics that show cigarettes are killers and that they have no effect on anyone's health. You can find statistics that say you should cut down on the consumption of dairy products and that dairy products are good for you. You can find statistics that prove that soft drinks will give you cancer and that they have no effect on anything but your thirst (or even that they make you thirstier). Every one of these sets of statistics is absolutely true.

The phrase "numbers don't lie" is true; what you need to examine is who is publishing the numbers, and what they are trying to prove with them. Are the statistics provided by the American Cancer Society or the American Tobacco Institute? Are they provided by the American Medical Association or the American Dairy Association? Are they provided by the Cancer Institute or the United States Food and Drug Administration? (Did the latter give you pause? It should. Both are reputable. Yet both have differing opinions based on statistics.)

Every point of view uses statistics to support its ideas. It's your job to examine all statistics supporting all points of view, to arrive at your own conclusions based on all of them. If you can't arrive at a conclusion, do your own study. An easier course, naturally, is to find out what all possible sides have to say and what other evidence they have in support of their statistics.

Once you have determined whether or not there is prejudice involved in the statistics (please recall that subjectivity is unavoidable), then it is time to move on to the next question: what are the statistics measuring?

What Are the Statistics Measuring?

When asking yourself, "what are the statistics measuring," bear in mind the old saw about measuring apples and oranges. Most people will say that you can't compare apples and oranges. This is both true and false. It depends on WHAT YOU ARE MEASURING. Color? No. Texture? No. Overall appearance? No. Acidity? Yes. Sugar content? Yes. Vitamin, mineral, carbohydrate, or fat content? Yes.

As you can see, it is possible to compare apples and oranges, if you know what you are measuring. Your job, in using statistics as evidence, is to determine what exactly is being measured, and not simply spout numbers that seem to apply to your topic. If your topic is "Nutritional Value of Oranges," statistics proving that apples are nothing like oranges may be measuring the wrong things.

Who Was Asked?

Once you've determined what the statistics are measuring, you next need to find out how the research was done. Many studies, the results of which are disseminated using statistics, are done by asking people their opinions, or what they do or think or feel, and so on. Such studies include political, sociological, consumer behavior, media audience, and other areas which are based on individual people's ideas, opinions and/or attitudes.

Such areas are often referred to as "soft sciences", as opposed to "hard sciences" that do research designed to minimize as much as possible the human factor in the evidence and conclusions. The "human factor" is, naturally, impossible to eliminate totally as long as humans are involved, but the studies, to be "scientific," must be repeatable and predictive in nature. That is, once a study has been done, equivalent results must appear when the study is done again by other researchers who have no connection with the original researchers, and the results should allow researchers to say what will happen next.

Let us say that scientific statistics show meteors fall during a specific period (say, August) at an average rate (say, 60 per hour). This study is repeated several years during August and the rate stays the same. Thus the study is repeatable. From those statistics it is possible to predict that in future years the average rate of shooting stars in August will continue to be 60 per hour. In this case, "who is being asked" are the impersonal forces of nature.
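The two tests of repeatability and prediction can be sketched roughly as follows. The yearly counts are invented for illustration, and the function name `is_repeatable` and the tolerance of 5 meteors per hour are assumptions, not established criteria.

```python
from statistics import mean

def is_repeatable(rates, tolerance):
    """True if every repeated measurement lies within `tolerance` of the overall mean."""
    centre = mean(rates)
    return all(abs(rate - centre) <= tolerance for rate in rates)

# Hypothetical average meteors per hour, observed in five different Augusts.
august_rates = [59, 61, 60, 58, 62]

# Hypothetical measurements that swing widely from one repetition to the next.
erratic_rates = [20, 95, 60, 41, 84]

tolerance = 5  # assumed allowance, in meteors per hour

steady = is_repeatable(august_rates, tolerance)
unsteady = is_repeatable(erratic_rates, tolerance)
prediction = mean(august_rates)  # best guess for next August

print(f"August counts repeatable:  {steady}")
print(f"Erratic counts repeatable: {unsteady}")
print(f"Predicted rate for next August: {prediction} per hour")
```

Both series have the same average of 60, which is the point of the comparison: an average alone does not show whether a study is repeatable. Only the spread across repetitions does.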

It is the soft sciences that most often, intentionally or unintentionally, misuse or misapply statistics. The studies are often not repeatable and usually not predictive. The reason for this is that people and what they say or do are the bases of the statistics. It seems axiomatic that people will perversely refuse to say or do the same thing twice running, or let anyone predict what they will do. In fact, many people consider themselves insulted when called predictable, and anything from the weather to the time of day to who's asking the question can change what they will say or do about something.

What does this mean to you as you examine the statistics you plan on using as evidence? First, try to determine whether the statistics are hard or soft science based. The simplest way to do this is to find out whether people or nature is being studied. If nature, it's hard science; if people, it's soft.

Second, if the statistics are hard science, check to see what results other researchers who have repeated the study obtained. If the second study has results that vary widely from the first, find a third and/or fourth and use the results that are consistent overall.

Of course, hard science statistics often require that you examine who was asked. Check the sample: if the statistics say that 30% of the US population has AIDS, what was the sample? The entire population of the US? The population of New York or San Francisco? The population of Ottumwa, Iowa? Or a selection of towns and cities, rural, urban and suburban, in all parts of the country? Statistics on the incidence of rape in the US vary wildly depending on whether the study asks law enforcement or rape counseling centers (one set is based on the number of reported rapes, the other on the number of women needing counseling whether or not they reported the rape to law enforcement). Both examples above appear to be hard science, since they are based on "hard" facts, but nonetheless must be examined for who was asked.
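How much the answer depends on who was sampled can be shown with a small calculation. All of the places and counts below are made up for illustration, as is the function name `rate_per_100k`. The sketch only shows that the same question yields different numbers depending on which places are included.

```python
# Hypothetical places: (name, population, people with the condition).
places = [
    ("large city A", 8_000_000, 80_000),
    ("large city B", 800_000, 16_000),
    ("small town", 25_000, 25),
    ("rural county", 175_000, 350),
]

def rate_per_100k(rows):
    """Cases per 100,000 people across the given places."""
    population = sum(pop for _, pop, _ in rows)
    cases = sum(count for _, _, count in rows)
    return 100_000 * cases / population

# The "same" statistic, computed from three different choices of who was asked.
city_b_only = rate_per_100k([places[1]])
small_town_only = rate_per_100k([places[2]])
all_places = rate_per_100k(places)

print(f"Large city B only: {city_b_only:,.0f} per 100,000")
print(f"Small town only:   {small_town_only:,.0f} per 100,000")
print(f"All four places:   {all_places:,.0f} per 100,000")
```

In this made-up data the rate for large city B alone is twenty times the rate for the small town alone, and neither matches the combined figure. A statistic quoted without its sample tells you very little.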

Soft science statistics are even more slippery than hard science statistics. First, there are few hard, repeatable, non-subjective facts on which to base the statistics. If you wish to show how people react to violence, how do you define violence? And how do the people in your study define violence? A victim of a mugging may define violence as getting within five feet of him, while a mugger may define it as anything that causes him physical damage (what he does to others is simply high spirits).

Also bear in mind that any study that uses human subjects is almost impossible to conduct under laboratory conditions, in which all factors that could affect the outcome of the experiment are controlled, including the variable under study. For a truly statistically valid study showing the effects of television violence on children, the children would have to be isolated from all other factors that could have an influence. These other factors would include contact with other human beings and with other expressions of violence (people, reading, radio, movies, newspapers, video games, etc.). This would obviously work to the social and developmental detriment of the children.

As a matter of fact, a recent controversy arose over using medical data collected by the Nazis in the concentration camps. These data were collected with absolutely no regard for the fact that the test subjects were human beings; they were treated much worse than any laboratory animal in the world today. Ethical and moral considerations aside, the data are viewed as valuable. However, there are people who believe that the ethical and moral considerations are paramount, and that the data, no matter how valuable, should be destroyed because of the way they were gathered.

In addition to the fact that any study involving humans must take into account human and humane considerations, you should never underestimate the perversity of a human being. In studying comedy one of the first things I learned was never tell the audience I was going to be funny. The moment a comedian says to an audience, "You're really going to find this funny," the same audience that moments before was falling out of their chairs laughing will turn cold and silent, with an "Oh, yeah? Show me" attitude.

In the same vein, a truism in advertising is that fifty percent of advertising works; the problem is no one can figure out which fifty percent. The reason is that no one can really figure out what will influence people to buy products.

To try to understand "soft" statistics, let's take a look at advertising research and consumer behavior, both of which are subsets of socio- and psychological research. In particular, we'll look at some basic axioms of consumer research that apply to any soft statistics.

First is the realization that all people are different. No two people, not even identical twins, have exactly the same background and upbringing, have had the same conversations in the same words, have read the same books or magazines or newspapers at exactly the same time, or done anything the same as anyone else. This fact is precisely the opposite of what is necessary to statistics -- that there are similarities that give significance to the variables.

There are, of course, some factors that many people have in common with other people, and upon them statistics depend. These factors can include the society in which they live, their social class, whether they are urban, suburban or rural; their relationships -- most people have had a mother and father, perhaps siblings, friends of the same or opposite sex; and their interests: sports, television, reading science fiction or mysteries or romances. Of course, not everybody fits into all categories. Again, all people are different, but they do have some things in common.

What the above means is that no statistic has any application to an individual, but can have an application to the group. However, the statistics are determined on the basis of studying individuals in the group, not studying the group. Now recall the problems with individuals. First, individuals change, not only from year to year but from moment to moment.

Second, individuals are inconsistent. What they like today they may hate tomorrow. You may love spaghetti, but eat it five days in a row, and you may find the thought of eating it again nauseating.

Third, individuals often don't know what they want, and even if they do, they don't know or can't tell you why.

Then there are a few problems involved in surveying individuals to gather the information to formulate the statistics. First, people often can't remember information about themselves and thus the background can be incomplete. If you don't believe this, recall exactly when you got your last tetanus booster shot, or the grade you got in freshman English in high school.

Second, there is a prestige bias. Answers a person gives involve the person personally -- his or her pride, self-esteem and self-image are involved. Thus people will often give an answer that will heighten their image. According to TV viewing diaries, nobody watches professional women's wrestling, but Masterpiece Theatre has a 50 rating. In some classes a few years ago I ran a survey that, as a part of the background, asked "How many hours do you watch television during an average week?" The average answer was seven hours per week (please recall that the national average is seven hours per day). Granted, college students do not usually have a great deal of time to devote to watching TV, but the classes in which I gave this survey were advertising and mass media criticism, both of which require watching television. What's more, for people who avowed little interest in television, these same students had a near encyclopedic knowledge of details about programs and/or commercials that were discussed, in many cases rivaling my own (I watch television an average of eight hours per day). It was clear that the responses on the survey bore little relationship to reality. Nonetheless, I was not surprised at the responses. Television watching traditionally has a prestige problem, and prestige bias clearly influenced how people answered the question.
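The size of the gap in the television survey is easier to see once both figures are put on the same weekly basis. The short calculation below uses only the two numbers from the paragraph above; the variable names are chosen for illustration.

```python
DAYS_PER_WEEK = 7

reported_hours_per_week = 7   # average answer on the class survey
national_hours_per_day = 7    # national average cited above

# Convert the national figure to hours per week so the two are comparable.
national_hours_per_week = national_hours_per_day * DAYS_PER_WEEK
gap_ratio = national_hours_per_week / reported_hours_per_week

print(f"Reported by students: {reported_hours_per_week} hours per week")
print(f"National average:     {national_hours_per_week} hours per week")
print(f"The national figure is {gap_ratio:.0f} times the reported one")
```

A sevenfold difference is too large to explain by busy schedules alone, which is why prestige bias is the more plausible reading.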

Third, people lie. That may seem a bit blunt, but there is no reason to sugarcoat it. People do not merely stretch the truth, fib or misspeak. They lie. Ask them a question and, just for the hell of it, they may lie. They may lie because they find the truth uncomfortable or embarrassing, or because they simply want to screw up your results. With lying a virtual social necessity (do you really tell your best friend that his or her breath could knock a buzzard off a honey wagon?), the fact that people lie when responding to studies should come as no surprise.

Finally, many studies not only try to find out what people do, but why they do it. Here the problem lies in respondents' inability to articulate or explain their true feelings and motivations. Many people do things because it "feels" like the thing to do, but they cannot explain what that feeling is or how it arose. They will do the best they can, but since so many such feelings are subconscious and/or based on a priori assumptions, they have never been examined and put into words.

How Were They Asked?

It is not only the respondents but the questioners that contribute their own prejudice to the gathering of facts.

Two things that are used in surveys and statistical studies are questions and answers. First, let's examine the questions.

Researchers generally have an idea what their research is looking for. They thus formulate questions that will illuminate their research, either pro or con. Prejudice can creep in when a researcher unconsciously words questions in such a way that the answers support his or her contention or opinion. Various questions of this type are leading questions, loaded questions, and double-barreled questions.

Leading questions are those that tell the respondent how to answer. Attorneys sometimes use them. For example, "Is it not true that on the night of the 27th you were drunk?" Such a question leads the respondent to say yes. Asking instead, "Were you drunk on the night of the 27th?" does not tell the witness how to respond.

Loaded questions are those that, no matter how they are answered, the respondent loses. "Are you still beating your wife?" and "Are you still cheating on your income tax?" are examples. A loaded question appears to ask for a yes or no answer, yet the actual answer may be neither yes nor no.

Double-barreled questions are those that ask for more than one piece of information in the same question. For example, "Do you go up or downtown in the afternoon?" is double-barreled.

Another point to be considered is how the questions were worded. It is easy, and often subconscious, for the questioner to word the questions in such a way as to lead the respondent to reply in a certain way. For example, a survey on whaling could ask, "Should the only three countries in the world that do so continue to slaughter to extinction the helpless, harmless, intelligent giants of the deep?" I surmise that few people would respond with a yes.

It is the answers that sometimes cause difficulty for a researcher. The problems lie not only in how the respondents answer, but in how the researcher responds to the answer. Sometimes the response is not what the researcher wants or needs and/or contradicts expectations. He or she must then account for the anomaly. He or she may revamp the original concept or theory, revamp the study, or even ignore the data. The researcher may fall prey to selective perception (seeing only what you want to see) or cognitive dissonance (rationalizing away anything that doesn't fit into your preconceptions). In addition, how the researcher interprets the words in the questions may be at odds with how the respondents interpreted the words. For example, in a recent survey on the incidence of rape on college campuses, the questions used phrases such as "unwelcome sexual advance"; the researcher interpreted "unwelcome sexual advance" as rape, while the respondents could well have been referring to a drunk at a bar making a pass, something that most people would accept as disgusting, but not rape.

The order of the questions can also be a problem. Often, the questions can lead a respondent to answer in a certain way because he or she has answered all the previous questions in the same way. In sales, it's a common technique that can lead a respondent through a series of yes answers, from "it's a nice day," to "sign here."

Thus "How were they asked?" requires an examination of the original study in order to see if the researcher may have made an error in questioning and in understanding the answers.

Compared with What?

Finally, you need to examine statistics to determine what comparisons are being drawn and whether they are relevant and valid. For example, say your topic is gun control. You could find statistics on murder rates with handguns per capita in New York City, London and Tokyo. Such statistics would show much higher rates in New York than in the other two cities. It would therefore appear that gun control is a good idea since guns are controlled in London and Tokyo. However, such statistics must be suspect, not because they are wrong (more people are indeed murdered with handguns in New York City than in London or Tokyo), but because they don't tell the whole story.

For instance, New York has an extremely stringent weapons control law (the Sullivan Act). Since this is the case, what happens to the argument that control laws work? There must be something else influencing the murder rate.

What about the culture? The United States is unlike any other country on Earth. Its society has a tradition of independence and self-sufficiency, where if you have a problem it is normal for you to take care of it yourself, even if you can't. It is also a country that used to be called "the melting-pot" but is now known as the "mosaic", with New York City a patchwork of often conflicting cultures, languages, customs and attitudes. Add in the traditions of the old West and "gunslinging" becomes an apparently viable option to solve problems. Japan, on the other hand, is an extremely homogeneous and traditional culture, with little in the way of overt class or cultural conflict. England is also very traditional with far less cultural conflict (any country that feels no necessity to arm its police does not have a tradition of individual use of force to solve problems). However, now as England is becoming more culturally and ethnically diverse, there is a rising incidence of violence and use of guns.

From the above it is clear that any statistics on murder rates say nothing about the efficacy of gun control laws, but rather about the cultural and/or societal factors that make such laws ineffective. If you wish statistics to serve as evidence for a gun control law, find something else.

For the above reasons you must search for other evidence to support whatever statistics you use as support, if only to show that the statistics actually apply.

Do not, however, take all the problems outlined above as a condemnation of statistics as evidence. Statistics are excellent evidence, and often the easiest and most concise way to express evidence. I merely wish you to be aware you must examine them for relevance, validity and authority or they can do you more harm than good in proving your point.

You can reach me by e-mail at: richt@turbonet.com

This page was created by Richard F. Taflinger. Thus, all errors, bad links, and even worse style are entirely his fault.

Copyright © 1996, 2011 Richard F. Taflinger.
This and all other pages created by and containing the original work of Richard F. Taflinger are copyrighted, and are thus subject to fair use policies, and may not be copied, in whole or in part, without express written permission of the author richt@turbonet.com.

Disclaimers
The information provided on this and other pages by me, Richard F. Taflinger (richt@turbonet.com), is under my own personal responsibility and not that of Washington State University or the Edward R. Murrow College of Communication. Similarly, any opinions expressed are my own and are in no way to be taken as those of WSU or ERMCC.

In addition,
I, Richard F. Taflinger, accept no responsibility for WSU or ERMCC material or policies. Statements issued on behalf of Washington State University are in no way to be taken as reflecting my own opinions or those of any other individual. Nor do I take responsibility for the contents of any Web Pages listed here other than my own.