Statistics Hacks: Tips & Tools for Measuring the World and Beating the Odds
By Bruce Frey
May 2006
Pages: 356


Table of Contents

Chapter 1: The Basics
There's only a small group of tools that statisticians use to explore the world, answer questions, and solve problems. It is the way that statisticians use probability or knowledge of the normal distribution to help them out in different situations that varies. This chapter presents these basic hacks.
Taking known information about a distribution and expressing it as a probability [Hack #1] is an essential trick frequently used by stat-hackers, as is using a tiny bit of sample data to accurately describe all the scores in a larger population [Hack #2]. Knowledge of basic rules for calculating probabilities [Hack #3] is crucial, and you gotta know the logic of significance testing if you want to make statistically-based decisions [Hacks #4 and #8].
Minimizing errors in your guesses [Hack #5] and scores [Hack #6] and interpreting your data [Hack #7] correctly are key strategies that will help you get the most bang for your buck in a variety of situations. And successful stat-hackers have no trouble recognizing what the results of any organized set of observations or experimental manipulation really mean [Hacks #9 and #10].
Learn to use these core tools, and the later hacks will be a breeze to learn and master.
Know the Big Secret
Statisticians know one secret thing that makes them seem smarter than everybody else.
The primary purpose of statistics as a scientific methodology is to make probability statements about samples of scores. Before we jump into that, we need some quick definitions to get us rolling, both to understand this hack and to lay a foundation for other statistics hacks.
Samples are numeric values that you have gathered together and can see in front of you that represent some larger population of scores that you have not gathered together and cannot see in front of you. Because these values are almost always numbers that indicate the presence or level of some characteristic, measurement folks call these values scores. A probability statement is a statement about the likelihood of some event occurring.
Probability is the heart and soul of statistics. A common perception of statisticians, in fact, is that they mainly calculate the exact likelihood that certain events of interest will occur, such as winning the lottery or being struck by lightning. Historically, the person who had the tools to calculate the likely outcome of a dice game was the same person who had the tools to describe a large group of people using only a few summary statistics.
So, traditionally, the teaching of statistics includes at least some time spent on the basic rules of probability: the methods for calculating the chances of various combinations or permutations of possible outcomes. More common applications of statistics, however, are the use of descriptive statistics to describe a group of scores, or the use of inferential statistics to make guesses about a population of scores using only the information contained in a sample of scores. In social science, the scores usually describe either people or something that is happening to them.
It turns out, then, that researchers and measurers (the people who are most likely to use statistics in the real world) are called upon to do more than calculate the probability of certain combinations and permutations of interest. They are able to apply a wide variety of statistical procedures to answer questions of varying levels of complexity without once needing to compute the odds of throwing a pair of six-sided dice and getting three 7s in a row.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Describe the World Using Just Two Numbers
Most of the statistical solutions and tools presented in this book work only because you can look at a sample and make accurate inferences about a larger population. The Central Limit Theorem is the meta-tool, the prime directive, the king of all secrets that allows us to pull off these inferential tricks.
Statistics provide solutions to problems whenever your goal is to describe a group of scores. Sometimes the whole group of scores you want to describe is in front of you. The tools for this task are called descriptive statistics. More often, you can see only part of the group of the scores you want to describe, but you still want to describe the whole group. This summary approach is called inferential statistics. In inferential statistics, the part of the group of scores you can see is called a sample, and the whole group of scores you wish to make inferences about is the population.
It is quite a trick, though, when you think about it, to be able to describe with any confidence a population of values when, by definition, you are not directly observing those values. By using three pieces of information—two sample values and an assumption about the shape of the distribution of scores in the population—you can confidently and accurately describe those invisible populations. The set of procedures for deriving that eerily accurate description is collectively known as the Central Limit Theorem.
Inferential statistics tend to use two values to describe populations, the mean and the standard deviation.
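To see why two numbers can carry so much weight, here is a minimal simulation sketch in Python. The population is hypothetical and deliberately skewed, and the sample size of 30 and the name `sample_means` are assumptions made only for illustration. The point it tries to show is that the means of repeated samples cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical, deliberately skewed population of 100,000 scores.
population = [random.expovariate(1 / 50) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def sample_means(sample_size, n_samples=2_000):
    """Draw many random samples and return the mean of each one."""
    return [
        statistics.fmean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(sample_size=30)

print(f"population mean: {pop_mean:.2f}, population SD: {pop_sd:.2f}")
print(f"mean of the sample means: {statistics.fmean(means):.2f}")
print(f"SD of the sample means: {statistics.stdev(means):.2f}")
print(f"population SD / sqrt(30): {pop_sd / 30 ** 0.5:.2f}")
```

The last two printed values should land close to each other, which is the practical payoff: a mean and a standard deviation from one sample are enough to estimate how far off the sample mean is likely to be.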

Mean

Rather than describe a sample of values by listing every single score, it is more efficient to report some fair summary of the group. This single number is meant to fairly represent all the scores and what they have in common. Consequently, this single number is referred to as the central tendency of a group of scores.
Typically, the best measure of central tendency, for a variety of reasons, is the mean
Figure the Odds
Will I win the lottery? Will I get struck by lightning and hit by a bus on the same day? Will my basketball team have to meet our hated rival early in the NCAA tournament? At its core, statistics is all about determining the likelihood that something will happen and answering questions like these. The basic rules for calculating probability allow statisticians to predict the future.
This book is full of interesting problems that can be solved using cool statistical tricks. While all the tools presented in these hacks are applied in different ways in different contexts, many of the procedures used in these clever solutions work because of a common core set of elements: the rules of probability.
The rules are a key set of simple, established facts about how probability works and how probabilities should be calculated. Think of these two basic rules as a set of tools in a beginner's toolbox that, like a hammer and screwdriver, are probably enough to solve most problems:
Additive rule
The probability of any one of several mutually exclusive events occurring is the sum of each event's probability.
Multiplicative rule
The probability of a series of independent events all occurring is the product of each event's probability.
These two tools will be enough to answer most of your everyday "What are the chances?" questions.
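As a quick illustration, the two rules can be sketched in a few lines of Python. The function names `p_any` and `p_all` are made up for this example, and the dice questions are just convenient test cases. Keep in mind that adding probabilities works only when the events cannot happen together, and multiplying works only when the events do not influence each other.

```python
from fractions import Fraction


def p_any(*probabilities):
    """Additive rule: chance that any one of several mutually exclusive events occurs."""
    return sum(probabilities, Fraction(0))


def p_all(*probabilities):
    """Multiplicative rule: chance that a series of independent events all occur."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


one_face = Fraction(1, 6)  # any single face of a fair six-sided die

# Rolling a 1 or a 2 on a single throw (the two outcomes cannot both happen).
p_one_or_two = p_any(one_face, one_face)

# Rolling a 6 on each of two separate throws.
p_two_sixes = p_all(one_face, one_face)

# A pair of dice totals 7 in 6 of the 36 equally likely combinations.
p_seven = p_any(*[Fraction(1, 36)] * 6)

# Three 7s in a row with a pair of dice.
p_three_sevens = p_all(p_seven, p_seven, p_seven)

print(p_one_or_two, p_two_sixes, p_seven, p_three_sevens)
# prints: 1/3 1/36 1/6 1/216
```

Using `Fraction` instead of floating-point numbers keeps the answers in the "1 out of N" form that the rest of this hack uses.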
When a statistician says something like "a 1 out of 10 chance of happening," she has just made a prediction about the future. It might be a hypothetical statement about a series of events that will never be tested, or it might be an honest-to-goodness statement about what is about to happen. Either way, she's making a statistical statement about the likelihood of an outcome, which is just about all statisticians ever say [Hack #1].
If the following statement makes some intuitive sense to you, then you have all the ability necessary to act and think like a stat hacker: "If there are 10 things that might happen and all 10 things are equally likely to happen, then any 1 of those things has a 1 out of 10 chance of happening."
Reject the Null
Experimental scientists make progress by making a guess that they are sure is wrong.
Science is a goal-driven process, and the goal is to build a body of knowledge about the world. The body of knowledge is structured as a long list of scientific laws, rules, and theories about how things work and how they are. Experimental science introduces new laws and theories and tests them through a logical set of steps known as hypothesis testing.
A hypothesis is a guess about the world that is testable. For example, I might hypothesize that washing my car causes it to rain or that getting into a bathtub causes the phone to ring. In these hypotheses, I am suggesting a relationship between car washing and rainfall or between bathing and phone calls.
A reasonable way to see whether these hypotheses are true is to make observations of the variables in the hypothesis (for the sake of sounding like statisticians, we'll call that collecting data) and see whether a relationship is apparent. If the data suggests there is a relationship between my variables of interest, my hypothesis is supported, and I might reasonably continue to believe my guess is correct. If no relationship is apparent in the data, then I might wisely begin to doubt that my hypothesis is true or even reject it altogether.
There are four possible outcomes when scientists test hypotheses by collecting data. The table below shows the possible outcomes for this decision-making process.
Table: Possible outcomes of research hypothesis testing
 | Hypothesis is correct: the world really is this way | Hypothesis is wrong: the world really is not this way
Data does support hypothesis: accept hypothesis | A. Correct decision: |
Go Big to Get Small
The best way to shrink your sampling error is to increase your sample size.
Whenever researchers are playing around with samples instead of whole populations, they are bound to make some mistakes. Because the basic trick of inferential statistics is to measure a sample and use the results to make guesses about a population [Hack #2], we know that there will always be some error in our guesses about the values in those populations. The good news is that we also know how to make the size of those errors as small as possible. The solution is to go big.
An early principle suggested in a gambling context was presented by Jakob Bernoulli (in 1713), who called his principle the Golden Theorem. It was later labeled by others (starting with Siméon-Denis Poisson in 1837) as the Law of Large Numbers. It is likely the single most useful discovery in the history of statistics and provides the basis for the key generic advice for all researchers: increase your sample size!
The early history of the science of applied statistics (we're talking the 17th and 18th centuries) is framed in the language of gambling and probability. This might be because it gave the gentlemen scholars of the time an excuse to combine their intellectual pursuits with pursuits of a less intellectual nature. The Laws of Probability, of course, are legitimately the mathematical basis for statistical procedures and inferences, so it might be that gambling applications were used simply as the best teaching examples for these central statistical concepts.
One application of the Law is its effect on probability and occurrences. A consequence of the Law is that accuracy in predicting outcomes governed by chance improves by a known amount as trials accumulate. The expected distance between the probability of a certain outcome and the actual proportion of occurrences you observe decreases as the number of trials increases, and the exact size of this expected gap between expected and observed can be calculated. The generic name for this expected gap is the standard error [Hack #18].
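Here is a small sketch of that shrinking gap, assuming a fair coin (a probability of .5) as the chance process. The function names are invented for this example, and the standard error formula used is the usual one for a proportion, the square root of p(1 - p)/n, which is an assumption about which standard error applies here.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable


def observed_proportion(n_trials, p=0.5):
    """Simulate n_trials chance events and return the proportion of hits."""
    hits = sum(random.random() < p for _ in range(n_trials))
    return hits / n_trials


def standard_error(n_trials, p=0.5):
    """Expected gap between p and the observed proportion after n_trials."""
    return (p * (1 - p) / n_trials) ** 0.5


for n in (10, 100, 1_000, 10_000, 100_000):
    gap = abs(observed_proportion(n) - 0.5)
    print(f"n = {n:>7}: observed gap = {gap:.4f}, standard error = {standard_error(n):.4f}")
```

Any single observed gap can be larger or smaller than its standard error, but the standard error column falls steadily: making the sample 100 times bigger cuts the expected gap by a factor of 10.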
Measure Precisely
Classical test theory provides a nice analysis of the components that combine to produce a score on any test. A useful implication of the theory is that the level of precision for test scores can be estimated and reported.
A good educational or psychological test produces scores that are valid and reliable. Validity is the extent to which the score on a test represents the level of whatever trait one wishes to measure, and the extent to which the test is useful for its intended purpose. To demonstrate validity, you must present evidence and theory to support that the interpretations of the test scores are correct.
Reliability is the extent to which a test consistently produces the same score upon repeated measures of the same person. Demonstrating reliability is a matter of collecting data that represent repeated measures and analyzing them statistically.
Classical test theory, or reliability theory, examines the concept of a test score. Think of the observed score (the score you got) on a test you took sometime. Classical test theory defines that score as being made up of two parts and presents this theoretical equation:
Observed Score = True Score + Error Score
This equation is made up of the following elements:
Observed score
The actual reported score you got on a test. This is typically equal to the number of items answered correctly or, more generally, the number of points earned on the test.
True score
The score you should have gotten. This is not the score you deserve, though, or the score that would be the most valid. True score is defined as the average score you would get if you took the same test an infinite number of times. Notice this definition means that true scores represent only average performance and might or might not reflect the trait that the test is designed to measure. In other words, a test might produce true scores, but not produce
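The equation Observed Score = True Score + Error Score can be played with in a short simulation. This is only a sketch under stated assumptions: the true score of 80 and the error spread of 5 points are hypothetical, and the error is assumed to be random noise that averages out to zero.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

TRUE_SCORE = 80  # hypothetical true score for one test taker
ERROR_SD = 5     # hypothetical spread of the random error, in points


def observed_score():
    """One administration of the test: true score plus random error."""
    return TRUE_SCORE + random.gauss(0, ERROR_SD)


# Stand-in for "an infinite number of times": 10,000 repeated administrations.
scores = [observed_score() for _ in range(10_000)]

print(f"first five observed scores: {[round(s, 1) for s in scores[:5]]}")
print(f"average observed score: {statistics.fmean(scores):.2f}")
print(f"spread of observed scores: {statistics.stdev(scores):.2f}")
```

Individual observed scores bounce around, but their average settles near the true score, which matches the definition of true score as the long-run average.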
Measure Up
Four levels of measurement determine how the scores produced in measurement can be used. If you have not measured at the right level, you might not be able to play with those scores the way you want.
Statistical procedures analyze numbers. The numbers must have meaning, of course; otherwise, the exercises are of little value. Statisticians call numbers with meaning scores. Not all the scores used in statistics, however, are created equal. Scores have different amounts of information in them, depending on the rules followed for creating the scores.
When you decide to measure something, you must choose the rules by which you assign scores very carefully. The level of measurement determines which sorts of statistical analyses are appropriate, which will work, and which will be meaningful.
Measurement is the meaningful assignment of numbers to things. The things can be concrete objects, such as rocks, or abstract concepts, such as intelligence.
Here's an example of what I mean when I say not all scores are created equal. Imagine your five children took a spelling test. Chuck scored a 90, Dick and Jan got 80s, Bob scored 75, and Don got only 50 out of 100 correct. If a friend asked how your kids did on the big test, you might report that they averaged 75. This is a reasonable summary. Now, imagine that your five children ran a foot race against each other. Bob was first, Jan second, Dick third, Chuck fourth, and Don fifth. Your nosey friend again asks how they did. With a proud smile, you report that they averaged third place. This is not such a reasonable summary, because it provides no information. In both cases, though, scores were used to indicate performance. The difference lies only in the level of measurement used.
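The two averages from the example can be checked with a few lines of Python. The dictionary names are invented for this sketch; the scores and finishing places come straight from the story above.

```python
import statistics

# Spelling test: number of items correct out of 100.
spelling_scores = {"Chuck": 90, "Dick": 80, "Jan": 80, "Bob": 75, "Don": 50}

# Foot race: finishing place only.
race_places = {"Bob": 1, "Jan": 2, "Dick": 3, "Chuck": 4, "Don": 5}

mean_score = statistics.mean(list(spelling_scores.values()))
mean_place = statistics.mean(list(race_places.values()))

print(f"average spelling score: {mean_score}")  # 75
print(f"average finishing place: {mean_place}")  # 3
```

The average place is 3 no matter who finished where, because any five runners ranked 1 through 5 average out to third. That is why the number tells your nosey friend nothing.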
There are four levels of measurement—that is, four ways that numbers are used as scores. The levels differ in the amount of information provided and the types of mathematical and statistical analyses that can be meaningfully conducted on them. The four levels of measurement are
Power Up
Success in social science research is typically defined by the discovery of a statistically significant finding. To increase the chances of finding something, anything, the primary goal of the statistically savvy super-scientist should be to increase power.
There are two potential pitfalls when conducting statistically based research. Scientists might decide that they have found something in a population when it really exists only in their sample. Conversely, scientists might find nothing in their sample when, in reality, there was a beautiful relationship in the population just waiting to be found.
The first problem is minimized by sampling in a way that represents the population [Hack #19]. The second problem is solved by increasing power.
In social science research, a statistical analysis frequently determines whether a certain value observed in a sample is likely to have occurred by chance. This process is called a test of significance. Tests of significance produce a p-value (probability value), which is the probability of getting a sample value at least as extreme as the one observed if the sample had been drawn from a particular population of interest.
The lower the p-value, the more confident we are in our beliefs that we have achieved statistical significance and that our data reveals a relationship that exists not only in our sample but also in the whole population represented by that sample. Usually, a predetermined level of significance is chosen as a standard for what counts. If the eventual p-value is equal to or lower than that predetermined level of significance, then the researcher has achieved statistical significance.
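To make the idea concrete, here is a sketch of one simple test of significance. The scenario is hypothetical: a coin is flipped 100 times, and the population of interest is a fair coin. The function name `binomial_p_value`, the .05 level of significance, and the choice of an exact two-sided binomial test are all assumptions for illustration; other tests compute their p-values differently.

```python
from math import comb


def binomial_p_value(successes, trials, p_null=0.5):
    """Two-sided exact p-value: total probability, under the null population,
    of every outcome that is no more likely than the one observed."""
    def pmf(k):
        return comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # The tiny tolerance keeps equally likely outcomes from being dropped
    # because of floating-point rounding.
    return sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))


ALPHA = 0.05  # predetermined level of significance (an assumed, conventional choice)

for heads in (55, 60, 61, 65):
    p_value = binomial_p_value(heads, 100)
    verdict = "significant" if p_value <= ALPHA else "not significant"
    print(f"{heads} heads in 100 flips: p = {p_value:.4f} ({verdict})")
```

Notice how a single extra head can move a result from one side of the cutoff to the other, which is part of why the predetermined level has to be chosen before looking at the data.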
Statistical analyses and tests of significance are not limited to identifying relationships among variables, but the most common analyses (t tests, F tests, chi-squares, correlation coefficients, regression equations, etc.) usually serve this purpose. I talk about relationships here because they are the typical effect you're looking for.
Show Cause and Effect
Statistical researchers have established some ground rules that must be followed if you hope to demonstrate that one thing causes another.
Social science research that uses statistics operates under a couple of broad goals. One goal is to collect and analyze data about the world that will support or reject hypotheses about the relationships among variables. The second goal is to test hypotheses about whether there are cause-and-effect relationships among variables. The first goal is a breeze compared to the second.
There are all sorts of relationships between things in the world, and statisticians have developed all sorts of tools for finding them, but the presence of a relationship doesn't mean that a particular variable causes another. Among humans, there is a pretty good positive correlation [Hack #11] between height and weight, for example, but if I lose a few pounds, I won't get shorter. On the other hand, if I grow a few inches, I probably will gain some weight.
Knowing only the correlation between the two, however, can't really tell me anything about whether one thing caused the other. Then again, the absence of a relationship would seem to tell me about cause and effect. If there is no correlation between two variables, that would seem to rule out the possibility that one causes the other. The presence of the correlation allows for that possibility, but does not prove it.
Researchers have developed frameworks for talking about different research designs and whether such designs even allow for proof that one variable affects another. The different designs involve the presence or absence of comparison groups and how participants are assigned to those groups.
There are four basic categories of group designs, based on whether the design can provide strong evidence, moderate evidence, weak evidence, or no evidence of cause and effect:
Non-experimental designs
These designs usually involve just one group of people, and statistics are used to either describe the population or demonstrate a relationship between variables. An example of this design is a correlational study, where simple associations among variables are analyzed [Hack #11]. This type of design provides no evidence of cause and effect.
Know Big When You See It
You've just read about an amazing new scientific discovery, but is such a finding really a big deal? By applying effect size interpretations, you can judge the importance of such announcements (or lack thereof) for yourself.
Something is missing in most reports of scientific findings in nonscientific publications, on TV, on the radio, and—do I even have to mention—on the Web. Although reports in such media typically do a pretty good job of only reporting findings that are "statistically significant," this is not enough to determine whether anything really important or useful has been discovered. A big drug study can report "significant" results, but still not have found anything of interest to the rest of us or even other researchers.
As we repeat in many places in this book, significance [Hack #4] means only that what you found is likely to be true about the bigger population you sampled from. The problem is that this fact alone is not nearly enough for you to know whether you should change your behavior, start a new diet, switch drugs, or reinterpret your view of the world.
What you need to know to make decisions about your life and reality in light of any new scientific report is the size of the relationship that has just been brought to light. How much better is brand A than brand B? How big is that SAT difference between boys and girls in meaningful terms? Is it worth it to take that half an aspirin a day, every day, to lower your risk of a heart attack? How much lower is that risk anyway?
The strength of that relationship should be expressed in some standardized way, too. Otherwise, there is no way to really judge how big it is. Using a statistical tool known as effect size will let you know big when you see it.
An effect size is a standardized value that indicates the strength of a relationship between two variables. Before we talk about how to recognize or interpret effect sizes, let's begin with some basics about relationships and statistical research.
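As a sketch of what a standardized value looks like, the code below computes one common effect size, Cohen's d, which is the difference between two group means divided by their pooled standard deviation. Choosing d is an assumption for this example (there are other effect sizes), and the ratings for brand A and brand B are made up.

```python
import statistics


def cohens_d(group_a, group_b):
    """Difference between two group means, in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


# Hypothetical satisfaction ratings (1 to 10) for two brands.
brand_a = [7, 8, 6, 9, 7, 8]
brand_b = [6, 7, 5, 8, 6, 7]

d = cohens_d(brand_a, brand_b)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units rather than rating points, it can be compared across studies that used entirely different scales.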
Chapter 2: Discovering Relationships
There are invisible webs of relationships around us. Variable A causes Variable B, which influences Variable C, which is entirely independent of Variable D, unless Variable E comes into play. The hacks in this chapter allow you to discover these connections and describe them accurately. These are the hacks that reveal the hidden reasons for why people do the things they do and why things are the way they are.
The connections between one trait and another, between a cause and an effect, are relationships that are easily revealed—with the right tricks. Begin by identifying the strength of any association [Hack #11], and then draw what it looks like [Hack #12]. Next, use your knowledge of that relationship to make predictions [Hack #13], and then improve the accuracy of those predictions [Hack #14]. Some relationships appear through the observation of unexpected occurrences [Hacks #15 and #16] or by noticing real differences between groups [Hack #17].
Because we cannot measure every example of a person, fish, or pine tree that we might be interested in, we must rely on representative samples [Hack #19] to provide our observations. Sampling can mislead us [Hack #18], however, or it can work in surprisingly cool ways [Hack #20].
To share your findings with others or understand what these findings have to tell you, you need to avoid both being deceived and deceiving others. Be careful not to misinterpret any numbers [Hack #21] or pictures [Hack #22].
Pack these tools in your tool belt and head out to find whatever there is to find.
Discover Relationships
Revealing the invisible connections in the world is just a matter of recording observations and computing the magical, mystical correlation coefficient.
You probably make all sorts of assumptions about why people feel the way they feel or do the things they do. Statistical researchers would call these assumptions hypotheses about the relationship among variables.
Regardless of what science calls it, you probably do it. You might make these guesses about associations between attitudes and behavior or between attitudes and attitudes or behaviors and behaviors. You might do it informally as you seek to understand people in the world around you, or you might need to do it as a marketing specialist to understand your customer, or you might be a struggling psychology graduate student who needs to complete a class assignment that requires statistical analysis of the relationship between self-esteem and depression.
In statistics, such a relationship is called a correlation. The number describing the size of that relationship is a correlation coefficient. By computing this useful value, you can get answers to any question you have about relationships (except in terms of dating relationships; you're on your own there).
Imagine a study in which a researcher for the American Cheesecake Sellers Association has a hypothesis that the reason people like cheesecake is that they like cheese. She is guessing that there is a relationship between attitude toward cheese and attitude toward cheesecake. If her hypothesis turns out to be correct, she'll purchase the huge mailing list of cheese lovers from the American Cheese Lovers Association and send them informative brochures about the healing properties of cheesecake. If she's right, sales will rocket up!
To test her hypothesis, she creates two surveys. One asks respondents to say how they feel about cheese, and the other asks how they feel about cheesecake. A score of 50 means the person loves cheese (or cheesecake), and a score of 0 means the person hates cheese (or cheesecake). She collects data from five people on the bus on her way to work and records the results in a table.
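Here is a sketch of how a correlation coefficient could be computed for a study like this one, using the standard Pearson formula. The five pairs of scores below are hypothetical stand-ins on the 0 to 50 scale, not the researcher's actual data, and the function name `pearson_r` is invented for this example.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two sets of scores vary together...
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...compared with how much each varies on its own.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / (ss_x * ss_y) ** 0.5


# Hypothetical survey scores for five bus riders (0 = hates it, 50 = loves it).
cheese = [10, 20, 30, 40, 50]
cheesecake = [15, 25, 20, 45, 40]

r = pearson_r(cheese, cheesecake)
print(f"correlation between cheese and cheesecake attitudes: {r:.2f}")
```

A value near 1 means people who score high on one survey tend to score high on the other, a value near -1 means the opposite, and a value near 0 means the two attitudes have little to do with each other.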
Graph Relationships
Whenever a relationship between two variables is discovered and defined, we can use one variable to guess another. Drawing a regression line allows you to picture the relationship and make predictions.
So, you've just been named assistant regional manager of ice cream sales for 10,000 square feet of prime beachfront retail space along the shores of Sunflower Lake in northeast Kansas. Congratulations! You have a lot of responsibility and many strategic decisions to make about how to maximize profit. One dilemma that you will confront is whether to even open. Being open costs money and uses resources, and if you are going to sell only a few ice cream cones that day, it probably won't be worth it to even unlock the service window of your brightly painted plywood shack.
If only there were some way to magically know how good business will be on any given day. As an amateur statistician, you assume there must be a scientific way to guess how many cones will sell without having to actually open for business and test the market for the day. You're in luck. There is a way to make estimates of the value or score on some variable (such as ice cream sales) by using other information.
The key is that the other information must come from a variable that is related to the variable of interest. By drawing a line that shows the relationship among your variables for the days you know, you can look at the line as it extends into the future (or the past) for the days you do not know and guess what will happen. Such a graphic tool is called a regression line.
Observant folks often discover correlations between variables [Hack #11]. The usefulness of knowing that a relationship exists goes beyond descriptive statistics, however.
Imagine that you have data on the activities around Sunflower Lake. Among other things, you have collected information about the amount of ice cream sales under the former assistant regional manager of ice cream sales (in number of ice cream cones sold) and the high temperature for each day (in degrees Fahrenheit). The correlation coefficient that represents the relationship between heat and craving for ice cream should be positive and fairly large. That is, as the heat increases, sales probably increase.
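A regression line for this kind of data can be sketched with ordinary least squares. The temperatures and cone counts below are hypothetical (the former manager's real records are not shown here), and `fit_line` is a name made up for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical records: daily high temperature (degrees Fahrenheit) and cones sold.
high_temps = [70, 75, 80, 85, 90, 95]
cones_sold = [110, 135, 160, 170, 210, 225]

slope, intercept = fit_line(high_temps, cones_sold)

# Extend the line to a day you have no sales record for.
forecast_high = 88
predicted_cones = intercept + slope * forecast_high

print(f"each extra degree is worth about {slope:.1f} more cones")
print(f"predicted sales on an {forecast_high}-degree day: about {predicted_cones:.0f} cones")
```

With a forecast in hand, the decision to unlock the service window becomes a comparison between the predicted number of cones and the number you need to sell to cover the cost of opening.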
Use One Variable to Predict Another
Simple linear regression is a powerful tool for measuring something you cannot see or for predicting the outcome of events that have not happened yet. With some help from our special friend statistics, you can make a precise guess of how someone will score on one variable by looking at performance on another.
Many professionals, both in and outside of the social sciences, often need to predict how a person will perform on some task or score on some variable, but they cannot measure the critical variable directly. This is a common need when making admission decisions into college, for example. Admissions officers want to predict college performance (perhaps grade point average or years until completion). However, because the prospective student has not actually gone to college yet, admissions officers must use whatever information they can get now to guess what the future holds.
Schools often use scores on standardized college admissions tests as an indicator of future performance. Let's imagine that a small college decides to use scores on the American College Test (ACT) as a predictor of college grade point average (GPA) at the end of students' first years. The admissions office goes back through a few years of records and gathers the ACT scores and freshman GPAs for a couple hundred students. They discover, to their delight, that there is a moderate relationship between these two variables: a correlation coefficient of .55.
Correlation coefficients are a measure of the strength of linear relationships between two variables [Hack #11], and .55 indicates a fairly large relationship. This is good news because the existence of a relationship between the two makes ACT scores a good candidate as a predictor to guess GPA.
Simple linear regression is the procedure that produces all the values we need to cook up the magic formula that will predict the future. This procedure produces a regression line that we can graph to determine what the future holds [Hack #12], but once we have the formula, we don't actually need to do any graphing to make our guesses.
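As a rough sketch of how those values can be cooked up: with one predictor, the slope equals the correlation times the ratio of the two standard deviations, and the intercept makes the line pass through both means. Only the .55 correlation comes from the text. The means and standard deviations below are assumed purely for illustration, so the resulting numbers are not the admissions office's actual equation.

```python
r = 0.55          # correlation between ACT and GPA (from the text)

# Assumed summary statistics, for illustration only.
mean_act = 21.0   # hypothetical mean ACT score
sd_act = 4.0      # hypothetical standard deviation of ACT scores
mean_gpa = 3.0    # hypothetical mean freshman GPA
sd_gpa = 0.6      # hypothetical standard deviation of GPA

# Slope: how much predicted GPA changes per ACT point.
slope = r * (sd_gpa / sd_act)
# Intercept: forces the line through the point (mean_act, mean_gpa).
intercept = mean_gpa - slope * mean_act


def predict_gpa_sketch(act_score):
    """Predicted GPA under the assumed summary statistics above."""
    return intercept + slope * act_score


print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"Predicted GPA for an ACT of 25: {predict_gpa_sketch(25):.2f}")
```

Notice that no graph is needed: once the slope and intercept are known, a prediction is a single multiplication and addition.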
Additional content appearing in this section has been removed.
Use More Than One Variable to Predict Another
The super powers of predicting the future and seeing the invisible are available to any statistics hackers who feel they are worthy. Statisticians often answer questions and use correlational information to solve problems by using one variable to predict another. For more accurate predictions, though, several predictor variables can be combined in a single regression equation by using the methods of multiple regression.
"Graph Relationships" [Hack #12] discusses the useful prophetic qualities of a regression line. Those procedures allow administrators and statistical researchers to predict performance on assessments never taken, understand variables, and build theories about relationships among those variables. They accomplish these tricks using just a single predictor variable.
"Use One Variable to Predict Another" [Hack #13] presents the problem colleges have when deciding which applicants to admit. They want to admit students who will succeed, so they try to predict future performance. The solution in that hack uses one variable (a standardized test score) to estimate performance on a future variable (college grades).
Often, real-life researchers want to make use of the information found in a bunch of variables, not just one variable, to make predictions or estimate scores. When they want greater accuracy, scientists attempt to find several variables that all appear to be related to the criterion variable of interest (the variable you are trying to predict). They use all this information to produce a multiple regression equation.
You probably should read or reread "Use One Variable to Predict Another" [Hack #13] before going further with this hack, just to review the problem at hand and how regression solves it. Here is the equation we built in that hack for using a single predictor, ACT scores, to estimate future college grades:
Predicted GPA = -.24 + (ACT Score × .16)
This single predictor produced a regression equation with output that correlated .55 with the criterion. Pretty good, and pretty accurate, but it could be better.
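For readers who like to see an equation run, here is a small sketch that applies the single-predictor equation quoted above. The function name `predict_gpa` and the sample ACT scores are illustrative choices, not something from the book.

```python
def predict_gpa(act_score):
    """Apply the quoted equation: Predicted GPA = -.24 + (ACT Score x .16)."""
    return -0.24 + act_score * 0.16


# A few hypothetical applicants.
for act in (18, 24, 30):
    print(f"ACT {act}: predicted GPA {predict_gpa(act):.2f}")
```

A straight line does not know where the grading scale ends, so very high ACT scores can yield predictions above 4.0. That is a limit of the linear form, not a bug in the arithmetic.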
Additional content appearing in this section has been removed.
Identify Unexpected Outcomes
How do you know if your observations are correct or if you are just biased? How do you know when there is more or less of something than should have occurred by chance? You can find out for sure by using the flexible one-way chi-square test.
In science, the oldest type of observational research involved counting people, animals, and things:
  • How many people are on this boat?
  • What proportions of butterflies have little green spots on their wings?
As the field of inferential statistics matured, the questions became more specific:
  • Were an equal number of boys and girls born in London in 1812?
  • Are an equal number of crimes committed at different times of day?
The research question for these situations is "are they equal?" (or, at least, are they close enough that any fluctuations are probably due to chance). The implication of an unequal distribution is that something is going on. What, exactly, is going on cannot be answered by this sort of question. It is a start, though, and a good first question to ask.
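Here is a minimal sketch of how an "are they equal?" question can be put to the test with a one-way chi-square statistic. The crime counts are made up for illustration. The statistic sums, across categories, the squared gap between observed and expected counts divided by the expected count. The comparison value 7.815 is the standard .05 critical value for 3 degrees of freedom.

```python
# Hypothetical counts of crimes in four time-of-day blocks.
observed = {
    "morning": 30,
    "afternoon": 22,
    "evening": 18,
    "night": 50,
}

total = sum(observed.values())
# Under "they are all equal", every category expects the same count.
expected = total / len(observed)

# Chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE_05 = 7.815  # chi-square critical value, df = 3, alpha = .05

print(f"chi-square = {chi_square:.2f} with {degrees_of_freedom} df")
if chi_square > CRITICAL_VALUE_05:
    print("The counts are more unequal than chance alone would usually produce.")
else:
    print("The differences could easily be chance fluctuation.")
```

A large value says only that something is going on, not what. The critical value must change if the number of categories changes.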
Have you ever noticed that something seemed to be going on, but weren't sure if it was just your imagination? Do a greater number of hippies shop at the local community mercantile than would be expected by chance? If the answer is yes, and you are looking to meet hippies, you should start hanging out there.
In business, and for those who have to provide services, identifying where there is the most need is crucial. Observational data can be used to solve that problem. Even just in everyday life, we all have our beliefs (which might be biased) that are based on observations. I have noticed a lot of hippies at the community mercantile, but maybe I am just on the lookout for hippies when I am in that store. Are there really more hippies than normal there? More hippies than, say, nonhippies?
These sorts of questions can be answered using a statistical tool appropriate for seeing whether the number of "things" within each of a number of categories is more unequal than would normally be found by chance. This tool is named the
Additional content appearing in this section has been removed.
Identify Unexpected Relationships
If you want to verify whether a relationship you have observed between two variables is real, you have a variety of statistical tools available. A problem arises, though, when you have measured these variables without much precision, using categorical measurement. The solution is a two-way chi-square test, which, among other things, can be used to make unsubstantiated assumptions about the characteristics of people you have just met.
"Identify Unexpected Outcomes" [Hack #15] used the one-way chi-square test to make police scheduling decisions based on whether equal numbers of crimes were committed at different times of day. That tool works well to solve any analytical problem when:
  • The data is at the categorical level of measurement (e.g., gender, political party, ethnicity).
  • You want to determine whether there is a greater frequency of scores in certain categories than would be expected by chance.
You face another common analytic problem when you're curious to know whether two categorical variables are related to each other. Relationships between categorical variables can be examined with the handy two-way chi-square test.
If two variables are measured at the interval level (many scores are possible along a continuum), the correlation coefficient [Hack #11] is the best tool to use, but it doesn't work well with categorical measurement.
We make assumptions all the time about relationships between these sorts of variables. Many of our common stereotypes about categories of people have implicit hypotheses about these relationships. Here are a few assumptions you might have that imply a relationship between categorical variables:
  • Professors are absent-minded.
  • Computer programmers play Dungeons and Dragons.
  • Adults who collect comics write Statistics Hacks books.
  • Professors are absent-minded.
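To show how a two-way chi-square might check one of these assumptions, here is a sketch using an invented 2 x 2 table (programmer or not, plays Dungeons and Dragons or not). All counts are hypothetical. The expected count for each cell is its row total times its column total divided by the grand total, which is what you would see if the two variables were unrelated.

```python
# Hypothetical counts: rows = programmer / non-programmer,
# columns = plays D&D / does not play D&D.
table = [
    [20, 30],
    [10, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected count if the two variables were unrelated.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
CRITICAL_VALUE_05 = 3.841  # chi-square critical value, df = 1, alpha = .05

print(f"chi-square = {chi_square:.2f} with {degrees_of_freedom} df")
print("Related?", chi_square > CRITICAL_VALUE_05)
```

With these made-up counts the statistic exceeds the critical value, so the pattern would be unlikely if the two variables were truly unrelated.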
Additional content appearing in this section has been removed.
Compare Two Groups
Which is better? Which has more? Do people really differ? Quantitative questions like these dominate the polite conversations of our times. If you want some real evidence for your beliefs about the best, most, and least, you can use a statistical tool called the "t test" to support your point.
My Uncle Frank is full of opinions. Green M&Ms taste better than blue. Women never get speeding tickets. The Brady Bunch kids could sing better than the Partridge Family. Plaid is back. He can argue all day spouting half-baked idea after half-baked idea. While I disagree with him on all four points (especially the position that plaid is back—after all, it never left!), I have only my opinions to fight with.
If only there were some scientific way to prove whether Uncle Frank is right or wrong! You no doubt recognize the rhetorical nature of my plea. After all, there are only about a gazillion statistical tools that exist to test hypotheses like these. One of the simplest tools is designed to test the simplest of claims. If the problem is deciding whether one group differs from another, the procedure known as an independent t test is the best solution.
To apply a t test to investigate one of Uncle Frank's theories, we have to compute a t value. Let's imagine that I decided to actually challenge Uncle Frank and collect some data to see whether he is right or wrong.
Uncle Frank believes that males get speeding tickets more frequently than females. To test this hypothesis, imagine that I select two groups of 15 drivers randomly [Hack #19] from his neighborhood. One group is female, and the other is male. I ask them some questions. Pretend that over the course of the last five years, the male group averaged 1.71 speeding tickets with a variance of .71. The female group averaged 1.35 speeding tickets with a variance of .25.
Variance is the total amount of variability in a given group of numbers. It is calculated by finding the distance of each score in the group from the mean score. Square those distances and average them to get the variance.
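As a sketch of the computation, the t value for two independent groups can be worked out from the means, variances, and group sizes given above. This assumes the reported variances can be plugged directly into the standard formula (difference in means divided by the standard error of that difference). Because the two groups are the same size, the pooled-variance formula gives the same result. The comparison value 2.048 is the standard two-tailed .05 critical value for 28 degrees of freedom.

```python
import math

# Values from the speeding-ticket example.
n_male, mean_male, var_male = 15, 1.71, 0.71
n_female, mean_female, var_female = 15, 1.35, 0.25

# Standard error of the difference between the two group means.
standard_error = math.sqrt(var_male / n_male + var_female / n_female)

# t value: how many standard errors apart the two means are.
t_value = (mean_male - mean_female) / standard_error

degrees_of_freedom = n_male + n_female - 2
CRITICAL_T_05 = 2.048  # two-tailed critical t, df = 28, alpha = .05

print(f"t = {t_value:.2f} with {degrees_of_freedom} df")
print("Significant at .05?", abs(t_value) > CRITICAL_T_05)
```

Under these assumptions the t value falls short of the critical value, so a difference this size could plausibly be chance in samples of 15.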
Additional content appearing in this section has been removed.
Find Out Just How Wrong You Really Are
Anytime you have used statistics to summarize observations, you've probably been wrong. If you need to know how close you have come to the truth, use standard errors.
Statisticians are perhaps the only professionals who not only proudly admit that their answers are probably wrong, but will go to great lengths to tell you exactly how wrong they are. When you conduct a survey, record observations, or conduct some sort of experiment, your results describe only your sample—the customers, patients, students, goldfish, or pieces of Kryptonite that you have in front of you. Inferential statistics uses values computed for a sample to estimate what that value would be for the population it is meant to represent. For example, the mean of a sample is a pretty good guess for the mean of the population. The problem is knowing whether to trust your results.
It is unlikely that the mean of a sample is exactly the same as the mean of the population, but it is likely to be close. If you want to know how far wrong you are, you can calibrate your precision using standard errors. The standard error of the mean gives us an estimate of the distance between our sample mean estimate and the actual population mean.
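A minimal sketch of the standard error of the mean, using invented scores: divide the sample standard deviation by the square root of the sample size. The rough 95 percent interval below uses the normal-curve multiplier 1.96, which is an approximation. For a sample this small, a t-based multiplier would be somewhat larger.

```python
import math
import statistics

# Hypothetical sample of ten scores.
sample = [72, 85, 90, 66, 78, 81, 95, 70, 88, 75]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)      # sample standard deviation (n - 1 in the denominator)

# Standard error of the mean: typical distance between a sample mean
# and the population mean for samples of this size.
sem = sd / math.sqrt(n)

# Rough 95 percent interval using the normal approximation.
low, high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"mean = {mean}, standard error = {sem:.2f}")
print(f"rough 95% interval: {low:.1f} to {high:.1f}")
```

Quadrupling the sample size halves the standard error, which is why bigger samples earn more trust.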
"Measure Precisely" [Hack #6] discusses how to use standard errors in the case of measurement. Calculating the standard error of measurement allows you to know how close your test score is to your typical level of performance. Just as measurement allows us to produce 95 percent confidence intervals around individual observed scores, statisticians routinely produce 95 percent confidence intervals around a wide variety of sample values.
Fortunately for anyone curious to know how far a statistical finding is from the hidden truth, every popular statistical procedure provides a standard error. After introducing some basic concepts, this hack will explain how to apply the following standard errors:
Additional content appearing in this section has been removed.
Sample Fairly
If you want to find something out about every single customer or employee in your business, you could talk to every single one of them. If you are concerned about the quality of the beer you serve at your bar, you could taste every one before serving. Or, to save time, money, and brain cells, "sample" efficiently instead.
Management thrives on knowing the characteristics of every widget produced, every transaction conducted, and every client helped. Of course, the whole set of all of these widgets, interactions, and people can never be brought together under one microscope and observed and evaluated. No specimen slide is big enough.
The same is true for those of us in social science—researchers interested in people simply cannot measure everybody. As much as we'd like to probe, shock, inject, hassle, embarrass, and generally bother everyone in the world, we just can't do it. We don't have the time, space, or money, and, frankly, no one really wants to get to know so many people.
The problem is, "How can you know about everything, without being able to look at everything?" As is the case with all hacks in this book, the solution is provided by statistics. There are scientifically sound ways to accurately describe any whole set of things by just looking at a small subset of those things.
Inferential statistics allows us to generalize to a larger population, based on data from a smaller sample. For these generalizations to be valid, though, the sample has to represent the population fairly.
A population, in the sense we use it here, is rarely the "population" of a country or city or planet in the way the term is used in social studies. In inferential statistics, a population is a description of the type of person or thing you're studying. Populations can be third-grade boys in Nebraska, nurses at Shawnee Mission Medical Center in Merriam, Kansas, South American giant otters, or books in the Library of Congress. The only rule is that a population is bigger than its corresponding sample.
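One common way to give every member of a population the same chance of landing in the sample is simple random sampling. The sketch below is illustrative only: the population of 500 employee labels is made up, and the seed is fixed just so the example is repeatable.

```python
import random

# Hypothetical population: 500 employees identified by label.
population = [f"employee_{i}" for i in range(1, 501)]

# A seeded generator makes this sketch reproducible; drop the seed in real use.
rng = random.Random(42)

# Draw 25 distinct members, each equally likely to be chosen.
sample = rng.sample(population, 25)

print(f"Sampled {len(sample)} of {len(population)} employees")
print(sample[:5])
```

Because `sample` draws without replacement, nobody is picked twice, and the sample is always smaller than the population it came from.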
Additional content appearing in this section has been removed.
Sample with a Touch of Scotch
When statisticians choose samples of people from populations, they are really sampling from continuous distributions of variables. Sampling is sometimes easier to understand, though, by treating your variables as discrete objects, not continuous scores.
The most powerful statistical procedures use scores at the interval level of measurement or higher [Hack #7]. To sample scores from a population, social science researchers usually choose people, though, not scores. The people are then measured, which results in a sample of scores. So far, so good.
When discussing the sampling process, however, smart researchers sometimes sound not-so-smart when they refer to their sampling strategy. For example, if a researcher is interested in measuring the effects of some treatment on a continuous variable such as happiness, he might say (and think), "OK, first I need to get a sample full of happy and unhappy people." He, at least for the moment of the thought, is treating happiness as if it were a dichotomous variable.
Dichotomous is statistics jargon meaning "having only two values." For example, biological sex is a dichotomous variable.
He is referring to people as if they are either completely happy or entirely unhappy. In reality, of course, he thinks there is a large range of happiness scores that describe people, which is why he is using statistics that make the assumption of interval measurement.
He refers to his participants as either/or because doing so makes it easier for him to picture the representativeness of his sampling. It's a smart strategy, because thinking of samples as representing big, discrete categories instead of more precise, continuous values sometimes makes questions about sampling easier to answer and justify.
Here's a brainteaser that centers on a sampling question. A drunk, untenured statistician (I've met a few) is mixing drinks at a party. He is making a Scotch and soda for his department chair. The chair demands a drink with some exact proportion of Scotch to water (it doesn't matter what the specific request is; our hero never makes it that far).
Additional content appearing in this section has been removed.
Choose the Honest Average
Data-driven decisions, such as whether you can afford to buy a house in a new town or who the core market is for your business, often rely on the "average" as the best description for a large set of data. The problem is that there are three completely different values that can be labeled as the "average," and the different averages often result in different decisions. Make your decisions using the correct average.
When most people hear a statement like "the average price for a house in this town is $290,000" (which might sound low, high, or just right, depending on where you call home), they imagine that this figure was determined by adding up all of the sales prices from all of the houses in the town, and then dividing that sum by the number of houses. But statisticians know there is more than one way to determine the "average," and sometimes one kind is better than another.
Whether that $290,000 really represents the typical housing price depends on whether the average is actually the mean, median, or mode. It also depends on the shape of the distribution of all the numbers that are averaged. Wise folks will make sure they are making their decisions using the best summary value. Here's when to trust each type of average.
The purpose of determining an average for a set of values—whether those values are house prices, grades from a final exam, or the number of students in a yoga class—is to efficiently communicate the central tendency for those values. It's true that, most of the time, central tendency is determined by adding up all of the values in a distribution, and then dividing the sum by the number of values. Statisticians don't call this the average, though; they call it the mean. So, why not always use the mean to determine central tendency? Because in some situations, the mean doesn't represent any of the actual values!
Consider the opening example about the average price of a house. Let's say you collect data for 300 houses in a town and want to determine the average sales price in that sample. Generally speaking, the mean is not a very good indicator of central tendency for house prices. illustrates why.
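As a rough sketch of the problem, consider a tiny invented set of sale prices with one mansion in the mix (these are not the book's 300 houses). Python's standard `statistics` module computes all three averages, and a single extreme price is enough to drag the mean far from what a typical house costs.

```python
import statistics

# Hypothetical sale prices: six ordinary houses and one mansion.
prices = [180_000, 195_000, 195_000, 210_000, 225_000, 240_000, 2_500_000]

mean_price = statistics.mean(prices)      # sum of prices divided by their count
median_price = statistics.median(prices)  # middle price when sorted
mode_price = statistics.mode(prices)      # most frequently occurring price

print(f"mean:   ${mean_price:,.0f}")
print(f"median: ${median_price:,.0f}")
print(f"mode:   ${mode_price:,.0f}")
```

In this made-up town the mean is more than double the median, and no house actually sold for anything near it.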
Additional content appearing in this section has been removed.
Avoid the Axis of Evil
Graphs are powerful tools to represent quantities, relationships, and the results of research studies. But in the wrong hands, they can be made to deceive. Choose your destiny, young Luke (or, if you are under the age of 25, "young Anakin"), and avoid the dark side.
There was a time when only scientists, engineers, and mathematicians ever saw a graph. With the advent of more and more news outlets aimed at the general public, visual representations of numeric information have become more and more common. Just think of yesterday's issue of USA Today—it contained at least a dozen graphs.
In business conferences, graphs are used frequently to communicate information and demonstrate success (or failure). If the creator of a graph isn't careful, though, choices that might seem arbitrary will affect the interpretation of the information. Without changing the data, you can change the meaning.
So, if you want to avoid manipulating your audience when you create a graph, or if you just want to be able to spot a misleading (whether intentional or not) chart, then use this hack to help you create and interpret graphs effectively.
To understand correct and incorrect graphing options, we first have to cover some graphing basics. There are various pieces to a graph, and the manipulation of those pieces can lead or mislead.
Typical graphs have two axes, because they describe two different variables. Axes are the lines along the bottom, called the X-axis, and along the side, called the Y-axis.
You can remember that the vertical axis is called the Y-axis because the cute little letter Y is reaching its cute little hands up, vertically, toward the sky. Get it? (Welcome to the creative world of statistics education.)
The sort of graph that is appropriate (and nondeceptive) for showing the variables you have measured depends on the level of measurement of your variables [Hack #7]. You can choose from three common types of graphs, and only one will be the right one for your variables:
Additional content appearing in this section has been removed.
Chapter 3: Measuring the World
Hacks 23-34
There is great value in understanding a phenomenon by hanging a quantity on it. Though sometimes something important is lost in the translation from idea to number, creating scores to represent whatever we are interested in does allow for a level of precision in understanding, and it also allows for comparison. These hacks all involve measurement and interpretation of scores.
A whole family of hacks relies on the normal distribution [Hack #23] and its presence everywhere we look. With the normal curve, you can tell where you stand compared to everyone else [Hack #24], know how you are likely to perform on a test before you even take it [Hack #25], and understand your test results at a deeper level [Hacks #26 and #27].
Speaking of testing, you'll learn how to produce a good set of questions [Hack #28] and make a qua