Book description
Wouldn't it be great if there were a statistics book that made histograms, probability distributions, and chi-square analysis more enjoyable than going to the dentist? Head First Statistics brings this typically dry subject to life, teaching you everything you want and need to know about statistics through engaging, interactive, and thought-provoking material, full of puzzles, stories, quizzes, visual aids, and real-world examples.
Whether you're a student, a professional, or just curious about statistical analysis, Head First's brain-friendly formula helps you get a firm grasp of statistics so you can understand key points and actually use them. Learn to present data visually with charts and plots; discover the difference between taking the average with mean, median, and mode, and why it's important; learn how to calculate probability and expectation; and much more.
Head First Statistics is ideal for high school and college students taking statistics and satisfies the requirements for passing the College Board's Advanced Placement (AP) Statistics Exam. With this book, you'll:
 Study the full range of topics covered in first-year statistics
 Tackle tough statistical concepts using Head First's dynamic, visually rich format proven to stimulate learning and help you retain knowledge
 Explore real-world scenarios, ranging from casino gambling to prescription drug testing, to bring statistical principles to life
 Discover how to measure spread, calculate odds through probability, and understand the normal, binomial, geometric, and Poisson distributions
 Conduct sampling, use correlation and regression, do hypothesis testing, perform chi-square analysis, and more
Before you know it, you'll not only have mastered statistics, you'll also see how they work in the real world. Head First Statistics will help you pass your statistics course, and give you a firm understanding of the subject so you can apply the knowledge throughout your life.
Table of contents
 Dedication
 A Note Regarding Supplemental Files
 Advance Praise for Head First Statistics
 Praise for other Head First books
 Author of Head First Statistics
 How to use this Book: Intro

1. Visualizing Information: First Impressions
 Statistics are everywhere
 But why learn statistics?
 A tale of two charts
 Manic Mango needs some charts
 The humble pie chart
 Chart failure
 Bar charts can allow for more accuracy
 Vertical bar charts
 Horizontal bar charts
 It’s a matter of scale
 Using frequency scales
 Dealing with multiple sets of data
 Your bar charts rock
 Categories vs. numbers
 Dealing with grouped data
 To make a histogram, start by finding bar widths
 Manic Mango needs another chart
 Make the area of histogram bars proportional to frequency
 Step 1: Find the bar widths
 Step 2: Find the bar heights
 Step 3: Draw your chart—a histogram
 Histograms can’t do everything
 Introducing cumulative frequency
 Drawing the cumulative frequency graph
 Choosing the right chart
 Manic Mango conquered the games market!

2. Measuring Central Tendency: The Middle Way
 Welcome to the Health Club
 A common measure of average is the mean
 Mean math
 Dealing with unknowns
 Back to the mean
 Handling frequencies
 Back to the Health Club
 Everybody was Kung Fu fighting
 Our data has outliers
 The outliers did it
 Watercooler conversation
 Finding the median
 Business is booming
 The Little Ducklings swimming class
 Frequency Magnets
 Frequency Magnets
 What went wrong with the mean and median?
 Introducing the mode
 Congratulations!

3. Measuring Variability and Spread: Power Ranges
 Wanted: one player
 We need to compare player scores
 Use the range to differentiate between data sets
 The problem with outliers
 We need to get away from outliers
 Quartiles come to the rescue
 The interquartile range excludes outliers
 Quartile anatomy
 We’re not just limited to quartiles
 So what are percentiles?
 Box and whisker plots let you visualize ranges
 Variability is more than just spread
 Calculating average distances
 We can calculate variation with the variance...
 ...but standard deviation is a more intuitive measure
 A quicker calculation for variance
 What if we need a baseline for comparison?
 Use standard scores to compare values across data sets
 Interpreting standard scores
 Statsville All Stars win the league!

4. Calculating Probabilities: Taking Chances
 Fat Dan’s Grand Slam
 Roll up for roulette!
 Your very own roulette board
 Place your bets now!
 What are the chances?
 Find roulette probabilities
 You can visualize probabilities with a Venn diagram
 It’s time to play!
 And the winning number is...
 Let’s bet on an even more likely event
 You can also add probabilities
 You win!
 Time for another bet
 Exclusive events and intersecting events
 Problems at the intersection
 Some more notation
 Another unlucky spin...
 ...but it’s time for another bet
 Conditions apply
 Find conditional probabilities
 You can visualize conditional probabilities with a probability tree
 Trees also help you calculate conditional probabilities
 Bad luck!
 We can find P(Black | Even) using the probabilities we already have
 Step 1: Finding P(Black ∩ Even)
 So where does this get us?
 Step 2: Finding P(Even)
 Step 3: Finding P(Black | Even)
 These results can be generalized to other problems
 Use the Law of Total Probability to find P(B)
 Introducing Bayes’ Theorem
 We have a winner!
 It’s time for one last bet
 If events affect each other, they are dependent
 If events do not affect each other, they are independent
 More on calculating probability for independent events
 Winner! Winner!

5. Using Discrete Probability Distributions: Manage Your Expectations
 Back at Fat Dan’s Casino
 We can compose a probability distribution for the slot machine
 Expectation gives you a prediction of the results...
 ... and variance tells you about the spread of the results
 Variances and probability distributions
 Let’s calculate the slot machine’s variance
 Fat Dan changed his prices
 There’s a linear relationship between E(X) and E(Y)
 Slot machine transformations
 General formulas for linear transforms
 Every pull of the lever is an independent observation
 Observation shortcuts
 New slot machine on the block
 Add E(X) and E(Y) to get E(X + Y)...
 ... and subtract E(X) and E(Y) to get E(X – Y)
 You can also add and subtract linear transformations
 Jackpot!

6. Permutations and Combinations: Making Arrangements
 The Statsville Derby
 It’s a three-horse race
 How many ways can they cross the finish line?
 Calculate the number of arrangements
 Going round in circles
 It’s time for the novelty race
 Arranging by individuals is different than arranging by type
 We need to arrange animals by type
 Generalize a formula for arranging duplicates
 It’s time for the twenty-horse race
 How many ways can we fill the top three positions?
 Examining permutations
 What if horse order doesn’t matter?
 Examining combinations
 It’s the end of the race

7. Geometric, Binomial, and Poisson Distributions: Keeping Things Discrete
 Meet Chad, the hapless snowboarder
 We need to find Chad’s probability distribution
 There’s a pattern to this probability distribution
 The probability distribution can be represented algebraically
 The pattern of expectations for the geometric distribution
 Expectation is 1/p
 Finding the variance for our distribution
 You’ve mastered the geometric distribution
 Should you play, or walk away?
 Generalizing the probability for three questions
 Let’s generalize the probability further
 What’s the expectation and variance?
 Binomial expectation and variance
 The Statsville Cinema has a problem
 Expectation and variance for the Poisson distribution
 So what’s the probability distribution?
 Combine Poisson variables
 The Poisson in disguise
 Anyone for popcorn?

8. Using the Normal Distribution: Being Normal
 Discrete data takes exact values...
 ... but not all numeric data is discrete
 What’s the delay?
 We need a probability distribution for continuous data
 Probability density functions can be used for continuous data
 Probability = area
 To calculate probability, start by finding f(x)...
 ... then find probability by finding the area
 We’ve found the probability
 Searching for a sole mate
 Male modelling
 The normal distribution is an “ideal” model for continuous data
 So how do we find normal probabilities?
 Three steps to calculating normal probabilities
 Step 1: Determine your distribution
 Step 2: Standardize to N(0, 1)
 To standardize, first move the mean...
 ... then squash the width
 Now find Z for the specific value you want to find probability for
 Step 3: Look up the probability in your handy table
 Julie’s probability is in the table
 And they all lived happily ever after

9. Using the Normal Distribution ii: Beyond Normal
 Love is a roller coaster
 All aboard the Love Train
 Normal bride + normal groom
 It’s still just weight
 How’s the combined weight distributed?
 Finding probabilities
 More people want the Love Train
 Linear transforms describe underlying changes in values...
 ...and independent observations describe how many values you have
 Expectation and variance for independent observations
 Should we play, or walk away?
 Normal distribution to the rescue
 When to approximate the binomial distribution with the normal
 Revisiting the normal approximation
 The binomial is discrete, but the normal is continuous
 Apply a continuity correction before calculating the approximation
 All aboard the Love Train
 When to approximate the binomial distribution with the normal
 A runaway success!

10. Using Statistical Sampling: Taking Samples
 The Mighty Gumball taste test
 They’re running out of gumballs
 Test a gumball sample, not the whole gumball population
 How sampling works
 When sampling goes wrong
 How to design a sample
 Define your sampling frame
 Sometimes samples can be biased
 Sources of bias
 How to choose your sample
 Simple random sampling
 How to choose a simple random sample
 There are other types of sampling
 We can use stratified sampling...
 ...or we can use cluster sampling...
 ...or even systematic sampling
 Mighty Gumball has a sample

11. Estimating Populations and Samples: Making Predictions
 So how long does flavor really last for?
 Let’s start by estimating the population mean
 Point estimators can approximate population parameters
 Let’s estimate the population variance
 We need a different point estimator than sample variance
 Which formula’s which?
 Mighty Gumball has done more sampling
 It’s a question of proportion
 Buy your gumballs here!
 So how does this relate to sampling?
 The sampling distribution of proportions
 So what’s the expectation of Ps?
 And what’s the variance of Ps?
 Find the distribution of Ps
 Ps follows a normal distribution
 How many gumballs?
 We need probabilities for the sample mean
 The sampling distribution of the mean
 Find the expectation for X̄
 What about the variance of X̄?
 So how is X̄ distributed?
 If n is large, X̄ can still be approximated by the normal distribution
 Using the central limit theorem
 Sampling saves the day!

12. Constructing Confidence Intervals: Guessing with Confidence
 Mighty Gumball is in trouble
 The problem with precision
 Introducing confidence intervals
 Four steps for finding confidence intervals
 Step 1: Choose your population statistic
 Step 2: Find its sampling distribution
 Point estimators to the rescue
 We’ve found the distribution for X̄
 Step 3: Decide on the level of confidence
 How to select an appropriate confidence level
 Step 4: Find the confidence limits
 Start by finding Z
 Rewrite the inequality in terms of μ
 Finally, find the value of X̄
 You’ve found the confidence interval
 Let’s summarize the steps
 Handy shortcuts for confidence intervals
 Just one more problem...
 Step 1: Choose your population statistic
 Step 2: Find its sampling distribution
 X̄ follows the t-distribution when the sample is small
 Find the standard score for the t-distribution
 Step 3: Decide on the level of confidence
 Step 4: Find the confidence limits
 Using t-distribution probability tables
 The t-distribution vs. the normal distribution
 You’ve found the confidence intervals!

13. Using Hypothesis Tests: Look At The Evidence
 Statsville’s new miracle drug
 So what’s the problem?
 Resolving the conflict from 50,000 feet
 The six steps for hypothesis testing
 Step 1: Decide on the hypothesis
 So what’s the alternative?
 Step 2: Choose your test statistic
 Step 3: Determine the critical region
 To find the critical region, first decide on the significance level
 Step 4: Find the p-value
 We’ve found the p-value
 Step 5: Is the sample result in the critical region?
 Step 6: Make your decision
 So what did we just do?
 What if the sample size is larger?
 Let’s conduct another hypothesis test
 Step 1: Decide on the hypotheses
 Step 2: Choose the test statistic
 Use the normal to approximate the binomial in our test statistic
 Step 3: Find the critical region
 SnoreCull failed the test
 Mistakes can happen
 Let’s start with Type I errors
 What about Type II errors?
 Finding errors for SnoreCull
 We need to find the range of values
 Find P(Type II error)
 Introducing power
 The doctor’s happy

14. The χ² Distribution: There’s Something Going On...
 There may be trouble ahead at Fat Dan’s Casino
 Let’s start with the slot machines
 The χ² test assesses difference
 So what does the test statistic represent?
 Two main uses of the χ² distribution
 ν represents degrees of freedom
 What’s the significance?
 Hypothesis testing with χ²
 You’ve solved the slot machine mystery
 Fat Dan has another problem
 The χ² distribution can test for independence
 You can find the expected frequencies using probability
 So what are the frequencies?
 We still need to calculate degrees of freedom
 Generalizing the degrees of freedom
 And the formula is...
 You’ve saved the casino

15. Correlation and Regression: What’s My Line?
 Never trust the weather
 Let’s analyze sunshine and attendance
 Exploring types of data
 Visualizing bivariate data
 Scatter diagrams show you patterns
 Correlation vs. causation
 Predict values with a line of best fit
 Your best guess is still a guess
 We need to minimize the errors
 Introducing the sum of squared errors
 Find the equation for the line of best fit
 Finding the slope for the line of best fit
 Finding the slope for the line of best fit, part ii
 We’ve found b, but what about a?
 You’ve made the connection
 Let’s look at some correlations
 The correlation coefficient measures how well the line fits the data
 There’s a formula for calculating the correlation coefficient, r
 Find r for the concert data
 Find r for the concert data, continued
 You’ve saved the day!
 Leaving town...
 It’s been great having you here in Statsville!

A. Leftovers: The Top Ten Things (we didn’t cover)
 #1. Other ways of presenting data
 #2. Distribution anatomy
 #3. Experiments
 Designing your experiment
 #4. Least square regression alternate notation
 #5. The coefficient of determination
 #6. Nonlinear relationships
 #7. The confidence interval for the slope of a regression line
 #8. Sampling distributions – the difference between two means
 #9. Sampling distributions – the difference between two proportions
 #10. E(X) and Var(X) for continuous probability distributions
 Finding E(X)
 Finding Var(X)
 B. Statistics Tables: Looking Things Up
 Index
 About the Author
 Copyright
Product information
 Title: Head First Statistics
 Author(s):
 Release date: August 2008
 Publisher(s): O'Reilly Media, Inc.
 ISBN: 9780596527587