Introductory Statistics and Analytics: A Resampling Perspective

Author: Peter C. Bruce
ISBN: 9781118881354

December 2014

Beginner

312 pages

8h 30m

English

Wiley

Cover Page
Title Page
Copyright
Contents
PREFACE
BOOK WEBSITE
Acknowledgments
INTRODUCTION
  IF YOU CAN'T MEASURE IT, YOU CAN'T MANAGE IT
  PHANTOM PROTECTION FROM VITAMIN E
  STATISTICIAN, HEAL THYSELF
  IDENTIFYING TERRORISTS IN AIRPORTS
  LOOKING AHEAD IN THE BOOK
  RESAMPLING
  BIG DATA AND STATISTICIANS
1: DESIGNING AND CARRYING OUT A STATISTICAL STUDY
  1.1 A SMALL EXAMPLE
  1.2 IS CHANCE RESPONSIBLE? THE FOUNDATION OF HYPOTHESIS TESTING
  1.3 A MAJOR EXAMPLE
  1.4 DESIGNING AN EXPERIMENT
  1.5 WHAT TO MEASURE—CENTRAL LOCATION
  1.6 WHAT TO MEASURE—VARIABILITY
  1.7 WHAT TO MEASURE—DISTANCE (NEARNESS)
  1.8 TEST STATISTIC
  1.9 THE DATA
  1.10 VARIABLES AND THEIR FLAVORS
  1.11 EXAMINING AND DISPLAYING THE DATA
  1.12 ARE WE SURE WE MADE A DIFFERENCE?
  APPENDIX: HISTORICAL NOTE
  1.13 EXERCISES
2: STATISTICAL INFERENCE
  2.1 REPEATING THE EXPERIMENT
  2.2 HOW MANY RESHUFFLES?
  2.3 HOW ODD IS ODD?
  2.4 STATISTICAL AND PRACTICAL SIGNIFICANCE
  2.5 WHEN TO USE HYPOTHESIS TESTS
  2.6 EXERCISES
3: DISPLAYING AND EXPLORING DATA
  3.1 BAR CHARTS
  3.2 PIE CHARTS
  3.3 MISUSE OF GRAPHS
  3.4 INDEXING
  3.5 EXERCISES

4: PROBABILITY
  4.1 MENDEL'S PEAS
  4.2 SIMPLE PROBABILITY
  4.3 RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTIONS
  4.4 THE NORMAL DISTRIBUTION
  4.5 EXERCISES
5: RELATIONSHIP BETWEEN TWO CATEGORICAL VARIABLES
  5.1 TWO-WAY TABLES
  5.2 COMPARING PROPORTIONS
  5.3 MORE PROBABILITY
  5.4 FROM CONDITIONAL PROBABILITIES TO BAYESIAN ESTIMATES
  5.5 INDEPENDENCE
  5.6 EXPLORATORY DATA ANALYSIS (EDA)
  5.7 EXERCISES
6: SURVEYS AND SAMPLING
  6.1 SIMPLE RANDOM SAMPLES
  6.2 MARGIN OF ERROR: SAMPLING DISTRIBUTION FOR A PROPORTION
  6.3 SAMPLING DISTRIBUTION FOR A MEAN
  6.4 A SHORTCUT—THE BOOTSTRAP
  6.5 BEYOND SIMPLE RANDOM SAMPLING
  6.6 ABSOLUTE VERSUS RELATIVE SAMPLE SIZE
  6.7 EXERCISES
7: CONFIDENCE INTERVALS
  7.1 POINT ESTIMATES
  7.2 INTERVAL ESTIMATES (CONFIDENCE INTERVALS)
  7.3 CONFIDENCE INTERVAL FOR A MEAN
  7.4 FORMULA-BASED COUNTERPARTS TO THE BOOTSTRAP
  7.5 STANDARD ERROR
  7.6 CONFIDENCE INTERVALS FOR A SINGLE PROPORTION
  7.7 CONFIDENCE INTERVAL FOR A DIFFERENCE IN MEANS
  7.8 CONFIDENCE INTERVAL FOR A DIFFERENCE IN PROPORTIONS
  7.9 RECAPPING
  APPENDIX A: MORE ON THE BOOTSTRAP
  RESAMPLING PROCEDURE—PARAMETRIC BOOTSTRAP
  FORMULAS AND THE PARAMETRIC BOOTSTRAP
  APPENDIX B: ALTERNATIVE POPULATIONS
  APPENDIX C: BINOMIAL FORMULA PROCEDURE
  7.10 EXERCISES
8: HYPOTHESIS TESTS
  8.1 REVIEW OF TERMINOLOGY
  8.2 A–B TESTS: THE TWO SAMPLE COMPARISON
  8.3 COMPARING TWO MEANS
  8.4 COMPARING TWO PROPORTIONS
  8.5 FORMULA-BASED ALTERNATIVE—t-TEST FOR MEANS
  8.6 THE NULL AND ALTERNATIVE HYPOTHESES
  8.7 PAIRED COMPARISONS
  APPENDIX A: CONFIDENCE INTERVALS VERSUS HYPOTHESIS TESTS
  CONFIDENCE INTERVAL
  RELATIONSHIP BETWEEN THE HYPOTHESIS TEST AND THE CONFIDENCE INTERVAL
  COMMENT
  APPENDIX B: FORMULA-BASED VARIATIONS OF TWO-SAMPLE TESTS
  Z-TEST WITH KNOWN POPULATION VARIANCE
  POOLED VERSUS SEPARATE VARIANCES
  FORMULA-BASED ALTERNATIVE: Z-TEST FOR PROPORTIONS
  8.8 EXERCISES
9: HYPOTHESIS TESTING—2
  9.1 A SINGLE PROPORTION
  9.2 A SINGLE MEAN
  9.3 MORE THAN TWO CATEGORIES OR SAMPLES
  9.4 CONTINUOUS DATA
  9.5 GOODNESS-OF-FIT
  APPENDIX: NORMAL APPROXIMATION; HYPOTHESIS TEST OF A SINGLE PROPORTION
  CONFIDENCE INTERVAL FOR A MEAN
  9.6 EXERCISES
10: CORRELATION
  10.1 EXAMPLE: DELTA WIRE
  10.2 EXAMPLE: COTTON DUST AND LUNG DISEASE
  10.3 THE VECTOR PRODUCT AND SUM TEST
  10.4 CORRELATION COEFFICIENT
  10.5 OTHER FORMS OF ASSOCIATION
  10.6 CORRELATION IS NOT CAUSATION
  10.7 EXERCISES
11: REGRESSION
  11.1 FINDING THE REGRESSION LINE BY EYE
  11.2 FINDING THE REGRESSION LINE BY MINIMIZING RESIDUALS
  11.3 LINEAR RELATIONSHIPS
  11.4 INFERENCE FOR REGRESSION
  11.5 EXERCISES
12: ANALYSIS OF VARIANCE—ANOVA
  12.1 COMPARING MORE THAN TWO GROUPS: ANOVA
  12.2 THE PROBLEM OF MULTIPLE INFERENCE
  12.3 A SINGLE TEST
  12.4 COMPONENTS OF VARIANCE
  12.5 TWO-WAY ANOVA
  12.6 FACTORIAL DESIGN
  12.7 EXERCISES
13: MULTIPLE REGRESSION
  13.1 REGRESSION AS EXPLANATION
  13.2 SIMPLE LINEAR REGRESSION—EXPLORE THE DATA FIRST
  13.3 MORE INDEPENDENT VARIABLES
  13.4 MODEL ASSESSMENT AND INFERENCE
  13.5 ASSUMPTIONS
  13.6 INTERACTION, AGAIN
  13.7 REGRESSION FOR PREDICTION
  13.8 EXERCISES
INDEX

Content preview from Introductory Statistics and Analytics: A Resampling Perspective

2 STATISTICAL INFERENCE

The task of trying to assess the impact of random variability on the conclusion drawn from a study, or the results of a measurement, is called statistical inference. In this chapter, we look at a particular kind of statistical inference called a hypothesis test. Generally, a hypothesis test seeks to determine whether the effects we see in some data from a study are real or might just be the result of chance variation.

Who uses hypothesis testing? The research community uses it to determine whether a study is worthy of publication or regulatory approval. Data scientists have less need for the formal apparatus of hypothesis testing, but they do use the resampling methods presented here, and their variants, to help separate random from “real” patterns in data.

After completing this chapter, you should be able to

explain the concept of a null hypothesis,
describe how to conduct a permutation test with a hat and slips of paper,
interpret the results of a permutation test,
describe the shape of the Normal distribution and explain why a more accurate name for it is the “Error” distribution,
define, in the context of hypothesis testing, alpha, Type I error, and Type II error,
explain in what circumstances hypothesis testing is used.

The Null Hypothesis

The standard hypothesis-testing procedure involves a what-if calculation. We ask, “Could my results be due to chance?” This supposition is called the null hypothesis. Null is an old-fashioned word ...
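The what-if calculation can be sketched in code as a permutation test, the software counterpart of the hat and slips of paper mentioned in the chapter objectives. The sketch below is illustrative only and is not taken from the book: the function name `permutation_test`, its parameters, and the two small data sets are hypothetical. It assumes a two-group study in which the quantity of interest is the difference in group means, and it asks how often random reshuffling alone produces a difference at least as large as the one observed.

```python
import random


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Return the observed difference in means (A minus B) and the share of
    random reshuffles whose difference is at least as large (one-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    # Put every value on a "slip of paper" in a single hat.
    slips = list(group_a) + list(group_b)

    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(slips)  # mix the slips
        new_a, new_b = slips[:n_a], slips[n_a:]  # deal out two new groups
        diff = sum(new_a) / n_a - sum(new_b) / len(new_b)
        if diff >= observed:
            at_least_as_large += 1

    return observed, at_least_as_large / n_shuffles


# Hypothetical measurements, invented purely for illustration.
treatment = [12, 15, 14, 18, 16]
control = [10, 11, 13, 9, 12]

observed, p_value = permutation_test(treatment, control)
print(f"observed difference in means: {observed:.1f}")
print(f"share of reshuffles at least as large: {p_value:.4f}")
```

If chance alone rarely reproduces the observed difference, the null hypothesis becomes hard to sustain; if it does so often, the observed result is not surprising. How small a share counts as "rarely" is a separate judgment that this sketch does not make.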
