O'Reilly logo

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Statistical Analysis: Microsoft Excel 2016

Book Description


Nationally recognized Excel expert Conrad Carlberg shows you how to use Excel 2016 to perform core statistical tasks every business professional, student, and researcher should master. Using real-world examples and downloadable workbooks, Carlberg helps you choose the right technique for each problem and get the most out of Excel’s statistical features. Along the way, he clarifies confusing statistical terminology and helps you avoid common mistakes.

You’ll learn how to use correlation and regression, analyze variance and covariance, and test statistical hypotheses using the normal, binomial, t, and F distributions. To help you make accurate inferences based on samples from a population, Carlberg offers insightful coverage of crucial topics ranging from experimental design to the statistical power of F tests. Updated for Excel 2016, this guide covers both modern consistency functions and legacy compatibility functions.

Becoming an expert with Excel statistics has never been easier! In this book, you’ll find crystal-clear instructions, insider insights, and complete step-by-step guidance.

  • Master Excel’s most useful descriptive and inferential statistical tools
  • Understand how values cluster together or disperse, and how variables move or classify jointly
  • Tell the truth with statistics—and recognize when others don’t
  • Infer a population’s characteristics from a sample’s frequency distribution
  • Explore correlation and regression to learn how variables move in tandem
  • Use Excel consistency functions such as STDEV.S( ) and STDEV.P( )
  • Test differences between two means using z tests, t tests, and Excel’s Data Analysis Add-in
  • Identify skewed distributions using Excel’s new built-in box-and-whisker plots and histograms
  • Evaluate statistical power and control risk
  • Explore how randomized block and split plot designs alter the derivation of F-ratios
  • Use coded multiple regression analysis to perform ANOVA with unbalanced factorial designs
  • Analyze covariance with ANCOVA, and properly use multiple covariance
  • Take advantage of Recommended PivotTables, Quick Analysis, and other Excel 2016 shortcuts

Table of Contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Contents at a Glance
  5. Contents
  6. About This E-Book
  7. About the author
  8. Dedication
  9. Acknowledgments
  10. We Want to Hear from You!
  11. Reader Services
  12. Introduction
    1. Using Excel for Statistical Analysis
      1. About You and About Excel
      2. Clearing Up the Terms
      3. Making Things Easier
      4. The Wrong Box?
      5. Wagging the Dog
    2. What’s in This Book
  13. 1 About Variables and Values
    1. Variables and Values
      1. Recording Data in Lists
      2. Making Use of Lists
    2. Scales of Measurement
      1. Category Scales
      2. Numeric Scales
      3. Telling an Interval Value from a Text Value
    3. Charting Numeric Variables in Excel
      1. Charting Two Variables
    4. Understanding Frequency Distributions
      1. Using Frequency Distributions
      2. Building a Frequency Distribution from a Sample
      3. Building Simulated Frequency Distributions
  14. 2 How Values Cluster Together
    1. Calculating the Mean
      1. Understanding Functions, Arguments, and Results
      2. Understanding Formulas, Results, and Formats
      3. Minimizing the Spread
    2. Calculating the Median
      1. Choosing to Use the Median
      2. Static or Robust?
    3. Calculating the Mode
      1. Getting the Mode of Categories with a Formula
    4. From Central Tendency to Variability
  15. 3 Variability: How Values Disperse
    1. Measuring Variability with the Range
      1. Sample Size and the Range
      2. Variations on the Range
    2. The Concept of a Standard Deviation
      1. Arranging for a Standard
      2. Thinking in Terms of Standard Deviations
    3. Calculating the Standard Deviation and Variance
      1. Squaring the Deviations
      2. Population Parameters and Sample Statistics
      3. Dividing by N – 1
  16. Bias in the Estimate and Degrees of Freedom
    1. Excel’s Variability Functions
      1. Standard Deviation Functions
      2. Variance Functions
  17. 4 How Variables Move Jointly: Correlation
    1. Understanding Correlation
      1. The Correlation, Calculated
      2. Using the CORREL() Function
      3. Using the Analysis Tools
      4. Using the Correlation Tool
      5. Correlation Isn’t Causation
    2. Using Correlation
      1. Removing the Effects of the Scale
      2. Using the Excel Function
      3. Getting the Predicted Values
      4. Getting the Regression Formula
    3. Using TREND() for Multiple Regression
      1. Combining the Predictors
      2. Understanding “Best Combination”
      3. Understanding Shared Variance
      4. A Technical Note: Matrix Algebra and Multiple Regression in Excel
  18. 5 Charting Statistics
    1. Characteristics of Excel Charts
      1. Chart Axes
      2. Date Variables on Category Axes
      3. Other Numeric Variables on a Category Axis
    2. Histogram Charts
      1. Using a Pivot Table to Count the Records
      2. Using Advanced Filter and FREQUENCY()
      3. The Data Analysis Add-in’s Histogram
      4. The Built-in Histogram
      5. Data Series Addresses
    3. Box-and-Whisker Plots
      1. Managing Outliers
      2. Diagnosing Asymmetry
      3. Comparing Distributions
  19. 6 How Variables Classify Jointly: Contingency Tables
    1. Understanding One-Way Pivot Tables
      1. Running the Statistical Test
    2. Making Assumptions
      1. Random Selection
      2. Independent Selections
      3. The Binomial Distribution Formula
      4. Using the BINOM.INV() Function
    3. Understanding Two-Way Pivot Tables
      1. Probabilities and Independent Events
      2. Testing the Independence of Classifications
      3. About Logistic Regression
    4. The Yule Simpson Effect
    5. Summarizing the Chi-Square Functions
      1. Using CHISQ.DIST()
      2. Using CHISQ.DIST.RT() and CHIDIST()
      3. Using CHISQ.INV()
      4. Using CHISQ.INV.RT() and CHIINV()
      5. Using CHISQ.TEST() and CHITEST()
      6. Using Mixed and Absolute References to Calculate Expected Frequencies
      7. Using the Pivot Table’s Index Display
  20. 7 Using Excel with the Normal Distribution
    1. About the Normal Distribution
      1. Characteristics of the Normal Distribution
      2. The Unit Normal Distribution
    2. Excel Functions for the Normal Distribution
      1. The NORM.DIST( ) Function
      2. The NORM.INV( ) Function
    3. Confidence Intervals and the Normal Distribution
      1. The Meaning of a Confidence Interval
      2. Constructing a Confidence Interval
      3. Excel Worksheet Functions That Calculate Confidence Intervals
      4. Using CONFIDENCE.NORM( ) and CONFIDENCE( )
      5. Using CONFIDENCE.T( )
      6. Using the Data Analysis Add-In for Confidence Intervals
      7. Confidence Intervals and Hypothesis Testing
    4. The Central Limit Theorem
      1. Dealing with a Pivot Table Idiosyncrasy
      2. Making Things Easier
      3. Making Things Better
  21. 8 Telling the Truth with Statistics
    1. A Context for Inferential Statistics
      1. Establishing Internal Validity
      2. Threats to Internal Validity
    2. Problems with Excel’s Documentation
    3. The F-Test Two-Sample for Variances
      1. Why Run the Test?
    4. Reproducibility
    5. A Final Point
  22. 9 Testing Differences Between Means: The Basics
    1. Testing Means: The Rationale
      1. Using a z-Test
      2. Using the Standard Error of the Mean
      3. Creating the Charts
    2. Using the t-Test Instead of the z-Test
      1. Defining the Decision Rule
      2. Understanding Statistical Power
  23. 10 Testing Differences Between Means: Further Issues
    1. Using Excel’s T.DIST() and T.INV() Functions to Test Hypotheses
      1. Making Directional and Nondirectional Hypotheses
      2. Using Hypotheses to Guide Excel’s t-Distribution Functions
      3. Completing the Picture with T.DIST()
    2. Using the T.TEST() Function
      1. Degrees of Freedom in Excel Functions
      2. Equal and Unequal Group Sizes
      3. The T.TEST() Syntax
    3. Using the Data Analysis Add-in t-Tests
      1. Group Variances in t-Tests
      2. Visualizing Statistical Power
      3. When to Avoid t-Tests
  24. 11 Testing Differences Between Means: The Analysis of Variance
    1. Why Not t-Tests?
    2. The Logic of ANOVA
      1. Partitioning the Scores
      2. Comparing Variances
      3. The F-Test
    3. Using Excel’s F Worksheet Functions
      1. Using F.DIST() and F.DIST.RT()
      2. Using F.INV() and FINV()
      3. The F-Distribution
    4. Unequal Group Sizes
    5. Multiple Comparison Procedures
      1. The Scheffé Procedure
      2. Planned Orthogonal Contrasts
  25. 12 Analysis of Variance: Further Issues
    1. Factorial ANOVA
      1. Other Rationales for Multiple Factors
      2. Using the Two-Factor ANOVA Tool
    2. The Meaning of Interaction
      1. The Statistical Significance of an Interaction
      2. Calculating the Interaction Effect
    3. The Problem of Unequal Group Sizes
      1. Repeated Measures: The Two Factor Without Replication Tool
    4. Excel’s Functions and Tools: Limitations and Solutions
      1. Mixed Models
      2. Power of the F-Test
  26. 13 Experimental Design and ANOVA
    1. Crossed Factors and Nested Factors
      1. Depicting the Design Accurately
      2. Nuisance Factors
    2. Fixed Factors and Random Factors
      1. The Data Analysis Add-In’s ANOVA Tools
      2. Data Layout
    3. Calculating the F Ratios
      1. Adapting the Data Analysis Tool for a Random Factor
      2. Designing the F-Test
      3. The Mixed Model: Choosing the Denominator
      4. Adapting the Data Analysis Tool for a Nested Factor
      5. Data Layout for a Nested Design
      6. Getting the Sums of Squares
      7. Calculating the F Ratio for the Nesting Factor
    4. Randomized Block Designs
      1. Interaction Between Factors and Blocks
      2. Tukey’s Test for Nonadditivity
      3. Increasing Statistical Power
      4. Blocks as Fixed or Random
    5. Split-Plot Factorial Designs
      1. Assembling a Split-Plot Factorial Design
      2. Analysis of the Split-Plot Factorial Design
  27. 14 Statistical Power
    1. Controlling the Risk
      1. Directional and Nondirectional Hypotheses
      2. Changing the Sample Size
      3. Visualizing Statistical Power
    2. The Statistical Power of t-Tests
      1. Nondirectional Hypotheses
      2. Making a Directional Hypothesis
      3. Increasing the Size of the Samples
      4. The Dependent Groups t-Test
    3. The Noncentrality Parameter in the F-Distribution
      1. Variance Estimates
      2. The Noncentrality Parameter and the Probability Density Function
    4. Calculating the Power of the F-Test
      1. Calculating the Cumulative Density Function
      2. Using Power to Determine Sample Size
  28. 15 Multiple Regression Analysis and Effect Coding: The Basics
    1. Multiple Regression and ANOVA
      1. Using Effect Coding
      2. Effect Coding: General Principles
      3. Other Types of Coding
    2. Multiple Regression and Proportions of Variance
      1. Understanding the Segue from ANOVA to Regression
      2. The Meaning of Effect Coding
    3. Assigning Effect Codes in Excel
    4. Using Excel’s Regression Tool with Unequal Group Sizes
    5. Effect Coding, Regression, and Factorial Designs in Excel
      1. Exerting Statistical Control with Semipartial Correlations
      2. Using a Squared Semipartial to Get the Correct Sum of Squares
    6. Using TREND() to Replace Squared Semipartial Correlations
      1. Working with the Residuals
      2. Using Excel’s Absolute and Relative Addressing to Extend the Semipartials
  29. 16 Multiple Regression Analysis and Effect Coding: Further Issues
    1. Solving Unbalanced Factorial Designs Using Multiple Regression
      1. Variables Are Uncorrelated in a Balanced Design
      2. Variables Are Correlated in an Unbalanced Design
      3. Order of Entry Is Irrelevant in the Balanced Design
      4. Order Entry Is Important in the Unbalanced Design
      5. Proportions of Variance Can Fluctuate
    2. Experimental Designs, Observational Studies, and Correlation
    3. Using All the LINEST() Statistics
    4. Looking Inside LINEST()
      1. Understanding How LINEST() Calculates Its Results
      2. Getting the Regression Coefficients
      3. Getting the Sum of Squares Regression and Residual
      4. Calculating the Regression Diagnostics
      5. Understanding How LINEST() Handles Multicollinearity
      6. Forcing a Zero Constant
      7. The Excel 2007 Version
      8. A Negative R2?
    5. Managing Unequal Group Sizes in a True Experiment
    6. Managing Unequal Group Sizes in Observational Research
  30. 17 Analysis of Covariance: The Basics
    1. The Purposes of ANCOVA
      1. Greater Power
      2. Bias Reduction
    2. Using ANCOVA to Increase Statistical Power
      1. ANOVA Finds No Significant Mean Difference
      2. Adding a Covariate to the Analysis
    3. Testing for a Common Regression Line
    4. Removing Bias: A Different Outcome
  31. 18 Analysis of Covariance: Further Issues
    1. Adjusting Means with LINEST() and Effect Coding
    2. Effect Coding and Adjusted Group Means
    3. Multiple Comparisons Following ANCOVA
      1. Using the Scheffé Method
      2. Using Planned Contrasts
    4. The Analysis of Multiple Covariance
      1. The Decision to Use Multiple Covariates
      2. Two Covariates: An Example
    5. When Not to Use ANCOVA
      1. Intact Groups
      2. Extrapolation
  32. Index