O'Reilly logo

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Predictive Analytics: Microsoft® Excel 2016, 2nd Edition

Book Description

EXCEL 2016 PREDICTIVE ANALYTICS FOR SERIOUS DATA CRUNCHERS!

Now, you can apply cutting-edge predictive analytics techniques to help your business win–and you don’t need multimillion-dollar software to do it. All the tools you need are available in Microsoft Excel 2016, and all the knowledge and skills are right here, in this book!

Microsoft Excel MVP Conrad Carlberg shows you how to use Excel predictive analytics to solve real problems in areas ranging from sales and marketing to operations. Carlberg offers unprecedented insight into building powerful, credible, and reliable forecasts, helping you gain deep insights from Excel that would be difficult to uncover with costly tools such as SAS or SPSS.

Fully updated for Excel 2016, this guide contains valuable new coverage of accounting for seasonality and managing complex consumer choice scenarios. Throughout, Carlberg provides downloadable Excel 2016 workbooks you can easily adapt to your own needs, plus VBA code–much of it open-source–to streamline especially complex techniques.

Step by step, you’ll build on Excel skills you already have, learning advanced techniques that can help you increase revenue, reduce costs, and improve productivity. By mastering predictive analytics, you’ll gain a powerful competitive advantage for your company and yourself.

Learn the “how” and “why” of using data to make better decisions, and choose the right technique for each problem

  • Capture live real-time data from diverse sources, including third-party websites
  • Use logistic regression to predict behaviors such as “will buy” versus “won’t buy”
  • Distinguish random data bounces from real, fundamental changes
  • Forecast time series with smoothing and regression
  • Account for trends and seasonality via Holt-Winters smoothing
  • Prevent trends from running out of control over long time horizons
  • Construct more accurate predictions by using Solver
  • Manage large numbers of variables and unwieldy datasets with principal components analysis and Varimax factor rotation
  • Apply ARIMA (Box-Jenkins) techniques to build better forecasts and clarify their meaning
  • Handle complex consumer choice problems with advanced logistic regression
  • Benchmark Excel results against R results

Table of Contents

  1. About This E-Book
  2. Title Page
  3. Contents at a Glance
  4. Copyright Page
  5. Contents
  6. Introduction to the 2013 Edition
    1. You, Analytics, and Excel
    2. Excel as a Platform
    3. What’s in This Book
  7. Introduction to this Edition
    1. Inside the Black Box
    2. Helping Out Your Colleagues
  8. 1 Building a Collector
    1. Planning an Approach
      1. A Meaningful Variable
      2. Identifying Sales
    2. Planning the Workbook Structure
      1. Query Sheets
      2. Summary Sheets
      3. Snapshot Formulas
      4. Customizing Your Formulas
    3. The VBA Code
      1. The DoItAgain Subroutine
      2. The DontRepeat Subroutine
      3. The PrepForAgain Subroutine
      4. The GetNewData Subroutine
      5. The GetRank Function
      6. The RefreshSheets Subroutine
    4. The Analysis Sheets
      1. Defining a Dynamic Range Name
      2. Using the Dynamic Range Name
  9. 2 Linear Regression
    1. Correlation and Regression
      1. Charting the Relationship
      2. Calculating Pearson’s Correlation Coefficient
    2. Correlation Is Not Causation
    3. Simple Regression
      1. Array-Entering Formulas
      2. Array-Entering LINEST( )
    4. Multiple Regression
      1. Creating the Composite Variable
      2. Entering LINEST( ) with Multiple Predictors
      3. Merging the Predictors
      4. Analyzing the Composite Variable
    5. Assumptions Made in Regression Analysis
      1. Variability
      2. Measures of Variability: Bartlett’s Test of Homogeneity of Variance
      3. Means of Residuals Are Zero
      4. Normally Distributed Forecasts
    6. Using Excel’s Regression Tool
      1. Accessing the Data Analysis Add-In
      2. Accessing an Installed Add-In
      3. Running the Regression Tool
      4. Understanding the Regression Tool’s Dialog Box
      5. Understanding the Regression Tool’s Output
  10. 3 Forecasting with Moving Averages
    1. About Moving Averages
      1. Signal and Noise
      2. Smoothing Out the Noise
      3. Lost Periods
      4. Smoothing Versus Tracking
      5. Weighted and Unweighted Moving Averages
      6. Total of Weights
      7. Relative Size of Weights
      8. More Recent Weights Are Larger
    2. Criteria for Judging Moving Averages
      1. Mean Absolute Deviation
      2. Least Squares
      3. Using Least Squares to Compare Moving Averages
    3. Getting Moving Averages Automatically
      1. Using the Moving Average Tool
      2. Labels
      3. Output Range
      4. Actuals and Forecasts
      5. Interpreting the Standard Errors—Or Failing to Do So
  11. 4 Forecasting a Time Series: Smoothing
    1. Exponential Smoothing: The Basic Idea
    2. Why “Exponential” Smoothing?
    3. Using Excel’s Exponential Smoothing Tool
      1. Understanding the Exponential Smoothing Dialog Box
    4. Choosing the Smoothing Constant
      1. Setting Up the Analysis
      2. Using Solver to Find the Best Smoothing Constant
      3. Understanding Solver’s Requirements
      4. The Point
    5. Handling Linear Baselines with Trend
      1. Characteristics of Trend
      2. First Differencing
  12. 5 More Advanced Smoothing Models
    1. Holt’s Linear Exponential Smoothing
      1. About Terminology and Symbols in Handling Trended Series
      2. Using Holt’s Linear Smoothing
      3. Holt’s Method and First Differences
    2. Seasonal Models
      1. Estimating Seasonal Indexes
      2. Estimating the Series Level and First Forecast
      3. Extending the Forecasts to Future Periods
      4. Finishing the One-Step-Ahead Forecasts
      5. Extending the Forecast Horizon
    3. Using Additive Holt-Winters Models
      1. Level
      2. Trend
      3. Season
    4. Formulas for the Holt-Winters Additive and Multiplicative Models
      1. Formulas for the Additive Model
      2. Formulas for the Multiplicative Model
    5. The Models Compared
    6. Damped Trend Forecasts
  13. 6 Forecasting a Time Series: Regression
    1. Forecasting with Regression
      1. Linear Regression: An Example
      2. Using the LINEST( ) Function
    2. Forecasting with Autoregression
      1. Problems with Trends
      2. Correlating at Increasing Lags
      3. A Review: Linear Regression and Autoregression
      4. Adjusting the Autocorrelation Formula
      5. Using ACFs
      6. Understanding PACFs
      7. Using the ARIMA Workbook
  14. 7 Logistic Regression: The Basics
    1. Traditional Approaches to the Analysis
      1. Z-tests and the Central Limit Theorem
      2. Sample Size and Observed Rate
      3. Binomial Distribution
      4. Only One Comparison
      5. Using Chi-Square
      6. Preferring Chi-Square to a Z-test
    2. Regression Analysis on Dichotomies
      1. Homoscedasticity
      2. Residuals Are Normally Distributed
      3. Restriction of Predicted Range
    3. Ah, But You Can Get Odds Forever
      1. Probabilities and Odds
      2. How the Probabilities Shift
      3. Moving On to the Log Odds
  15. 8 Logistic Regression: Further Issues
    1. An Example: Predicting Purchase Behavior
      1. Using Logistic Regression
      2. Calculation of Logit or Log Odds
    2. Comparing Excel with R: A Demonstration
      1. Getting R
      2. Running a Logistic Analysis in R
      3. Importing a csv File into R
      4. Importing From an Open Workbook Into R
      5. Understanding the Long Versus Wide Shape
      6. Running Logistic Regression Using glm
    3. Statistical Tests in Logistic Regression
      1. Models Comparison in Multiple Regression
      2. Calculating the Results of Different Models
      3. Testing the Difference Between the Models
      4. Models Comparison in Logistic Regression
  16. 9 Multinomial Logistic Regression
    1. The Multinomial Problem
    2. Three Alternatives and Three Predictors
      1. Three Intercepts and Three Sets of Coefficients
      2. Dummy Coding to Represent the Outcome Value
      3. Calculating the Logits
      4. Converting the Logits to Probabilities
      5. Calculating the Log Likelihoods
      6. Understanding the Differences Between the Binomial and Multinomial Equations
      7. Optimizing the Equations
    3. Benchmarking the Excel Results Against R
      1. Converting the Raw Data Frame with mlogit.data
      2. Calling the mlogit Function
      3. Completing the mlogit Arguments
    4. Four Outcomes and One Predictor
      1. Multinomial Analysis with an Individual-Specific Predictor
      2. Multinomial Analysis with an Alternative-Specific Predictor
  17. 10 Principal Components Analysis
    1. The Notion of a Principal Component
      1. Reducing Complexity
      2. Understanding Relationships Among Measurable Variables
      3. Maximizing Variance
      4. Components Are Mutually Orthogonal
    2. Using the Principal Components Add-In
      1. The R Matrix
      2. The Inverse of the R Matrix
      3. Matrices, Matrix Inverses, and Identity Matrices
      4. Features of the Correlation Matrix’s Inverse
      5. Matrix Inverses and Beta Coefficients
      6. Singular Matrices
      7. Testing for Uncorrelated Variables
      8. Using Eigenvalues
      9. Using Component Eigenvectors
      10. Factor Loadings
      11. Factor Score Coefficients
    3. Principal Components Distinguished from Factor Analysis
      1. Distinguishing the Purposes
      2. Distinguishing Unique from Shared Variance
      3. Rotating Axes
  18. 11 Box-Jenkins ARIMA Models
    1. The Rationale for ARIMA
      1. Deciding to Use ARIMA
      2. ARIMA Notation
    2. Stages in ARIMA Analysis
    3. The Identification Stage
      1. Identifying an AR Process
      2. Identifying an MA Process
      3. Differencing in ARIMA Analysis
      4. Using the ARIMA Workbook
      5. Standard Errors in Correlograms
      6. White Noise and Diagnostic Checking
      7. Identifying Seasonal Models
    4. The Estimation Stage
      1. Estimating the Parameters for ARIMA(1,0,0)
      2. Comparing Excel’s Results to R’s
      3. Exponential Smoothing and ARIMA(0,0,1)
      4. Using ARIMA(0,1,1) in Place of ARIMA(0,0,1)
    5. The Diagnostic and Forecasting Stages
  19. 12 Varimax Factor Rotation in Excel
    1. Getting to a Simple Structure
      1. Rotating Factors: The Rationale
      2. Extraction and Rotation: An Example
    2. Structure of Principal Components and Factors
      1. Rotating Factors: The Results
      2. Charting Records on Rotated Factors
      3. Using the Factor Workbook to Rotate Components
  20. Index
  21. Code Snippets