Advancing into Analytics

Book description

Data analytics may seem daunting, but if you’re familiar with Excel, you have a head start that can help you make the leap into analytics. Advancing into Analytics will lower your learning curve.

Author George Mount, founder and CEO of Stringfest Analytics, clearly and gently guides intermediate Excel users to a solid understanding of analytics and the data stack. This book demonstrates key statistical concepts in spreadsheets, then pivots your existing knowledge of data manipulation into R and Python programming.

With this practical book at your side, you’ll learn how to:

  • Explore a dataset for potential research questions, check assumptions, and build hypotheses
  • Make compelling business recommendations using inferential statistics
  • Load, view, and write datasets using R and Python
  • Perform common data wrangling tasks such as sorting, filtering, and aggregating using R and Python
  • Navigate and execute code in Jupyter notebooks
  • Identify, install, and implement the most useful open source packages for your needs
  • And more

Table of contents

  1. Preface
    1. Learning Objective
    2. Pre-requisites
      1. Technical requirements
      2. Technological requirements
    3. How I Got Here
    4. “Excel Bad, Coding Good”
    5. The Instructional Benefits of Excel
    6. Book Overview
      1. Part 1: Statistical foundations in Excel
      2. Part 2: Moving from Excel to R
      3. Part 3: Moving from Excel to Python
    7. End-of-Chapter Exercises
    8. This Is Not a Laundry List
    9. Don’t Panic
    10. Conventions Used in This Book
    11. Using Code Examples
    12. O’Reilly Online Learning
    13. How to Contact Us
  2. 1. Foundations of Exploratory Data Analysis
    1. What Is Exploratory Data Analysis?
      1. Observations
      2. Variables
      3. Categorical Variables
      4. Quantitative Variables
    2. Demonstration: Classifying Variables
      1. Recap: Variable types
    3. Exploring Variables in Excel
      1. Exploring categorical variables
      2. Exploring quantitative variables
    4. Conclusion
    5. Exercises
  3. 2. Foundations of Probability
    1. Probability and Randomness
    2. Probability and Sample Space
    3. Probability and Experiments
    4. Unconditional Probability
    5. Probability Distributions
      1. Discrete probability distributions
      2. Continuous probability distributions
    6. Conclusion
    7. Exercises
  4. 3. Foundations of Inferential Statistics
    1. The framework of statistical inference
      1. Collect a representative sample
      2. State the hypotheses
      3. Formulate an analysis plan
      4. Analyze the data
      5. Make a decision
    2. “It’s your world, the data’s only living in it”
    3. Conclusion
    4. Exercises
  5. 4. Correlation and Regression
    1. “Correlation Does Not Imply Causation”
    2. Introducing Correlation
    3. From Correlation to Regression
    4. Linear Regression in Excel
    5. Rethinking Our Results: Spurious Relationships
    6. Conclusion
    7. Advancing into Programming
    8. Exercises
  6. 5. The Data Analytics Stack
    1. Statistics versus Data Analytics versus Data Science
      1. Statistics
      2. Data analytics
      3. Data science
      4. Distinct, but not exclusive
    2. The Importance of the Data Analytics Stack
      1. Spreadsheets
      2. Databases
      3. Business intelligence platforms
      4. Data programming languages
    3. Recap: Thinking the Stack
    4. What’s Next
    5. Exercises
  7. 6. First Steps with R for Excel Users
    1. Downloading R
    2. Getting Started with RStudio
    3. Packages in R
    4. Upgrading R, RStudio, and R packages
    5. Exercises
  8. 7. Data Structures in R
    1. Vectors
    2. Indexing and subsetting vectors
    3. From Excel Tables to R Data Frames
    4. Importing data in R
      1. File paths and directories
    5. Exploring a data frame
      1. Indexing and subsetting data frames
    6. Writing data frames
    7. Conclusion
    8. Exercises
  9. 8. Data Manipulation and Visualization in R
    1. Data manipulation with dplyr
      1. Column-wise operations
      2. Row-wise operations
      3. Aggregating and joining data
      4. dplyr and the power of the pipe %>%
      5. Reshaping data with tidyr
    2. Data visualization with ggplot2
    3. Exercises
  10. 9. Capstone: R for Data Analytics
    1. Hypothesis testing
      1. Independent samples t-test
      2. Linear regression
      3. Train/test split
    2. Exercises
  11. 10. First Steps with Python for Excel Users
    1. Downloading Python
    2. Getting started with Jupyter
    3. Modules in Python
    4. Exercises
  12. 11. Data Structures in Python
    1. Introducing Pandas
    2. Importing Data in Python
      1. Indexing and subsetting DataFrames
    3. Writing data frames
    4. Exercises
  13. 12. Data Manipulation and Visualization in Python
    1. Column-wise operations
    2. Row-wise operations
    3. Aggregating and joining data
    4. Reshaping data
    5. Data visualization
    6. Exercises
  14. 13. Capstone: Python for Data Analytics
    1. Hypothesis testing
      1. Independent samples t-test
      2. Linear regression
      3. Train/test split
    2. Exercises
  15. 14. Conclusion and Next Steps
    1. Further Slices of the Stack
    2. Research Design and Business Experiments
    3. Further Statistical Methods
    4. Data Science and Machine Learning
    5. Version Control
    6. Go Forth and Data How You Please
    7. Parting Words

Product information

  • Title: Advancing into Analytics
  • Author(s): George Mount
  • Release date: October 2021
  • Publisher(s): O’Reilly Media, Inc.
  • ISBN: 9781492094326