Advancing into Analytics

Book description

Data analytics may seem daunting, but if you're an experienced Excel user, you have a unique head start. With this hands-on guide, intermediate Excel users will gain a solid understanding of analytics and the data stack. By the time you complete this book, you'll be able to conduct exploratory data analysis and hypothesis testing using a programming language.

Exploring and testing relationships are core to analytics. By using the tools and frameworks in this book, you'll be well positioned to continue learning more advanced data analysis techniques. Author George Mount, founder and CEO of Stringfest Analytics, demonstrates key statistical concepts with spreadsheets, then pivots your existing knowledge about data manipulation into R and Python programming.

This practical book guides you through:

  • Foundations of analytics in Excel: Use Excel to test relationships between variables and build compelling demonstrations of important concepts in statistics and analytics
  • From Excel to R: Cleanly transfer what you've learned about working with data from Excel to R
  • From Excel to Python: Learn how to pivot your Excel data chops into Python and conduct a complete data analysis

Table of contents

  1. Preface
    1. Learning Objective
    2. Prerequisites
      1. Technical Requirements
      2. Technological Requirements
    3. How I Got Here
    4. “Excel Bad, Coding Good”
    5. The Instructional Benefits of Excel
    6. Book Overview
    7. End-of-Chapter Exercises
    8. This Is Not a Laundry List
    9. Don’t Panic
    10. Conventions Used in This Book
    11. Using Code Examples
    12. O’Reilly Online Learning
    13. How to Contact Us
    14. Acknowledgments
  2. I. Foundations of Analytics in Excel
  3. 1. Foundations of Exploratory Data Analysis
    1. What Is Exploratory Data Analysis?
      1. Observations
      2. Variables
    2. Demonstration: Classifying Variables
    3. Recap: Variable Types
    4. Exploring Variables in Excel
      1. Exploring Categorical Variables
      2. Exploring Quantitative Variables
    5. Conclusion
    6. Exercises
  4. 2. Foundations of Probability
    1. Probability and Randomness
    2. Probability and Sample Space
    3. Probability and Experiments
    4. Unconditional and Conditional Probability
    5. Probability Distributions
      1. Discrete Probability Distributions
      2. Continuous Probability Distributions
    6. Conclusion
    7. Exercises
  5. 3. Foundations of Inferential Statistics
    1. The Framework of Statistical Inference
      1. Collect a Representative Sample
      2. State the Hypotheses
      3. Formulate an Analysis Plan
      4. Analyze the Data
      5. Make a Decision
    2. It’s Your World…the Data’s Only Living in It
    3. Conclusion
    4. Exercises
  6. 4. Correlation and Regression
    1. “Correlation Does Not Imply Causation”
    2. Introducing Correlation
    3. From Correlation to Regression
    4. Linear Regression in Excel
    5. Rethinking Our Results: Spurious Relationships
    6. Conclusion
    7. Advancing into Programming
    8. Exercises
  7. 5. The Data Analytics Stack
    1. Statistics Versus Data Analytics Versus Data Science
      1. Statistics
      2. Data Analytics
      3. Business Analytics
      4. Data Science
      5. Machine Learning
      6. Distinct, but Not Exclusive
    2. The Importance of the Data Analytics Stack
      1. Spreadsheets
      2. Databases
      3. Business Intelligence Platforms
      4. Data Programming Languages
    3. Conclusion
    4. What’s Next
    5. Exercises
  8. II. From Excel to R
  9. 6. First Steps with R for Excel Users
    1. Downloading R
    2. Getting Started with RStudio
    3. Packages in R
    4. Upgrading R, RStudio, and R Packages
    5. Conclusion
    6. Exercises
  10. 7. Data Structures in R
    1. Vectors
    2. Indexing and Subsetting Vectors
    3. From Excel Tables to R Data Frames
    4. Importing Data in R
    5. Exploring a Data Frame
    6. Indexing and Subsetting Data Frames
    7. Writing Data Frames
    8. Conclusion
    9. Exercises
  11. 8. Data Manipulation and Visualization in R
    1. Data Manipulation with dplyr
      1. Column-Wise Operations
      2. Row-Wise Operations
      3. Aggregating and Joining Data
      4. dplyr and the Power of the Pipe (%>%)
      5. Reshaping Data with tidyr
    2. Data Visualization with ggplot2
    3. Conclusion
    4. Exercises
  12. 9. Capstone: R for Data Analytics
    1. Exploratory Data Analysis
    2. Hypothesis Testing
      1. Independent Samples t-test
      2. Linear Regression
      3. Train/Test Split and Validation
    3. Conclusion
    4. Exercises
  13. III. From Excel to Python
  14. 10. First Steps with Python for Excel Users
    1. Downloading Python
    2. Getting Started with Jupyter
    3. Modules in Python
    4. Upgrading Python, Anaconda, and Python Packages
    5. Conclusion
    6. Exercises
  15. 11. Data Structures in Python
    1. NumPy Arrays
    2. Indexing and Subsetting NumPy Arrays
    3. Introducing Pandas DataFrames
    4. Importing Data in Python
    5. Exploring a DataFrame
      1. Indexing and Subsetting DataFrames
      2. Writing DataFrames
    6. Conclusion
    7. Exercises
  16. 12. Data Manipulation and Visualization in Python
    1. Column-Wise Operations
    2. Row-Wise Operations
    3. Aggregating and Joining Data
    4. Reshaping Data
    5. Data Visualization
    6. Conclusion
    7. Exercises
  17. 13. Capstone: Python for Data Analytics
    1. Exploratory Data Analysis
    2. Hypothesis Testing
      1. Independent Samples t-test
      2. Linear Regression
      3. Train/Test Split and Validation
    3. Conclusion
    4. Exercises
  18. 14. Conclusion and Next Steps
    1. Further Slices of the Stack
    2. Research Design and Business Experiments
    3. Further Statistical Methods
    4. Data Science and Machine Learning
    5. Version Control
    6. Ethics
    7. Go Forth and Data How You Please
    8. Parting Words
  19. Index

Product information

  • Title: Advancing into Analytics
  • Author(s): George Mount
  • Release date: April 2021
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492094340