Football Analytics with Python & R

Book description

Baseball is not the only sport to use "moneyball." American football teams, fantasy football players, fans, and gamblers are increasingly using data to gain an edge on the competition. Professional and college teams use data to help identify team needs and select players to fill those needs. Fantasy football players and fans use data to try to defeat their friends, while sports bettors use data in an attempt to defeat the sportsbooks.

In this concise book, Eric Eager and Richard Erickson provide a clear introduction to analyzing football data with statistical models in both Python and R. Whether your goal is to qualify for an entry-level football analyst position, dominate your fantasy football league, or simply learn R and Python through fun example cases, this book is your starting place.

Through case studies in both Python and R, you'll learn to:

  • Obtain NFL data using Python and R packages and web scraping
  • Visualize and explore data
  • Apply regression models to play-by-play data
  • Extend regression models to classification problems in football
  • Apply data science to sports betting with individual player props
  • Understand player athletic attributes using multivariate statistics

Table of contents

  1. Preface
    1. Who This Book Is For
    2. Who This Book Is Not For
    3. How We Think About Data and How to Use This Book
    4. A Football Example
    5. What You Will Learn from Our Book
    6. Conventions Used in This Book
    7. Using Code Examples
    8. O’Reilly Online Learning
    9. How to Contact Us
    10. Acknowledgments
  2. 1. Football Analytics
    1. Baseball Has the Three True Outcomes: Does Football?
    2. Do Running Backs Matter?
    3. How Data Can Help Us Contextualize Passing Statistics
    4. Can You Beat the Odds?
    5. Do Teams Beat the Draft?
    6. Tools for Football Analytics
    7. First Steps in Python and R
    8. Example Data: Who Throws Deep?
      1. nflfastR in R
      2. nfl_data_py in Python
    9. Data Science Tools Used in This Chapter
    10. Suggested Readings
  3. 2. Exploratory Data Analysis: Stable Versus Unstable Quarterback Statistics
    1. Defining Questions
    2. Obtaining and Filtering Data
    3. Summarizing Data
    4. Plotting Data
      1. Histograms
      2. Boxplots
    5. Player-Level Stability of Passing Yards per Attempt
      1. Deep Passes Versus Short Passes
      2. So, What Should We Do with This Insight?
    6. Data Science Tools Used in This Chapter
    7. Exercises
    8. Suggested Readings
  4. 3. Simple Linear Regression: Rushing Yards Over Expected
    1. Exploratory Data Analysis
    2. Simple Linear Regression
    3. Who Was the Best in RYOE?
    4. Is RYOE a Better Metric?
    5. Data Science Tools Used in This Chapter
    6. Exercises
    7. Suggested Readings
  5. 4. Multiple Regression: Rushing Yards Over Expected
    1. Definition of Multiple Linear Regression
    2. Exploratory Data Analysis
    3. Applying Multiple Linear Regression
    4. Analyzing RYOE
    5. So, Do Running Backs Matter?
    6. Assumption of Linearity
    7. Data Science Tools Used in This Chapter
    8. Exercises
    9. Suggested Readings
  6. 5. Generalized Linear Models: Completion Percentage over Expected
    1. Generalized Linear Models
    2. Building a GLM
    3. GLM Application to Completion Percentage
    4. Is CPOE More Stable Than Completion Percentage?
    5. A Question About Residual Metrics
    6. A Brief Primer on Odds Ratios
    7. Data Science Tools Used in This Chapter
    8. Exercises
    9. Suggested Readings
  7. 6. Using Data Science for Sports Betting: Poisson Regression and Passing Touchdowns
    1. The Main Markets in Football
    2. Application of Poisson Regression: Prop Markets
    3. The Poisson Distribution
    4. Individual Player Markets and Modeling
    5. Poisson Regression Coefficients
    6. Closing Thoughts on GLMs
    7. Data Science Tools Used in This Chapter
    8. Exercises
    9. Suggested Readings
  8. 7. Web Scraping: Obtaining and Analyzing Draft Picks
    1. Web Scraping with Python
    2. Web Scraping in R
    3. Analyzing the NFL Draft
    4. The Jets/Colts 2018 Trade Evaluated
    5. Are Some Teams Better at Drafting Players Than Others?
    6. Data Science Tools Used in This Chapter
    7. Exercises
    8. Suggested Readings
  9. 8. Principal Component Analysis and Clustering: Player Attributes
    1. Web Scraping and Visualizing NFL Scouting Combine Data
    2. Introduction to PCA
    3. PCA on All Data
    4. Clustering Combine Data
      1. Clustering Combine Data in Python
      2. Clustering Combine Data in R
      3. Closing Thoughts on Clustering
    5. Data Science Tools Used in This Chapter
    6. Exercises
    7. Suggested Readings
  10. 9. Advanced Tools and Next Steps
    1. Advanced Modeling Tools
      1. Time Series Analysis
      2. Multivariate Statistics Beyond PCA
      3. Quantile Regression
      4. Bayesian Statistics and Hierarchical Models
      5. Survival Analysis/Time-to-Event
      6. Bayesian Networks/Structural Equation Modeling
      7. Machine Learning
    2. Command Line Tools
      1. Bash Example
      2. Suggested Readings for bash
    3. Version Control
      1. Git
      2. GitHub and GitLab
      3. GitHub Web Pages and Résumés
    4. Suggested Reading for Git
    5. Style Guides and Linting
    6. Packages
      1. Suggested Readings for Packages
    7. Computer Environments
    8. Interactives and Report Tools to Share Data
    9. Artificial Intelligence Tools
    10. Conclusion
  11. A. Python and R Basics
    1. Obtaining Python and R
    2. Local Installation
      1. Cloud-Based Options
    3. Scripts
    4. Packages in Python and R
    5. nflfastR and nfl_data_py Tips
    6. Integrated Development Environments
    7. Basic Python Data Types
    8. Basic R Data Types
  12. B. Summary Statistics and Data Wrangling: Passing the Ball
    1. Basic Statistics
      1. Averages
      2. Variability and Distribution
      3. Uncertainty Around Estimates
    2. Filtering and Selecting Columns
    3. Calculating Summary Statistics with Python and R
    4. A Note About Presenting Summary Statistics
    5. Improving Your Presentation
    6. Exercises
    7. Suggested Readings
  13. C. Data-Wrangling Fundamentals
    1. Logic Operators
    2. Filtering and Sorting Data
    3. Cleaning
    4. Piping in R
    5. Checking and Cleaning Data for Outliers
    6. Merging Multiple Datasets
  14. Glossary
  15. Index
  16. About the Authors

Product information

  • Title: Football Analytics with Python & R
  • Author(s): Eric A. Eager, Richard A. Erickson
  • Release date: August 2023
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492099628