Chapter 1. Football Analytics
American football (also known as gridiron football or North American football, and henceforth simply called football) is undergoing a drastic shift toward the quantitative. Until the last half decade or so, football analytics was largely confined to a few seminal pieces of work. Arguably the earliest example of analytics being used in football comes from former Brigham Young University, Chicago Bears, Cincinnati Bengals, and San Diego Chargers quarterback Virgil Carter, who created the notion of an expected point as coauthor of the 1971 paper “Technical Note: Operations Research in Football,” before teaming with the legendary Bill Walsh as the first quarterback to execute what is now known as the West Coast offense.
The idea of an expected point is incredibly important in football because the game by its very nature is discrete: a finite collection of plays (also called downs) that require the offense to go a certain distance (in yards) before having to surrender the ball to the opposing team. If the line to gain is the opponent’s end zone, the offense scores a touchdown, which is worth, on average, about seven points after the post-touchdown conversion. Hence, the expected point provides an estimated, or expected, value for the number of points you would expect a team to score given the current game situation on that drive.
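To make this concrete, here is a minimal sketch, assuming the `nflfastR` tools introduced later in this chapter (the play-by-play data include a precomputed expected-points column, `ep`), that averages expected points by down:

```r
## R: a minimal sketch, assuming the nflfastR tools introduced later
## in this chapter; the pbp data include an expected points column (ep)
library(tidyverse)
library(nflfastR)

pbp_r <- load_pbp(2021)

pbp_r |>
  filter(!is.na(down), !is.na(ep)) |>
  group_by(down) |>
  summarize(mean_ep = mean(ep))  # average expected points by down
```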
Football statistics have largely been confined to offensive players and doled out in the currency of yards gained and touchdowns scored. The problem with this is obvious. If a player catches a pass to gain 7 yards but 8 are required for a first down or a touchdown, the player did not gain a first down. Conversely, if a player gains 5 yards when 5 are required, the player gained a first down. Hence, “enough” yards can be better than “more” yards, depending on the context of the play. As a second example, if it takes a team two plays to travel 70 yards to score a touchdown, with one player gaining the first 65 yards and the second gaining the final 5, why should the second player get all the credit for the score?
In 1988, Bob Carroll, Pete Palmer, and John Thorn wrote The Hidden Game of Football (Grand Central Publishing), which further explored the notions of expected points. In 2007, Brian Burke, who was a US Navy pilot before creating the Advanced Football Analytics website (http://www.advancedfootballanalytics.com), formulated the expected-points and expected-points-added approach, along with building a win probability model responsible for some key insights, including the 4th Down Bot at the New York Times website. Players may be evaluated by the number of expected points or win probability points added to their teams when those players did things like throw or catch passes, run the ball, or sack the quarterback.
The work of Burke inspired the open source work of Ron Yurko, Sam Ventura, and Max Horowitz of Carnegie Mellon University. The trio built `nflscrapR`, an R package that scraped NFL play-by-play data. The `nflscrapR` package was built to display their own versions of expected points added (EPA) and win probability (WP) models. Using this framework, they also replicated the famous wins above replacement (WAR) framework from baseball for quarterbacks, running backs, and wide receivers, which was published in 2018. This work was later extended using different data and methods by Eric and his collaborator George Chahrouri in 2020. Eric’s version of WAR, and its analogous model for college football, are used throughout the industry to this day.
The `nflscrapR` package served as a catalyst for the popularization of modern tools that use data to study football, most of which use a framework that will be replicated constantly throughout this book. The process of building an expectation for an outcome—in the form of points, completion percentage, rushing yards, draft-pick outcome, and many more—and measuring players or teams via the residual (that is, the difference between the value expected by the model and the observed value) is a process that transcends football. In soccer, for example, expected goals (xG) are the cornerstone metric upon which players and clubs are measured in the sport known as “the Beautiful Game”. And shot quality—the expected rate at which a shot is made in basketball—is a ubiquitous measure for players and teams on the hardwood. The features that go into these models, and the forms that they take, are the subject of constant research whose surface we will scratch in this book.
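In miniature, the framework looks like this (the numbers here are hypothetical, purely for illustration):

```r
## R: the observed-minus-expected pattern in miniature (hypothetical numbers)
observed <- 8    # yards actually gained on a carry
expected <- 4.2  # a model's expected yards for that down, distance, etc.
residual <- observed - expected
residual         # 3.8 yards over expected credited to the player
```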
The rise of tools like `nflscrapR` allowed more people to show their analytical skills and flourish in the public sphere. Analysts were hired based on their tweets on Twitter because of their ingenious approaches to measuring team performance. Decisions to punt or go for it on a fourth down were evaluated by how they affected the team’s win probability. Ben Baldwin and Sebastian Carl created a spin-off R package, `nflfastR`. This package updated the models of Yurko, Ventura, and Horowitz, along with adding many of their own—models that we’ll use in this book. More recently, the data contained in the `nflfastR` package has been cloned into Python via the `nfl_data_py` package by Cooper Adams.
We hope that this book will give you the basic tools to approach some of the initial problems in football analytics and will serve as a jumping-off point for future work.
Tip
People looking for the cutting edge of sports analytics, including football, may want to check out the MIT Sloan Sports Analytics Conference. Since its founding in 2006, Sloan has emerged as a leading venue for the presentation of new tools for football (and other) analytics. Other, more accessible conferences, like the Carnegie Mellon Sports Analytics Conference and New England Statistics Symposium, are fantastic places for students and practitioners to present their work. Most of these conferences have hackathons for people looking to make an impression on the industry.
Baseball Has the Three True Outcomes: Does Football?
Baseball pioneered the use of quantitative metrics, and the creation of the Society for American Baseball Research (SABR) led to the term sabermetrics to describe baseball analysis. Because of this long history, we start by looking at the metrics commonly used in baseball—specifically, the three true outcomes. One of the reasons the game of baseball has trended toward the three true outcomes (walks, strikeouts, and home runs) is that they are the easiest to predict from one season to the next. What batted balls do when they are in play is noisier and is the source of much of the variance in perceived play from one year to the next. The trend toward the three true outcomes has also spurred more elaborate data-collection methods and subsequent analysis in an attempt to tease additional signal from batted-ball data.
Stability analysis is a cornerstone of team and player evaluation. Stable production is sticky, or repeatable, and is the kind of production decision-makers should want to buy into year in and year out. Stability analysis therefore examines whether something is stable—in our case, football observations and model outputs, and you will use this analysis in “Player-Level Stability of Passing Yards per Attempt”, “Is RYOE a Better Metric?”, “Analyzing RYOE”, and “Is CPOE More Stable Than Completion Percentage?”. On the other hand, how well a team or player does in high-leverage situations (plays that have greater effects on the outcome of the game, such as converting third downs) can have an outsized impact on win-loss record or eventual playoff fate. But if such performance doesn’t help us predict what we want year in and year out, it might be better to ignore it, or to sell it to other decision-makers.
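As a preview of how such an analysis might look, here is a hedged sketch: assume a hypothetical dataframe `season_stats` with columns `player_id`, `season`, and `metric`; the year-over-year correlation of the metric is a simple measure of its stability:

```r
## R: a minimal stability-analysis sketch; season_stats is a hypothetical
## dataframe with columns player_id, season, and metric
library(tidyverse)

lagged <- season_stats |>
  arrange(player_id, season) |>
  group_by(player_id) |>
  mutate(metric_prior = lag(metric)) |>  # same player, previous season
  ungroup()

# correlation between seasons: closer to 1 means more stable (repeatable)
cor(lagged$metric, lagged$metric_prior, use = "complete.obs")
```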
Using play-by-play data from `nflfastR`, Chapter 2 shows you how to slice and dice football passing data into subsets that partition a player’s performance into stable and unstable production. Through exploratory data analysis techniques, you can see whether any players break the mold and what to do with them. We preview how this work can aid in the process of feature engineering for prediction in Chapter 2.
Do Running Backs Matter?
For most of the history of football, the best players played running back (in fact, early football didn’t include the forward pass until 1906, when President Teddy Roosevelt worked with college football to introduce passing and make the game safer). The importance of the running back used to be an accepted truism across all levels of football, until the forward pass became an integral part of the game. Following the forward pass, rule and technology changes—along with Carter (mentioned earlier in this chapter) and his quarterbacks coach, Walsh—made throwing the football more efficient relative to running the football.
Many of our childhood memories from the 1990s revolve around Emmitt Smith and Barry Sanders trading the privilege of being the NFL rushing champion every other year. College football fans from the 1980s may remember Herschel Walker giving way to Bo Jackson in the Southeastern Conference (SEC). Even many younger fans from the 2000s and 2010s can still remember Adrian Peterson earning the last nonquarterback most valuable player (MVP) award. During the 2012 season, he rushed for over 2,000 yards while carrying an otherwise-bad Minnesota Vikings team to the playoffs.
However, the current prevailing wisdom among football analytics folks is that the running back position does not matter as much as other positions. This is for a few reasons. First, running the football is not as efficient as passing it. This is plain to see in simple analyses using yards per play, but also through more advanced means like EPA. Even the least efficient passing attacks usually produce, on average, more yards or expected points per play than running.
Second, differences in the actual player running the ball do not elicit the kind of change in rushing production that similar differences do for quarterbacks, wide receivers, or offensive or defensive linemen. In other words, additional resources used to pay for the services of this running back over that running back are probably not worth it, especially if those resources can be used on other positions. The marketplace that is the NFL has provided additional evidence that this is true: running back salaries and the draft capital used on the position have declined to lows not previously seen.
This didn’t keep the New York Giants from using the second-overall pick in the 2018 NFL Draft on Pennsylvania State University’s Saquon Barkley, which was met with jeers from the analytics community, and a counter from Giants General Manager Dave Gettleman. In a post-draft press conference for the ages, Gettleman, sitting next to reams of bound paper, made fun of the analytics jabs toward his pick by mimicking a person typing furiously on a typewriter.
Chapters 3 and 4 look at techniques for controlling play-by-play rushing data for game situation, to see how much of the variability in rushing success has to do with the player running the ball.
How Data Can Help Us Contextualize Passing Statistics
As we’ve stated previously, the passing game dominates football, and in Appendix B, we show you how to examine the basics of passing game data. In recent years, analysts have taken a deeper look into what constitutes accuracy at the quarterback position because raw completion percentage numbers, even among quarterbacks who aren’t considered elite players, have skyrocketed. The work of Josh Hermsmeyer with the Baltimore Ravens and later FiveThirtyEight established the significance of air yards, which is the distance traveled by the pass from the line of scrimmage to the intended receiver.
While Hermsmeyer’s initial research was in the fantasy football space, it spawned a significant amount of basic research into the passing game, giving rise to metrics like completion percentage over expected (CPOE), one of the most predictive measures of quarterback quality available today.
In Chapter 5, we introduce generalized linear models in the form of logistic regression. You’ll use this to estimate the completion probability of a pass, given multiple situational factors that affect a throw’s expected success. You’ll then look at a player’s residuals (that is, how well a player actually performs compared to the model’s prediction for that performance) and see whether there is more or less stability in the residuals—the CPOE—than in actual completion percentage.
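As a rough preview of that workflow, the sketch below fits a toy completion-probability model with a single predictor (pass depth) and computes a CPOE-style residual. Chapter 5 builds the real model with more situational factors; using `air_yards` alone here is an oversimplification:

```r
## R: a toy sketch of the Chapter 5 workflow, assuming nflfastR data;
## air_yards as the lone predictor is an oversimplification
library(tidyverse)
library(nflfastR)

pbp_r <- load_pbp(2021)

pass_r <- pbp_r |>
  filter(play_type == "pass", !is.na(air_yards), !is.na(complete_pass))

fit <- glm(complete_pass ~ air_yards, data = pass_r, family = binomial)

pass_r <- pass_r |>
  mutate(exp_comp = predict(fit, type = "response"),
         cpoe = complete_pass - exp_comp)  # actual minus expected
```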
Can You Beat the Odds?
In 2018, the Professional and Amateur Sports Protection Act of 1992 (PASPA), which had banned sports betting in the United States (outside of Nevada), was overturned by the US Supreme Court. This court decision opened the floodgates for states to make legal what many people were already doing illegally: betting on football.
The difficult thing about sports betting is the house advantage—referred to as the vigorish, or vig—which makes it so that a bettor has to win more than 50% of their bets to break even. Thus, a cost exists for simply playing the game that needs to be overcome in order to beat the sportsbook (or simply the book for short).
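A quick worked example: at a standard price of -110, you risk $110 to win $100, so the break-even rate is 110 / (110 + 100):

```r
## R: break-even win rate at a standard -110 price
risk <- 110  # dollars risked
win  <- 100  # dollars won if the bet cashes
risk / (risk + win)  # about 0.524: win ~52.4% of bets just to break even
```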
American football is the largest gambling market in North America. Most successful sports bettors in this market use some form of analytics to overcome the house advantage. Chapter 6 examines the market for passing touchdowns per game props, showing how a bettor can arrive at an internal price for such a market and compare it to the market price.
Do Teams Beat the Draft?
Owners, fans, and the broader NFL community evaluate coaches and general managers based on the quality of talent that they bring to their team from one year to the next. One complaint against New England Patriots Coach Bill Belichick, maybe the best nonplayer coach in the history of the NFL, is that he has not drafted well in recent seasons. Has that been a sequence of fundamental missteps or just a run of bad luck?
One argument in support of coaches such as Belichick may be “Well, they are always drafting in the back of the draft, since they are usually a good team.” Luckily, you can use math to control for this and to see whether we can reject the hypothesis that all front offices are equally good at drafting after accounting for draft capital used. Draft capital comprises the resources used during the NFL Draft—notably, the number of picks, pick rounds, and pick numbers.
In Chapter 7, we scrape publicly available draft data and test the hypothesis that all front offices are equally good at drafting after accounting for draft capital used, with surprising results. In Chapter 8, we scrape publicly available NFL Scouting Combine data and use dimension-reduction tools and clustering to see how groups of players emerge.
Tools for Football Analytics
Football analytics and, more broadly, data science require a diverse set of tools, and successful practitioners need a solid understanding of them. Statistical programming languages, like Python and R, are a backbone of our data science toolbox. These languages allow us to clean our datasets, conduct our analyses, and readily reuse our methods.
Although many people commonly use spreadsheets (such as Microsoft Excel or Google Sheets) for data cleaning and analysis, we find spreadsheets do not scale well. For example, when working with large datasets containing tracking data, which can include thousands of rows of data per play, spreadsheets simply are not up to the task. Likewise, people commonly use business intelligence (BI) tools such as Microsoft Power BI and Tableau because of their power and ability to scale. But these tools tend to focus on point-and-click methods and require licenses, especially for commercial use.
Programming languages also allow for easy reuse because copying and pasting formulas in spreadsheets can be tedious and error prone. Lastly, spreadsheets (and, more broadly, point-and-click tools) allow undocumented errors. For example, spreadsheets do not have a way to catch a copying and pasting mistake. Furthermore, modern data science tools allow code, data, and results to be blended together in easy-to-use interfaces. Common languages include Python, R, Julia, MATLAB, and SAS. Additional languages continue to appear as computer science advances.
As practitioners of data science, we use R and Python daily for our work, which has collectively spanned the space of applied mathematics, applied statistics, theoretical ecology, and, of course, football analytics. Of the languages listed previously, Python and R offer the benefit of larger user bases (and hence likely contain the tools and models we need). Both R and Python (as well as Julia) are open source. As of this writing, Julia does not have the user base of R or Python; it may be the cutting edge of statistical computing, a dead end that fizzles out, or possibly both.
Open source means two types of freedom. First, anybody can access all the code in the language, like free speech. This allows volunteers to help improve the language, such as ensuring that users can debug the code and extend the language through add-on packages (like the `nflfastR` package in R or the `nfl_data_py` package in Python). Second, open source also offers the benefit of being free to use for users, like free drinks. Hence users do not need to pay thousands of dollars annually in licensing fees. We were initially trained in R but have learned Python over the course of our jobs. Either language is well suited for football analytics (and sports analytics in general).
Note
Appendix A includes instructions for obtaining R and Python for those of you who do not currently have access to these languages. This includes either downloading and installing the programs or using web-hosted resources. The appendix also describes programs to help you more easily work with these languages, such as editors and integrated development environments (IDEs).
We encourage you to pick one language for your work with this book and learn that language well. Learning a second programming language will be easier if you understand the programming concepts behind a first language. Then you can relate the concepts back to your understanding of your original computer language.
Tip
For readers who want to learn the basics of programming before proceeding with our book, we recommend Al Sweigart’s Invent Your Own Computer Games with Python, 4th edition (No Starch Press, 2016) or Garrett Grolemund’s Hands-On Programming with R (O’Reilly, 2014). Either resource will hold your hand to help you learn the basics of programming.
Although many people pick favorite languages and sometimes argue about which coding language is better (similar to Coke versus Pepsi or Ford versus General Motors), we have seen both R and Python used in production and also used with large data and complex models. For example, we have used R with 100 GB files on servers with sufficient memory. Both of us began our careers coding almost exclusively in R but have learned to use Python when the situation has called for it. Furthermore, the tools often have complementary roles, especially for advanced methods, and knowing both languages lets you have options for problems you may encounter.
Tip
When picking a language, we suggest you use what your friends use. If all your friends speak Spanish, learning Spanish will probably make communicating with them easier. You can then teach them your native language too. The same holds for programming: your friends can help you debug and troubleshoot. If you still need help deciding, open up both languages and play around for a little bit to see which one you like better. Personally, we like R when working with data, because of R’s data manipulation tools, and Python when building and deploying new models, because of Python’s cleaner syntax for writing functions.
First Steps in Python and R
Tip
If you are familiar with R and Python, you’ll still benefit from skimming this section to see how we teach a tool you are familiar with.
Opening a computer terminal may be intimidating for many people. For example, many of our friends and family will walk by our computers, see code up on the screens, and immediately turn their heads in disgust (Richard’s dad) or fear (most other people). However, terminals are quite powerful and allow more to be done with less, once you learn the language. This section will help you get started using Python or R.
The first step for using R or Python is either to install it on your computer or use a web-based version of the program. Various options exist for installing or otherwise accessing Python and R and then using them on your computer. Appendix A contains steps for this as well as installation options.
Note
People, like Richard, who follow the Green Bay Packers are commonly called Cheeseheads. Likewise, people who use Python are commonly called Pythonistas, and people who use R are commonly called useRs.
Once you have access to R or Python, you have an expensive graphing calculator (for example, your $1,000 laptop). In fact, both Eric and Richard, in lieu of using an actual calculator, will often calculate silly things like point spreads or totals in the console when in need of a quick calculation. Let’s see some things you can do. Type `2 + 2` in either the Python or R console:

```
2 + 2
```

Which results in:

```
4
```
Note
People use comments to leave notes to themselves and others in code. Both Python and R use the `#` symbol for comments (the pound symbol for the authors, or hashtag for younger readers). Comments are text (within code) that the computer does not read but that helps humans understand the code. In this book, we use two comment symbols to tell you whether a code block is Python (`## Python`) or R (`## R`).
You may also save numbers as variables. In Python, you could define `z` to be `2` and then reuse `z` and divide by 3:

```python
## Python
z = 2
z / 3
```

Resulting in:

```
0.6666666666666666
```
Tip
In R, either `<-` or `=` may be used to create variables. We use `<-` for two reasons. First, in this book, this helps you see the difference between R and Python code. Second, we use this style in our day-to-day programming as well. Chapter 9 discusses code styles more. Regardless of which operator you use, be consistent with your programming style in any language. Your future self (and others who read your code) will thank you.
In R, you can also define `z` to be `2` and then reuse `z` and divide by 3:

```r
## R
z <- 2
z / 3
```

Resulting in:

```
[1] 0.6666667
```
Note
Python and R format outputs differently. Python shows more digits and does not round the displayed value, whereas R shows fewer digits and rounds the display.
Example Data: Who Throws Deep?
Now that you have seen some basics in R, let’s dive into an example with football data. You will use the `nflfastR` data for many of the examples in this book. This data may be installed as an R package or as the Python package `nfl_data_py`. Specifically, we will explore the broad (and overly simple) question “Who were the most aggressive quarterbacks in 2021?” We will start off introducing the package using R because the data originated with R.
Note
Both Python and R have flourished because they readily allow add-on packages. Conda exists as one tool for managing these add-ons. Chapter 9 and Appendix A discuss these add-ons in greater detail. In general, you can install packages in Python by typing `pip install package name` or `conda install package name` in the terminal (such as the bash shell on Linux, the Zsh shell on macOS, or the command prompt on Microsoft Windows). Sometimes you will need to use `pip3`, depending on your operating system’s configuration, if you are using the `pip` package manager system. For a concrete example, to install the `seaborn` package, you could type `pip install seaborn` in your terminal. In general, packages in R can be installed by opening R and then typing `install.packages("package name")`. For example, to install the `tidyverse` collection of packages, open R and run `install.packages("tidyverse")`.
nflfastR in R
Starting with R, install the `nflfastR` package:

```r
## R
install.packages("nflfastR")
```
Tip
Single quotation marks around a name, such as `'x'`, and double quotes, such as `"x"`, are both acceptable to languages such as Python or R. Make sure the opening and closing quotes match. For example, `'x"` would not be acceptable. You may use both single and double quotes to place quotes inside of quotes. For example, in a figure caption, you might write `"Panthers' points earned"` or `'Air temperature ("true temperature")'`. Or in Python, you can use a combination of quotes later for inputs such as `"team == 'GB'"` because you’ll need to nest quotes inside of quotes.
Next, load this package as well as the `tidyverse`, which gives you tools to manipulate and plot the data:

```r
## R
library("tidyverse")
library("nflfastR")
```
Note
Base R contains dataframes as `data.frame()`. We use tibbles from the tidyverse instead, because these print nicer to screens and include other useful features. Many users consider base R’s `data.frame()` to be a legacy object, although you will likely see these objects when looking at help files and examples on the web. Lastly, you might see the `data.table` package in R. The `data.table` extension of dataframes is similar to a tibble, works better with larger data (for example, 10 GB or 100 GB files), and has a more compact coding syntax, but it comes with the trade-off of being less user-friendly compared to tibbles. In our own work, we use a `data.table` rather than a tibble or `data.frame` when we need high performance at the cost of code readability.
Once you’ve loaded the packages, you need to load the data from each play, or the play-by-play (pbp) data, for the 2021 season. Use the `load_pbp()` function from `nflfastR` and call the data `pbp_r` (the `_r` ending helps you tell that the code is from an R example in this book):

```r
## R
pbp_r <- load_pbp(2021)
```
Note
We generally include `_py` in the names of Python dataframes and `_r` in the names of R dataframes to help you identify the language for various code objects.
After loading the data as `pbp_r`, pass (or pipe) the data along to be filtered by using `|>`. Use the `filter()` function to select only data where passing plays occurred (`play_type == "pass"`) and where `air_yards` are not missing, or `NA` in R syntax (in plain English, the pass had a recorded depth). Chapter 2, Appendix B, and Appendix C cover data manipulation more, and most examples in this book use data wrangling to format data. So right now, simply type this code. You can probably figure out what the code is doing, but don’t worry about understanding it too much:

```r
## R
pbp_r_p <- pbp_r |>
  filter(play_type == 'pass' & !is.na(air_yards))
```
Now you’ll look at the average depth of target (aDOT), or mean air yards per pass, for every quarterback in the NFL in 2021 who threw 100 or more passes with a designated depth. To avoid mixing up multiple players who have the same name, which happens more often than you’d think, you’ll summarize by both player ID and player name.
First, group by both the `passer_id` and `passer`. Then summarize to calculate the number of plays (`n()`) and mean air yards per pass (`adot`) per player. Also, filter to include only players with 100 or more plays and to remove any rows without a passer name (specifically, those with missing or `NA` values).
With this and the previous example commands, the function `is.na(passer)` checks whether each value in the `passer` column is `NA` and returns `TRUE` for entries with an `NA` value. Appendix B covers this logic and terminology in greater detail. Next, an exclamation point (`!`) negates this expression, so that you keep rows where a value is present. As an aside, we, the authors, find the use of double negatives confusing as well. Lastly, arrange by the `adot` values and then print all (or infinity, `Inf`) rows:
```r
## R
pbp_r_p |>
  group_by(passer_id, passer) |>
  summarize(
    n = n(),
    adot = mean(air_yards)
  ) |>
  filter(n >= 100 & !is.na(passer)) |>
  arrange(-adot) |>
  print(n = Inf)
```
Resulting in:
```
# A tibble: 42 × 4
# Groups:   passer_id [42]
   passer_id  passer               n  adot
   <chr>      <chr>            <int> <dbl>
 1 00-0035704 D.Lock             110 10.2
 2 00-0029263 R.Wilson           400  9.89
 3 00-0036945 J.Fields           268  9.84
 4 00-0034796 L.Jackson          378  9.34
 5 00-0036389 J.Hurts            473  9.19
 6 00-0034855 B.Mayfield         416  8.78
 7 00-0026498 M.Stafford         740  8.51
 8 00-0031503 J.Winston          161  8.32
 9 00-0029604 K.Cousins          556  8.23
10 00-0034857 J.Allen            708  8.22
11 00-0031280 D.Carr             676  8.13
12 00-0031237 T.Bridgewater      426  8.04
13 00-0019596 T.Brady            808  7.94
14 00-0035228 K.Murray           515  7.94
15 00-0036971 T.Lawrence         598  7.91
16 00-0036972 M.Jones            557  7.90
17 00-0033077 D.Prescott         638  7.81
18 00-0036442 J.Burrow           659  7.75
19 00-0023459 A.Rodgers          556  7.73
20 00-0031800 T.Heinicke         491  7.69
21 00-0035993 T.Huntley          185  7.68
22 00-0032950 C.Wentz            516  7.64
23 00-0029701 R.Tannehill        554  7.61
24 00-0037013 Z.Wilson           382  7.57
25 00-0036355 J.Herbert          671  7.55
26 00-0033119 J.Brissett         224  7.55
27 00-0033357 T.Hill             132  7.44
28 00-0028118 T.Taylor           149  7.43
29 00-0030520 M.Glennon          164  7.38
30 00-0035710 D.Jones            360  7.34
31 00-0036898 D.Mills            392  7.32
32 00-0031345 J.Garoppolo        511  7.31
33 00-0034869 S.Darnold          405  7.26
34 00-0026143 M.Ryan             559  7.16
35 00-0032156 T.Siemian          187  7.13
36 00-0036212 T.Tagovailoa       387  7.10
37 00-0033873 P.Mahomes          780  7.08
38 00-0027973 A.Dalton           235  6.99
39 00-0027939 C.Newton           126  6.97
40 00-0022924 B.Roethlisberger   647  6.76
41 00-0033106 J.Goff             489  6.44
42 00-0034401 M.White            132  5.89
```
The `adot` value, a commonly used measure of quarterback aggressiveness, gives a quantitative approach to ranking quarterbacks by their aggression, as measured by mean air yards per pass (can you think of other ways to measure aggressiveness that pass depth alone leaves out?). Look at the results and think: do they make sense to you, or are you surprised, given your personal opinions of quarterbacks?
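As one possible answer to that question, here is a hedged sketch of an alternative measure: the share of a passer’s attempts thrown 15 or more air yards (a deep-ball rate), reusing `pbp_r_p` from above. The 15-yard threshold is our arbitrary choice, not an official definition:

```r
## R: an alternative aggressiveness measure (a sketch); the 15-yard
## threshold for a "deep" pass is an arbitrary choice
pbp_r_p |>
  group_by(passer_id, passer) |>
  summarize(n = n(),
            deep_rate = mean(air_yards >= 15)) |>
  filter(n >= 100 & !is.na(passer)) |>
  arrange(-deep_rate) |>
  print(n = Inf)
```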
nfl_data_py in Python
In Python, the `nfl_data_py` package by Cooper Adams exists as a clone of the R `nflfastR` package for data. To use the data from this package, first import the `pandas` package with the alias (or short nickname) `pd` for working with data, and import the `nfl_data_py` package as `nfl`:

```python
## Python
import pandas as pd
import nfl_data_py as nfl
```
Next, tell Python to import the data for 2021 (Chapter 2 shows how to import multiple years). Note that you need to include the year in a Python list as `[2021]`:

```python
## Python
pbp_py = nfl.import_pbp_data([2021])
```
As with the R code, filter the data in Python (pandas calls filtering a query). Python allows you to readily pass the filter criteria (`filter_crit`) into `query()` as an object, and we have you do this to save line space. Then group by `passer_id` and `passer` before aggregating the data by using a Python dictionary (`dict()`, or `{}` for short) with the `.agg()` function:
```python
## Python
filter_crit = 'play_type == "pass" & air_yards.notnull()'
pbp_py_p = (
    pbp_py.query(filter_crit)
    .groupby(["passer_id", "passer"])
    .agg({"air_yards": ["count", "mean"]})
)
```
The pandas package also requires reformatting the column names via the `list()` function and collapsing the header from two rows to a single row via `map()`. Next, filter to passers with more than 100 recorded attempts by using `query()`, sort by the mean of the air yards with `sort_values()`, and print the outputs (`to_string()` allows all the outputs to be printed):
```python
## Python
pbp_py_p.columns = list(map("_".join, pbp_py_p.columns.values))

sort_crit = "air_yards_count > 100"
print(
    pbp_py_p.query(sort_crit)
    .sort_values(by=["air_yards_mean"], ascending=[False])
    .to_string()
)
```
This results in:
```
                             air_yards_count  air_yards_mean
passer_id  passer
00-0035704 D.Lock                        110       10.154545
00-0029263 R.Wilson                      400        9.887500
00-0036945 J.Fields                      268        9.835821
00-0034796 L.Jackson                     378        9.341270
00-0036389 J.Hurts                       473        9.190275
00-0034855 B.Mayfield                    416        8.776442
00-0026498 M.Stafford                    740        8.508108
00-0031503 J.Winston                     161        8.322981
00-0029604 K.Cousins                     556        8.228417
00-0034857 J.Allen                       708        8.224576
00-0031280 D.Carr                        676        8.128698
00-0031237 T.Bridgewater                 426        8.037559
00-0019596 T.Brady                       808        7.941832
00-0035228 K.Murray                      515        7.941748
00-0036971 T.Lawrence                    598        7.913043
00-0036972 M.Jones                       557        7.901257
00-0033077 D.Prescott                    638        7.811912
00-0036442 J.Burrow                      659        7.745068
00-0023459 A.Rodgers                     556        7.730216
00-0031800 T.Heinicke                    491        7.692464
00-0035993 T.Huntley                     185        7.675676
00-0032950 C.Wentz                       516        7.641473
00-0029701 R.Tannehill                   554        7.606498
00-0037013 Z.Wilson                      382        7.565445
00-0036355 J.Herbert                     671        7.554396
00-0033119 J.Brissett                    224        7.549107
00-0033357 T.Hill                        132        7.439394
00-0028118 T.Taylor                      149        7.429530
00-0030520 M.Glennon                     164        7.378049
00-0035710 D.Jones                       360        7.344444
00-0036898 D.Mills                       392        7.318878
00-0031345 J.Garoppolo                   511        7.305284
00-0034869 S.Darnold                     405        7.259259
00-0026143 M.Ryan                        559        7.159213
00-0032156 T.Siemian                     187        7.133690
00-0036212 T.Tagovailoa                  387        7.103359
00-0033873 P.Mahomes                     780        7.075641
00-0027973 A.Dalton                      235        6.987234
00-0027939 C.Newton                      126        6.968254
00-0022924 B.Roethlisberger              647        6.761978
00-0033106 J.Goff                        489        6.441718
00-0034401 M.White                       132        5.886364
```
Hopefully, this chapter whetted your appetite for using math to examine football data. We glossed over some of the many topics you will learn about in future chapters, such as sorting, summarizing, and cleaning data. You have also had a chance to compare Python and R for basic tasks for working with data. Appendix B also dives deeper into the air-yards data to cover basic statistics and data wrangling.
Data Science Tools Used in This Chapter
This chapter covered the following topics:
- Obtaining data from one season by using the `nflfastR` package, either directly in R or via the `nfl_data_py` package in Python
- Using `filter()` in R or `query()` in Python to select and create a subset of data for analysis
- Aggregating data by groups, using `summarize()` with the help of `group_by()` in R, and `agg()` with the help of `groupby()` in Python
- Printing dataframe outputs to your screen to help you look at data
- Removing missing data by using `is.na()` in R or `notnull()` in Python
Suggested Readings
If you get really interested in analytics without the programming, here are some sources we read to develop our philosophy and strategies for football analytics:
- *The Hidden Game of Football: A Revolutionary Approach to the Game and Its Statistics* by Bob Carroll et al. (University of Chicago Press, 2023). Originally published in 1988, this cult classic introduces the numerous ideas that were later formulated into the cornerstone of what has become modern football analytics.
- *Moneyball: The Art of Winning an Unfair Game* by Michael Lewis (W.W. Norton & Company, 2003). Lewis describes the rise of analytics in baseball and shows how the stage was set for other sports. The book helps us think about how modeling and data can help guide sports. A movie was made of this book as well.
- *The Signal and the Noise: Why So Many Predictions Fail, but Some Don’t* by Nate Silver (Penguin, 2012). Silver describes why models work in some instances and fail in others. He draws upon his experience with poker, baseball analytics, and running the political prediction website FiveThirtyEight. The book does a good job of showing how to think quantitatively for big-picture problems without getting bogged down by the details.
Lastly, we encourage you to read the documentation for the `nflfastR` package. Diving into this package will help you better understand much of the data used in this book.