Hacks 8–23: Introduction
This chapter explains where to get information about baseball games and baseball players and tells you how to store this information for easy lookup. These hacks explain how to find or make box scores, player statistics, and even play-by-play data. You can get data about games from 1871 through last season. Several different groups of baseball fans have worked hard to compile and digitize this information, making it easy for any baseball fan to look up scores and stats, plan their fantasy team, or research baseball.
We use three primary types of data in this book. Here’s a short explanation of each one and of how they fit together:
- Play-by-play
The most detailed data we have is “play-by-play” or “event” data. The event files include information about every play in a game: every at bat, every stolen base, and sometimes every pitch.
- Game logs
Game logs include a summary of each game: playing conditions, scores, and starting players.
- Player and team statistics
Player statistics files contain a record for each player on each team in each season. These files include offensive statistics like hits, home runs, and stolen bases; pitching statistics like batters faced, strikeouts, and earned runs; and fielding statistics like putouts, assists, and errors. Team statistics summarize the same information for each team.
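To make the relationship among these three levels concrete, here is a minimal Python sketch that rolls a handful of play-by-play events up into a game summary and into per-player totals. The record layout, field names, event codes, and the player and team labels are all hypothetical, chosen only for illustration. They are not the format of any real event file, and real files carry far more detail.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified play-by-play records (NOT a real event-file format).
# Each event names the game, the batting team, the player involved,
# an event code, and the runs that scored on the play.
EVENTS = [
    {"game": "G1", "team": "HOME", "player": "p1", "event": "HR", "runs": 2},
    {"game": "G1", "team": "HOME", "player": "p2", "event": "1B", "runs": 0},
    {"game": "G1", "team": "HOME", "player": "p2", "event": "SB", "runs": 0},
    {"game": "G1", "team": "AWAY", "player": "p3", "event": "2B", "runs": 1},
    {"game": "G1", "team": "AWAY", "player": "p3", "event": "K", "runs": 0},
]

# Assumed event codes that count as hits.
HIT_EVENTS = {"1B", "2B", "3B", "HR"}


def game_log(events):
    """Collapse events into one score summary per game: {game: {team: runs}}."""
    scores = defaultdict(Counter)
    for e in events:
        scores[e["game"]][e["team"]] += e["runs"]
    return {game: dict(teams) for game, teams in scores.items()}


def player_stats(events):
    """Collapse events into offensive totals per (player, team)."""
    stats = defaultdict(Counter)
    for e in events:
        line = stats[(e["player"], e["team"])]
        if e["event"] in HIT_EVENTS:
            line["H"] += 1
        if e["event"] == "HR":
            line["HR"] += 1
        if e["event"] == "SB":
            line["SB"] += 1
    return {key: dict(line) for key, line in stats.items()}


if __name__ == "__main__":
    print(game_log(EVENTS))
    print(player_stats(EVENTS))
```

The point of the sketch is the direction of the aggregation: the most detailed data can always be summarized into the coarser forms, but a summary cannot be expanded back into individual plays.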
When you have detailed play-by-play descriptions for each game, you can derive game logs from those descriptions. (See “Make Box Scores or Database Tables from Play-by-Play ...