Hacks 1–7: Introduction

More than most other sports, baseball is a game of numbers. Everything that happens can be measured, and people try to measure everything that can happen. They’ve done this for a long, long time. If you are so inclined, you can find detailed descriptions of games played when Ulysses S. Grant was president of the United States. It is the sense of endless possibilities for exploration and discovery in this data, along with the sheer love of the game, that drives passionate fans to devote countless hours to the smallest details of baseball.

Many fans love the history of baseball and want to compare the top players of today to the top players of the past. Some fans like to look up specific games from many years ago, maybe looking for the play-by-play for Sandy Koufax’s perfect game or the box score for the first game their dad took them to. Others wonder about how the game works, wanting to know if clutch hitters really exist, if sacrifice bunts are a good strategy, or if platooning batters is a good idea. Another group of fans just wants to know how their favorite players are doing this year, or they want to pick a winning fantasy team for next year.

This book will help you do these things, and it will tell you where to find the answers to questions like these. It will tell you where to find stats on past games, from wins and losses by teams all the way down to specific pitches. Many diligent fans from places like Retrosheet and Baseball DataBank work hard to collect historical data, and they make it available free of charge. This book shows you how to find and use their data, and how to analyze it with powerful and—thanks to the open source software movement—free tools that would have cost thousands of dollars just a few years ago.

Before diving headfirst into baseball statistics, databases, and programming, this chapter starts with the fundamentals: the rules of the game and of scoring a baseball game. I’ll also give a few tips for easily getting baseball information off the Internet.

Baseball 101

I think that most readers of this book will know the rules of baseball. Many readers will know them better than I do. But it didn’t seem right to dive into databases, formulas, and statistics without explaining the basics of the game. If you’re not familiar with baseball or you want a quick refresher, keep reading. If you know the rules, feel free to skip to the first hack.

Here’s a short description of how baseball works. At the game level, there are two competing teams. Each team has 25 players on its roster during most of the season, and 40 players in September. During the game, only nine players can be on the field at any time. In the National League, the same nine players bat and play defense. In the American League, the pitcher does not bat; he is replaced by a 10th player, called a designated hitter.

The game is divided into nine innings. During each inning, each team bats until it makes three outs. The away team always bats first and the home team bats second. If the score is tied at the end of the ninth inning, the game continues until one team holds the lead at the end of a complete inning. Incidentally, if the home team takes the lead in the bottom of the ninth inning or any later inning, the game ends immediately, because the away team has no further chance to bat. (Umpires can call games because of weather, mechanical difficulties, or other reasons; see the official rules for more about what can happen here.)
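The end-of-game rule above can be sketched as a small function. This is only an illustration under simplifying assumptions: the function name, its parameters, and the "top"/"bottom" labels are my own, and it ignores games called by the umpires. It also encodes one standard rule the text does not spell out, namely that the home team does not bat in the ninth or later if it is already ahead when the away team finishes batting.

```python
def game_is_over(inning, half, half_complete, away_runs, home_runs,
                 regulation_innings=9):
    """Return True if the game has ended (hypothetical helper, simplified rules).

    inning         -- current inning number, starting at 1
    half           -- "top" (away team batting) or "bottom" (home team batting)
    half_complete  -- True once three outs have been made in this half
    """
    if half not in ("top", "bottom"):
        raise ValueError("half must be 'top' or 'bottom'")

    # Nothing can end the game before the last scheduled inning
    # (weather and similar stoppages are ignored in this sketch).
    if inning < regulation_innings:
        return False

    if half == "top":
        # Assumption stated in the lead-in: the home team skips its
        # half if it already leads once the away team is done batting.
        return half_complete and home_runs > away_runs

    # Bottom half: the home team wins the moment it is ahead.
    if home_runs > away_runs:
        return True

    # Otherwise the away team wins only if it still leads after three outs;
    # a tie sends the game to another inning.
    return half_complete and away_runs > home_runs
```

For example, game_is_over(9, "bottom", False, 2, 3) is true because the home team has just taken the lead, while game_is_over(9, "bottom", True, 2, 2) is false because a tied game goes to extra innings.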

Each player bats in a specific order, cycling through players until the end of the game. The manager can choose the batting order before each game but cannot change the order after it has been set. A manager can make substitutions during the game, changing batters, runners, or fielders. However, once a player is removed from the game, he cannot reenter the game.

Each at bat is a little game in itself. The pitcher throws the ball to the batter, and the batter attempts to put the ball into play. The pitcher tries to throw the ball into the batter’s strike zone. The strike zone is a region over home plate, as wide as the plate and extending from the batter’s knees to the letters on his jersey. A ball just needs to touch an edge of this zone to be considered a strike. If the pitch is thrown within the strike zone and the batter does not swing at the pitch, this is a called strike. If the batter swings and misses, this is a swinging strike, even if the pitch is not thrown within the strike zone. Finally, if the pitch is thrown outside the strike zone and the batter does not swing, this is called a ball. Each time a batter comes to the plate, he is allowed three strikes before he is out, and the pitcher is allowed four balls before the batter is awarded first base.

If the batter makes contact with the ball, it must land within fair territory, or it is considered a foul ball. Fair territory extends from home plate down the third base line to a foul pole approximately 300 feet away (though usually farther), and down the first base line to a foul pole at a similar distance. If the batter has no strikes or one strike, a foul ball counts as a strike. If the batter already has two strikes, a foul ball does not add to the count.
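The ball-and-strike rules above amount to a small counting procedure, which can be sketched as follows. The function name and the outcome labels ("ball", "called_strike", "swinging_strike", "foul", "in_play") are my own, chosen only for illustration. The sketch assumes that four balls send the batter to first base (a walk), and it deliberately leaves out special cases such as foul bunts with two strikes and batters hit by a pitch.

```python
def plate_appearance(pitches):
    """Track the count over a sequence of pitch outcomes (simplified sketch).

    Returns a tuple (result, balls, strikes), where result is one of
    "walk", "strikeout", "in_play", or "unfinished".
    """
    balls = 0
    strikes = 0
    for pitch in pitches:
        if pitch == "ball":
            balls += 1
            if balls == 4:
                return "walk", balls, strikes
        elif pitch in ("called_strike", "swinging_strike"):
            strikes += 1
            if strikes == 3:
                return "strikeout", balls, strikes
        elif pitch == "foul":
            # A foul counts as a strike only with fewer than two strikes.
            if strikes < 2:
                strikes += 1
        elif pitch == "in_play":
            # The ball was hit into fair territory; the count no longer matters.
            return "in_play", balls, strikes
        else:
            raise ValueError(f"unknown pitch outcome: {pitch!r}")
    # The sequence ended before the plate appearance was resolved.
    return "unfinished", balls, strikes
```

A batter who takes a called strike, fouls off three pitches, and then swings and misses ends up with the result ("strikeout", 0, 3): the first foul raises the count to two strikes, and the next two fouls leave it there.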

The defensive team can put out a player in several ways. First, on two strikes, the batter can strike out on a called strike or a swinging strike, giving the catcher a putout. Second, a fielder can catch a batted ball in the air (before it hits the ground), in fair or foul territory. Third, a defensive player can tag a runner with the ball. Finally, there are force plays. The batter is required to run to first base when a ball is put into play. If a base runner is on first base, this player is required to run as well. (In turn, a runner on second or third base is also required to run if every base behind him is occupied.) In these situations, a defensive player holding the ball can simply touch the base before the runner reaches it to put out that player.
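To make the force-play chain concrete, here is a minimal sketch that lists the bases where a force out is available at the moment the ball is put into play. The function name and the numbering of bases (1, 2, 3, with 4 standing for home plate) are my own conventions. The sketch assumes the ball has hit the ground, and it does not model how a force is removed once a trailing runner has been put out.

```python
def forced_bases(occupied):
    """Return the bases where a force play is available (simplified sketch).

    occupied -- set of bases holding a runner, using 1, 2, and 3
    The result uses the same numbering, with 4 meaning home plate.
    """
    forced = [1]  # the batter must always run to first base
    base = 1
    # A runner is forced only while every base behind him is occupied,
    # so walk forward from first base until the chain is broken.
    while base <= 3 and base in occupied:
        forced.append(base + 1)
        base += 1
    return forced
```

With runners on first and third, forced_bases({1, 3}) gives [1, 2]: the runner on first must go to second, but the runner on third may stay put because second base is empty. With the bases loaded, forced_bases({1, 2, 3}) gives [1, 2, 3, 4], so there is a force at every base including home.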

A base runner scores by running around all four bases in order (first, second, third, and then home). Base runners can try to take bases whenever the ball is considered “alive.” When there is a putout from catching a ball, the base runners must return to their respective bases before attempting to advance. Under certain circumstances, base runners can move forward a certain number of bases. When the ball is hit beyond the outfield fence (between the foul poles), it’s a home run, and all base runners can advance to home plate (and score). When the ball bounces in fair territory and then out of the park, it’s considered a ground rule double, and base runners are allowed to advance two bases. (Players can advance under some other circumstances; see the official rules for more information.)

If you want to know more, you can find the official rules of Major League Baseball on the Web at http://mlb.mlb.com/NASApp/mlb/mlb/official_info/official_rules/foreword.jsp. Appendix B lists many of the common abbreviations for statistics.
