2 Observing Information

Our journey into data-driven prediction begins with some basic ideas. In this chapter, we set forth principles which may at first seem obvious, but which, upon deeper inspection, have profound implications. These ideas lay the foundation for everything that follows.

Observing Information Conceptually

Whenever we approach a new dataset, the first order of business is to get our bearings. We have before us a series of observations, each of which is described by a set of attributes. The observations could be of people, described by attributes like age, health, education, salary, and place of residence. They could be times at bat for a major league baseball player, with attributes of runs batted in, home runs, walks, strikeouts, weather conditions, and where the game took place. Or the observations could be periods of economic performance measured by attributes such as growth in output, inflation, interest rates, unemployment, stock market returns, and perhaps the political parties in power at the time. What matters is that we have a set of observations characterized by a consistent collection of attributes. A conventional statistical approach would have us focus on these attributes and refer to them as variables, but as we stated earlier, we ask that you indulge us as we focus mainly on how we observe these attributes.

We begin by summarizing the observations as averages. Throughout this book we will compute many averages. We use the average as a device ...
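To make the layout of observations and attributes concrete, here is a minimal sketch in Python. It assumes a toy dataset of three economic periods with three attributes. The attribute names, the values, and the helper `attribute_averages` are hypothetical, chosen only for illustration, and are not taken from the book.

```python
from statistics import mean

# Hypothetical toy data: each dictionary is one observation (a period of
# economic performance) and each key is an attribute. Values are made up.
observations = [
    {"growth": 2.0, "inflation": 3.0, "unemployment": 5.0},
    {"growth": 1.0, "inflation": 4.0, "unemployment": 6.0},
    {"growth": 3.0, "inflation": 2.0, "unemployment": 4.0},
]


def attribute_averages(rows):
    """Return the arithmetic average of each attribute across all observations.

    Assumes every observation carries the same set of attributes, which is the
    "consistent collection of attributes" the text asks for.
    """
    attributes = rows[0].keys()
    return {name: mean(row[name] for row in rows) for name in attributes}


averages = attribute_averages(observations)
print(averages)
```

Each observation keeps its identity as a row, and the averages describe the typical value of each attribute across those rows.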
