Tables allow us to look at individual observations or summaries, whereas graphs present the data visually, replacing numbers with graphical elements. Tables are the better choice when the actual data values need to be shown. Graphs enable us to visually identify trends, ranges, frequency distributions, relationships, and outliers, and to make comparisons. There are many ways of visualizing information in the form of a graph. This section describes some of the common graphs used in exploratory data analysis and data mining: frequency polygrams, histograms, scatterplots, and box plots. In addition, looking at multiple graphs simultaneously and viewing common subsets can offer new insights into the whole data set.
Frequency polygrams plot the number of observations reported for each value (or range of values) of a particular variable. An example of a frequency polygram is shown in Figure 4.1. In this example, a variable (Model Year) is plotted: the number of observations for each year is counted, and the counts are plotted against the year. The shape of the plot reveals trends; here, the number of observations per year fluctuates within a narrow range of around 25–40.
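The counting step behind such a plot can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce Figure 4.1: the function name `frequency_counts` and the sample values in `model_years` are hypothetical and are not taken from the data set behind the figure.

```python
from collections import Counter


def frequency_counts(values):
    """Return (value, count) pairs sorted by value.

    Each pair is one point of the plot: the value goes on the
    x-axis and the number of observations on the y-axis.
    """
    counts = Counter(values)
    return sorted(counts.items())


# Hypothetical model years, made up for illustration only.
model_years = [70, 70, 71, 72, 72, 72, 73]

for year, count in frequency_counts(model_years):
    print(year, count)
```

Joining the resulting points with straight lines gives the shape of the plot. Any plotting library could be used for that step; it is left out here so that the sketch depends only on the standard library.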
In Figure 4.2, a continuous variable (Displacement) is divided into ranges from 50 to 100, from 100 to 150, and so on. The number of observations falling in each range is plotted, and the shape indicates that most of the observations have low displacement values.
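Grouping a continuous variable into ranges before counting might look like the following sketch. It rests on assumptions the text does not settle: each range is treated as including its lower bound and excluding its upper bound (so a value of exactly 100 falls in the range from 100 to 150), and the function name `bin_counts` and the sample displacement values are hypothetical.

```python
import math


def bin_counts(values, start, width):
    """Count observations in equal-width ranges beginning at `start`.

    Assumption: each range is half-open, [lower, lower + width).
    Returns ((lower, upper), count) pairs sorted by lower bound.
    Ranges that contain no observations are omitted.
    """
    counts = {}
    for value in values:
        # Index of the range this value belongs to.
        index = math.floor((value - start) / width)
        lower = start + index * width
        counts[lower] = counts.get(lower, 0) + 1
    return [((lower, lower + width), counts[lower]) for lower in sorted(counts)]


# Hypothetical displacement values, made up for illustration only.
displacements = [68, 97, 98, 120, 140, 151, 250, 304, 350, 455]

for (lower, upper), count in bin_counts(displacements, start=50, width=50):
    print(f"{lower} to {upper}: {count}")
```

The choice of range width affects the shape of the resulting plot: wider ranges smooth out detail, while narrower ranges can leave many ranges with few or no observations.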