Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining by Glenn J. Myatt


4.2 TABLES

4.2.1 Data Tables

The most common way of looking at data is through a table, where the raw data is displayed in familiar rows of observations and columns of variables. A table is essential for reviewing the raw data; however, it can be overwhelming with more than a handful of observations or variables. Sorting the table on one or more variables is useful for organizing the data. Even so, it is virtually impossible to identify trends or relationships by looking at the raw data alone. An example of a table describing different cars is shown in Table 4.1.
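As a minimal sketch of sorting a data table on more than one variable, the Python fragment below orders a handful of car records first by number of cylinders and then by fuel economy. The records, the column names (`name`, `cylinders`, `mpg`), and the values are hypothetical and are not taken from Table 4.1.

```python
# Hypothetical car records; names and values are illustrative only,
# not the contents of Table 4.1.
cars = [
    {"name": "Car A", "cylinders": 8, "mpg": 18.0},
    {"name": "Car B", "cylinders": 4, "mpg": 31.5},
    {"name": "Car C", "cylinders": 6, "mpg": 22.0},
    {"name": "Car D", "cylinders": 4, "mpg": 27.0},
]

# Sort by cylinders (ascending), then by mpg (descending) within each group.
sorted_cars = sorted(cars, key=lambda row: (row["cylinders"], -row["mpg"]))

# Print the sorted table, one observation per row.
print(f'{"name":<6} {"cyl":>3} {"mpg":>5}')
for row in sorted_cars:
    print(f'{row["name"]:<6} {row["cylinders"]:>3} {row["mpg"]:>5.1f}')
```

Sorting groups similar observations together, which makes the table easier to scan, although it does not by itself summarize any relationship between the variables.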

4.2.2 Contingency Tables

Contingency tables (also referred to as two-way cross-classification tables) provide insight into the relationship between two variables. Both variables must be categorical (dichotomous or discrete) or must first be transformed into categorical variables. The variables are often dichotomous; however, a contingency table can also represent variables with more than two values. Table 4.2 describes the format for a contingency table where two variables are compared: Variable x and Variable y.
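The following Python sketch illustrates one way to build such a table: a continuous variable is first transformed into a dichotomous one, and the observations falling into each combination of values are then counted. The records, the `mpg_class` helper, and the cut-off of 25.0 are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical records of (mpg, origin); illustrative values only.
records = [
    (18.0, "domestic"),
    (31.5, "import"),
    (22.0, "domestic"),
    (27.0, "import"),
    (29.0, "domestic"),
    (16.5, "domestic"),
]

def mpg_class(mpg, threshold=25.0):
    """Dichotomize a continuous value; the 25.0 cut-off is an arbitrary assumption."""
    return "high" if mpg >= threshold else "low"

# Count the observations for each (origin, mpg class) combination.
counts = Counter((origin, mpg_class(mpg)) for mpg, origin in records)

row_values = sorted({origin for _, origin in records})
col_values = ["low", "high"]

# Print the contingency table: one row per origin, one column per mpg class.
print(f'{"":<10}' + "".join(f"{c:>6}" for c in col_values))
for r in row_values:
    print(f"{r:<10}" + "".join(f"{counts[(r, c)]:>6}" for c in col_values))
```

A `Counter` returns zero for combinations that never occur, so empty cells appear as 0 rather than raising an error.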

Table 4.1. Table of car records

  • Count+1: the number of observations where Variable x has “Value 1”, irrespective of the value of Variable y.
  • Count+2: the number of observations where Variable x has “Value 2”, irrespective of the value of Variable y.
  • Count1+: the number of observations where Variable y has “Value 1”, irrespective of the ...
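The totals described in the list above can be sketched in Python as follows. The observations are hypothetical, and the variable names (`count_plus_1`, `count_plus_2`, `count_1_plus`) are simply stand-ins for the Count+1, Count+2, and Count1+ notation.

```python
from collections import Counter

# Hypothetical observations as (Variable x, Variable y) pairs; illustrative only.
observations = [
    ("Value 1", "Value 1"),
    ("Value 1", "Value 2"),
    ("Value 2", "Value 1"),
    ("Value 1", "Value 1"),
    ("Value 2", "Value 2"),
    ("Value 1", "Value 2"),
    ("Value 2", "Value 2"),
]

# Totals for Variable x, irrespective of the value of Variable y.
x_totals = Counter(x for x, _ in observations)
# Totals for Variable y, irrespective of the value of Variable x.
y_totals = Counter(y for _, y in observations)

count_plus_1 = x_totals["Value 1"]  # Count+1
count_plus_2 = x_totals["Value 2"]  # Count+2
count_1_plus = y_totals["Value 1"]  # Count1+

print("Count+1 =", count_plus_1)
print("Count+2 =", count_plus_2)
print("Count1+ =", count_1_plus)
```

Because every observation has exactly one value for Variable x, the totals for Variable x sum to the overall number of observations.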
