Chapter 1. Categorical Analysis
Categorical analysis is the foundation of data visualization. It is the first type of data visualization that data analysts use, and the one they use most often. Categorical analysis takes a dimension (for example, [Regions]) and compares its members by a measure (for example, [Sales]). A dimension is typically a categorical value. Dimensions are not aggregated; they are typically used to create data headers or to generate filters. A measure is a (usually numerical) value that can be aggregated using mathematical functions (like sum, average, or median). Measures create unbroken axes, which extend from one end of a range to the other.
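To make the dimension and measure distinction concrete, here is a minimal sketch in plain Python rather than in a visualization tool. The rows, the region names, and the helper `aggregate_by_dimension` are hypothetical and exist only for illustration: the dimension value groups the rows, and the measure is aggregated within each group.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows: each record pairs a dimension value (a region)
# with a measure value (a sales amount).
rows = [
    ("East", 120.0),
    ("West", 200.0),
    ("East", 80.0),
    ("South", 50.0),
    ("West", 100.0),
]


def aggregate_by_dimension(records, func=sum):
    """Group measure values by dimension value, then aggregate each group."""
    groups = defaultdict(list)
    for dimension_value, measure_value in records:
        # The dimension is never aggregated; it only decides the group.
        groups[dimension_value].append(measure_value)
    # The measure is aggregated once per group with the chosen function.
    return {key: func(values) for key, values in groups.items()}


sales_sum = aggregate_by_dimension(rows, sum)
sales_avg = aggregate_by_dimension(rows, mean)
sales_median = aggregate_by_dimension(rows, median)

print(sales_sum)
print(sales_avg)
print(sales_median)
```

Swapping the aggregation function changes the number reported for each region, while the set of regions stays the same.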
This type of analysis aids in answering common business questions such as these:
- How does A compare to B?
- How is X measure distributed across Y categories?
- How much do A, B, and C contribute to the total?
- How does X measure change over time (where time is the dimension)?
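The contribution question in the list above comes down to simple arithmetic: divide each category's aggregated measure by the grand total. The following is a rough sketch under assumed inputs; the category totals and the function name `percent_of_total` are hypothetical.

```python
def percent_of_total(totals):
    """Return each category's share of the grand total, in percent."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when every category is empty.
        return {category: 0.0 for category in totals}
    return {
        category: 100 * value / grand_total
        for category, value in totals.items()
    }


# Hypothetical aggregated measure for three categories.
totals = {"A": 50.0, "B": 30.0, "C": 20.0}
shares = percent_of_total(totals)

for category, share in shares.items():
    print(f"{category}: {share:.1f}% of total")
```

The same aggregated totals also answer the comparison question (A versus B); only the final arithmetic differs.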
Categorical analysis is usually presented as bar charts. Bar charts use height or length as visual encoding to express a measure. Visual encoding refers to the techniques used to display data in charts; Figure 1-1 shows some examples. Encoding data as bars is effective because people can quickly judge the variation in the size of the bars; bar charts are also easy to understand and label.
Figure 1-1. This illustration shows the various ways that data can ...
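Length encoding can be illustrated without any charting software. The sketch below draws a text bar chart in which each bar's length is proportional to its measure. The function `text_bar_chart`, its parameters, and the regional totals are hypothetical, and the sketch assumes a non-empty set of non-negative values.

```python
def text_bar_chart(totals, width=40, mark="#"):
    """Render one text bar per category, longest bar first.

    Assumes `totals` is non-empty and holds non-negative values.
    """
    largest = max(totals.values())
    label_width = max(len(label) for label in totals)
    lines = []
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for label, value in ordered:
        # Length is the visual encoding: scale each value against the largest.
        bar_length = round(value / largest * width) if largest > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {mark * bar_length} {value:g}")
    return lines


# Hypothetical aggregated sales per region.
chart = text_bar_chart({"East": 200, "West": 300, "South": 50}, width=30)
print("\n".join(chart))
```

Sorting the bars by value is a design choice made here for readability; it makes the ranking visible before any label is read.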