Chapter 20. Redundant Coding

In Chapter 19, we saw that color cannot always convey information as effectively as we might wish. If we have many different items we want to identify, doing so by color may not work. It will be difficult to match the colors in the plot to the colors in the legend (Figure 19-1). And even if we only need to distinguish two or three different items, color may fail if the colored items are very small (Figure 19-11) and/or the colors look similar for people suffering from color-vision deficiency (Figures 19-7 and 19-8). The general solution in all these scenarios is to use color to enhance the visual appearance of the figure without relying entirely on color to convey key information. I refer to this design principle as redundant coding, because it prompts us to encode data redundantly, using multiple different aesthetic dimensions.
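To make the principle concrete, here is a minimal sketch of how one might assign each group two aesthetics at once, a color and a point shape, so that the shape still identifies the group if the colors are hard to tell apart. The function name `redundant_styles`, the palette (Okabe-Ito hex codes), the marker codes (Matplotlib-style), and the species labels are all assumptions chosen for illustration, not taken from the text.

```python
# Assumed palette: a subset of the Okabe-Ito colors, used here for illustration.
COLORS = ["#E69F00", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7"]
# Assumed shapes: Matplotlib-style marker codes
# (circle, square, triangle up, diamond, triangle down, filled plus).
MARKERS = ["o", "s", "^", "D", "v", "P"]


def redundant_styles(groups):
    """Give every distinct group its own color AND its own marker.

    Hypothetical helper: the group is encoded twice, so a reader who
    cannot distinguish two colors can still rely on the shapes.
    """
    unique = list(dict.fromkeys(groups))  # distinct groups, first-seen order
    limit = min(len(COLORS), len(MARKERS))
    if len(unique) > limit:
        # Reusing a color or a shape would defeat the redundancy.
        raise ValueError(f"at most {limit} groups are supported")
    return {
        group: {"color": color, "marker": marker}
        for group, color, marker in zip(unique, COLORS, MARKERS)
    }


# Illustrative labels; repeated labels map to the same style.
styles = redundant_styles(["setosa", "versicolor", "virginica", "setosa"])
for group, style in styles.items():
    print(group, style["color"], style["marker"])
```

Each entry of `styles` could then be passed to a plotting call that accepts both a color and a marker argument, for example one scatter call per group. Raising an error rather than cycling through the lists is a deliberate choice in this sketch: once either aesthetic repeats, it no longer identifies a group on its own.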

Designing Legends with Redundant Coding

Scatterplots of several groups of data are frequently designed such that the points representing different groups differ only in their color. As an example, consider Figure 20-1, which shows the sepal width versus the sepal length of three different Iris species. (Sepals are the outer leaves of flowers in flowering plants.) The points representing the different species differ in their colors, but otherwise all points look exactly the same. Even though this figure contains only three distinct groups of points, it is difficult to read even for people with normal color vision. The problem ...
