Chapter 23. Balance the Data and the Context

We can broadly subdivide the graphical elements in any visualization into elements that represent data and elements that do not. The former are elements such as the points in a scatterplot, the bars in a histogram or bar plot, or the shaded areas in a heatmap. The latter are elements such as plot axes, axis ticks and labels, axis titles, legends, and plot annotations. These non-data elements generally provide context for the data, visual structure for the plot, or both. When designing a plot, it can be helpful to think about the amount of ink (Chapter 17) used to represent the data and the amount used to represent the context. A common recommendation is to reduce the amount of non-data ink, and following this advice often yields less cluttered and more elegant visualizations. At the same time, context and visual structure are important, and stripping away too many of the plot elements that provide them can result in figures that are difficult to read, confusing, or simply not that compelling.

Providing the Appropriate Amount of Context

The idea that distinguishing between data and non-data ink may be useful was popularized by Edward Tufte in his book The Visual Display of Quantitative Information [Tufte 2001]. Tufte introduces the concept of the “data–ink ratio,” which he defines as the “proportion of a graphic’s ink devoted to the non-redundant display of data information.” He then writes (emphasis mine):

Maximize the data–ink ratio, within reason.
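
Read as a formula, the definition quoted above is a simple ratio. The following is a sketch of that reading; the typesetting and the wording of the terms are my own paraphrase, not a quotation from Tufte, and "ink" here is an informal quantity rather than something the text tells us how to measure.

```latex
\[
\text{data--ink ratio}
  = \frac{\text{ink devoted to the non-redundant display of data}}
         {\text{total ink used in the graphic}}
  = 1 - \frac{\text{ink that can be removed without losing data information}}
             {\text{total ink used in the graphic}}
\]
```

On this reading the ratio lies between 0 and 1, and removing non-data ink such as heavy grid lines or frames moves it toward 1. The formula alone says nothing about how much context a reader needs, which is why the qualifier in the quoted sentence matters.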

I have emphasized the phrase “within ...
