Chapter 4. Visualizing Data Differently

If you choose to just use tables, bar charts, and line charts, you will be able to fulfill most data communication needs. However, by using only these basic forms of communicating with data, you may restrict your analysis and risk boring your audience.

Using alternate chart types can help you find different messages in the data. Using two measures on a chart instead of one can show relationships you would not see otherwise. Comparing one metric directly to another means that you don’t have to look at two separate charts and form the analysis in your head. And showing the individual data points, instead of aggregating values to show a summary metric, can uncover new trends in the data.

This chapter looks at some alternate charts and ways to use them.

Chart Types: Scatterplots

I’m going to have to mention this at the outset: I love scatterplots. There, I’ve said it. Of course, I’ll give you an unbiased opinion, but I will also share why I think they are so powerful.

I love scatterplots because of their flexibility; they can cover several use cases. Many people also find them easy to interpret. The combination of multiple metrics is useful for analysis. Finally, scatterplots allow you to combine hundreds, if not thousands, of data points on a single chart, which can uncover stories in the data that might be lost if you filtered the data to fit on a single page. (Color can help here, highlighting the key data points.)

With so many options, let’s ensure that you understand the fundamental building blocks of scatterplots.

How to Read Scatterplots

You can add a lot of detail to a scatterplot, but that doesn’t mean you should. Too much detail can make the chart difficult to read.

We’ll begin by looking at a simple scatterplot from our bike shop, Allchains. This scatterplot compares the sales value to profit for each of our bike types (Figure 4-1).

Let’s explore the elements of a scatterplot: multiple axes, plots, color, and shapes. We have lots of choices to make within each one.

Multiple axes

Scatterplots have two axes, rather than the single axis we have seen on charts thus far (Figure 4-2). This is useful when you want to directly compare two metrics.

The axes create a 2D position against which you can compare the data point. By plotting multiple points, you will be able to find and analyze patterns among them. Also, the measure forming the x-axis should be the independent variable: the measure that is not reliant on or driven by the measure on the y-axis. The y-axis’s measure is therefore described as the dependent variable. In Figure 4-2, the sales value is plotted on the x-axis, since without any sales, no profit could be generated: profit is dependent on sales.

The patterns created by these plots are classified as correlation patterns (Figure 4-3). You may have heard of the false cause fallacy, or “correlation doesn’t equal causation.” It means that just because you find a strong correlation between two factors in your data, you can’t assume that one factor is causing the other.

In this example, Allchains sells more bike helmets on sunny days. Can we assume that sunny days cause more sales of bike gear? Not necessarily. Personally, I ride my bike a lot more on sunnier days than on rainy ones—and most of those sunny days occur in summer. If more helmets are sold on sunny days, it’s probably due to the overall warmer seasonal weather of summer, not the sunshine itself. After all, winter days can be sunny and icy at the same time, but I’m not going riding on those days!

Correlations can be grouped into numerous types; the main terms you will come across are positive and negative correlations and strong and weak correlations. In a positive correlation, as the measure forming the x-axis increases, so will the measure on the y-axis (Figure 4-4). We can demonstrate these with a trend line on our scatterplot. In Figure 4-4, I’ve used orange to make the trend line really pop.

If the dependent variable decreases as the independent variable increases, you have a negative correlation (Figure 4-5). For example, if X is the number of times Allchains provides maintenance services to bikes, and Y is the number of mechanical breakdowns our customers have in the following year, then Y falls as X rises.

However, just being aware of the direction of the correlation isn’t enough. How much attention you should pay to the relationship you have found depends on the strength of the relationship between the variables. A strong correlation means the data points are tightly packed around the trend line (Figure 4-6). The less distance between the data points and the line, the stronger the relationship is.

The farther the data points are from the trend line, the weaker the relationship is (Figure 4-7).

Not every scatterplot will show a correlation. If no relationship exists between the measure on the x-axis and the measure on the y-axis, the scatterplot has no correlation. That might look something like Figure 4-8.
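If you want to put numbers on the direction and strength described above, one common approach is a least-squares trend line together with Pearson's correlation coefficient. Neither is named in this chapter, so treat the following as a minimal sketch under that assumption; the sales and profit values are made up for illustration.

```python
from math import sqrt

def trend_and_strength(xs, ys):
    """Return (slope, intercept, r) for paired x and y values.

    Assumes at least two points and that neither series is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope of the least-squares trend line
    intercept = mean_y - slope * mean_x  # where the trend line crosses x = 0
    r = sxy / sqrt(sxx * syy)            # Pearson correlation, between -1 and 1
    return slope, intercept, r

# Made-up sales (x) and profit (y) values, for illustration only.
sales = [100, 200, 300, 400, 500]
profit = [12, 19, 33, 38, 52]

slope, intercept, r = trend_and_strength(sales, profit)
direction = "positive" if slope > 0 else "negative" if slope < 0 else "none"
print(f"slope={slope:.3f}, r={r:.3f}, direction={direction}")
```

The sign of the slope gives the direction of the correlation. The closer r is to 1 or -1, the more tightly the points pack around the trend line; values near 0 suggest no linear relationship.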

Whether you draw the trend line or not, showing the patterns in scatterplots can be easier than explaining the relationship through words or other chart choices. Once you see the pattern in the data, it also becomes easier to spot the outliers, the data points that don’t fit the pattern you’ve established. Investigating outliers can reveal issues in your organization that wouldn’t be apparent otherwise.

Plots

The superstars of the scatterplot are the actual data points. A plot, or a point on the scatterplot, represents two values, one from the measure forming the x-axis and one from the measure forming the y-axis: (x, y).

When you have too few data points, as in Figure 4-1, drawing anything useful from the chart can be hard. The converse is overplotting: having so many data points makes it difficult to see what the chart is showing. Figure 4-9 is an example: it shows sales value and profit data from about 800 bike sales.

An example of overplotting on a scatterplot

Can you identify 800 distinct plots here? I can’t. Many of the plots are right on top of each other. Showing every plot individually works when only a few plots overlap. In Figure 4-9, though, the darkly shaded area is an amorphous mass of indistinguishable plots. This chart is not completely useless, however, since it shows the outliers.

If the question you are trying to answer requires individual data points, like analyzing all students in a school, you can adjust the chart style to help. By increasing the transparency of the plots, you can see where the overlapping points exist more clearly. In Figure 4-10, I’ve reduced the same plots to 30% of their original opacity.
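To see why transparency reveals overlap, it can help to work out how dark a stack of semi-transparent plots becomes. The sketch below assumes your charting tool uses standard alpha compositing, where each layer covers a fraction of whatever shows through from beneath it; the function name is hypothetical, and your tool may blend differently.

```python
def stacked_opacity(alpha, n):
    """Combined opacity where n marks, each with opacity `alpha`, overlap.

    Assumes standard alpha compositing: each mark lets (1 - alpha) of the
    layers beneath it show through.
    """
    return 1 - (1 - alpha) ** n

# How dense does an area look at 30% opacity as more plots pile up?
for n in (1, 2, 5, 10):
    print(n, round(stacked_opacity(0.3, n), 3))
```

A single plot stays faint, while areas with many overlapping plots approach full opacity, which is what lets the dense regions stand out.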

Increasing the transparency of the plots

Another technique to break up the amorphous blob is to add borders to the plots, to show the number of data points at least on the surface. In Figure 4-11, I have used a light-gray border to make the individual points “pop” off the page when they overlap.

Sometimes it’s difficult to get everything you need into a single, static chart. We’ll explore this in Chapter 7 when we look at using multiple charts to show various aspects of the data rather than trying to squeeze all of them onto just one.

Color

One thing you may have noticed about our scatterplots so far is that it is difficult to see which point relates to what categorical value. The plots are often categorical values, like the headers on bar charts. Figure 4-12 adds color and a color legend (the small reference on the side of the chart that explains what each color represents).

Be careful not to overuse color on scatterplots: your audience probably won’t remember what each of 20 colors represents, and forcing them to look back and forth to the legend too much adds more cognitive effort to understand your communication. As discussed in Chapter 1, one of our focuses is to reduce the cognitive effort required to understand the message you are sharing.

Most cultures already associate many meanings with colors, and you can use this to your advantage. If you use colors in ways that are already linked to familiar concepts, the audience will need to refer to the legend a lot less. If, for example, you are visualizing the sales of fruits and vegetables for a grocery store, using the hues related to the foods—such as red for strawberries and yellow for bananas—will make it easier to read. Using red for bananas and yellow for strawberries, on the other hand, would add to the cognitive load. Similarly, you might use black and red to indicate profit and loss, since “in the red” is a common idiom for loss-making companies, and “in the black” describes profitable ones. Wherever you can use the consumer’s awareness of such factors, do so: it reduces the cognitive load. The term for this is using your audience’s psychological schema.¹

In Figure 4-12, I’ve intentionally used colors that look like mud for mountain bikes, stone for gravel bikes, and gray for road bikes. Using individual colors like this to represent categories is known as a categorical color palette.

If your plots represent an ordinal data field, you may wish to use a sequential color palette. This uses grades of shading of a single color, from light to dark, to represent a sequence of values (such as low to high or early to late). With 16 data points in Figure 4-13, it would be difficult to see whether later quarters have had higher sales and profits than earlier quarters. With a sequential color palette to indicate when in the year the sale occurred, it is at least possible to draw some conclusions from this chart. In this case, plots of higher sales and profits are all darker blues, showing they happened more recently.
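A sequential palette can be generated by blending between a light and a dark shade of one hue. The following is a minimal sketch: the two blue RGB values and the 2017 to 2020 quarter labels are assumptions chosen for illustration, not the exact colors or data behind Figure 4-13.

```python
def sequential_color(position, count, light=(222, 235, 247), dark=(8, 48, 107)):
    """Hex color for item `position` (0-based) out of `count` ordered items.

    Blends linearly from `light` (earliest) to `dark` (latest).
    """
    t = position / (count - 1) if count > 1 else 0.0
    rgb = tuple(round(l + (d - l) * t) for l, d in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Sixteen ordered quarters (assumed labels): earlier is lighter, later is darker.
quarters = [f"{year} Q{q}" for year in (2017, 2018, 2019, 2020) for q in (1, 2, 3, 4)]
palette = {label: sequential_color(i, len(quarters)) for i, label in enumerate(quarters)}

print(palette["2017 Q1"], palette["2020 Q4"])
```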

Another palette type you can use is a diverging color palette, which uses two colors to represent values that cross above or below a certain threshold, such as zero or a target. One color could represent underperformance, and another color could represent overperformance.

Finally, you can use color to make certain points stand out among all the others. In Figure 4-14, I have highlighted my own purchases at Allchains amid those of hundreds of other customers.

This is a simple technique that shares the message without losing the context of all other customers’ behavior. Chapter 7 covers more about color.

Shapes

The plots on your scatterplot don’t have to be circles. You can use shapes to represent categories, as shown in Figure 4-15.

Shape scatterplots are particularly useful for ensuring accessibility. You don’t always know if all of your consumers can easily distinguish colors. What’s commonly called color blindness is an inability to differentiate part of the color spectrum, and it can manifest differently in many visual disabilities.

Trade-offs exist here: shape is a pre-attentive attribute, just as color is, but color triggers pre-attentive responses more strongly. Interpreting shapes takes more cognitive work. To make this easier, you might use representative shapes where possible, or pair shapes with color. Chapter 5 discusses shapes further.

How to Optimize Scatterplots

Scatterplots are a good chart option whenever you are comparing two measures, especially when one measure has (or might have) an impact on the other. Think of the sales and profit measures used throughout this chapter. As sales increase, you’d expect profits to increase, right? But that might not be the case! What if sales increase as our company lowers prices to undercut the competition? Or the cost of each sale might rise, forcing the company to spend more than usual to keep up with production volumes through extra sales.

The scatterplot may not always be able to tell you why something is happening, but it will nudge you in the right direction and make you ask the right questions. A few variants of scatterplots, discussed next, can prove useful in certain situations.

Small multiple scatterplots

As seen in “Multiple axes”, using trend lines in scatterplots can be a strong technique to communicate the relationship between two metrics. However, too many plots on a single scatterplot can hide significant or changing trends. One workaround is to break the single scatterplot into many scatterplots. You can shrink the charts and change the formatting to convey the message on a single page or screen.

The term small multiples refers to the trellis-like pattern of charts that is created when each chart is subdivided into categories. Small multiples can be formed from most forms of charts, but I find scatterplots particularly effective. In Figure 4-16, I have broken up a scatterplot by year (vertically) and quarter (horizontally) to compare quarterly trends clearly against each other. I also made formatting alterations to make the trend the clearest part of the chart. Highlighting the trend in color against a strong x- and y-axis makes the trends quickly comparable. The plots have had their transparency increased to still be visible but fade into the background.

In Figure 4-16, you can quickly see the negative correlation between sales and profit in Q1 2017: it is the only trend line that tracks downward as sales increase. The trend lines demonstrate that the most profit for sales occurred in Q1 2020, and this message is clearly shown by the small multiple scatterplot.
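Behind a small multiple scatterplot is a simple grouping step: split the rows by panel, then fit a trend line to each panel separately. Here is a minimal sketch of that idea. The rows are made up for illustration (they are not the Allchains data), and the least-squares fit is my assumption about how the trend lines are drawn.

```python
from collections import defaultdict

def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Made-up (year, quarter, sales, profit) rows, for illustration only.
rows = [
    (2017, 1, 100, 30), (2017, 1, 200, 20), (2017, 1, 300, 10),
    (2017, 2, 100, 10), (2017, 2, 200, 25), (2017, 2, 300, 40),
    (2020, 1, 100, 20), (2020, 1, 200, 60), (2020, 1, 300, 100),
]

# One panel per (year, quarter): year picks the row, quarter picks the column.
panels = defaultdict(list)
for year, quarter, sales, profit in rows:
    panels[(year, quarter)].append((sales, profit))

trend = {panel: slope(points) for panel, points in panels.items()}
downward = [panel for panel, s in trend.items() if s < 0]
print(trend, downward)
```

Panels whose slope is negative are the ones where the trend line tracks downward as sales increase.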

This technique is particularly useful when sharing static versions of the chart. However, even if you make an interactive version of your scatterplot that includes filtering to create each individual small multiple in turn, you may still want to consider using the small multiple option. The trellis shape of small multiples allows you to compare trends horizontally (in this case, quarter-on-quarter) and vertically (the same quarter in a different year).

Quadrant charts

Just as the small multiple scatterplot makes trends more apparent, a quadrant chart simplifies the interpretation of the data in the scatterplot. Quadrant charts effectively dissect the scatterplot with reference lines linked to the axes. This clarity makes it much easier to determine next steps.

Take the scatterplot in Figure 4-17: with a weak correlation, how do you interpret the message in this chart? The x-axis shows sales, the y-axis represents profit, and each plot is a different category of each bike type.

It’s difficult to see much in this scatterplot, as the data has very little grouping. Grouping is another pre-attentive attribute that helps your audience understand the messages in scatterplots.

You can add a reference line at the mean of each metric for easier analysis. Figure 4-18 shows how using two average lines can divide the plots, creating a quadrant chart.

The quadrant chart’s sections can now be easily described, allowing the reader to see which decisions might be made about each point. For example, the plots in the High Sales, High Profit section are very important for the store: they are generating high cash flow while still making money for the stores.

The Low Sales, High Profit section represents an opportunity for the business, by allowing us to understand why we’ve been able to generate such high value from such a meager amount of sales. If the company were able to sell more, would the profit increase in equal proportion, or would the sale price have to fall, eating into those profit margins, to sell more?

The High Sales, Low Profit section poses an interesting challenge: these bike types are selling well, yet the company can’t seem to generate profit from them. This is a drain on resources. Should Allchains stop selling bikes in these categories and focus on other types?

The Low Sales, Low Profit section should be monitored to determine whether there is any chance for growth or whether it’s time to stop selling these items.
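The four sections described above can be assigned programmatically by comparing each plot with the mean of each measure. This is a minimal sketch: the category names and values are hypothetical, and placing a plot that sits exactly on an average line in the "High" section is my own choice.

```python
def quadrant_labels(points):
    """Label each (name, sales, profit) point relative to the two mean lines."""
    mean_sales = sum(sales for _, sales, _ in points) / len(points)
    mean_profit = sum(profit for _, _, profit in points) / len(points)
    labels = {}
    for name, sales, profit in points:
        # Points exactly on an average line are treated as "High" (an assumption).
        sales_side = "High Sales" if sales >= mean_sales else "Low Sales"
        profit_side = "High Profit" if profit >= mean_profit else "Low Profit"
        labels[name] = f"{sales_side}, {profit_side}"
    return labels

# Hypothetical bike categories with made-up sales and profit values.
bike_types = [("A", 900, 200), ("B", 200, 180), ("C", 850, 20), ("D", 150, 10)]
labels = quadrant_labels(bike_types)
for name, label in labels.items():
    print(name, label)
```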

Quadrant charts are useful for showing the data points clearly while also simplifying the analysis. They are particularly useful for audiences that are not used to using scatterplots to interpret data.

When to Avoid Scatterplots

Sometimes scatterplots make the message harder to understand. You might see these used often, but I recommend staying away if too many colors would be required or if you need to add a third measure. Let me show you why.

Too many colors

In the words of my colleague Luke Stoughton, using too many colors on a scatterplot can look like you’ve “squashed a unicorn.” It’s hard to disagree with him when I’ve seen too many charts that look like Figure 4-19.

A potential alternative is interactive charting. With interactive charting, the user can instead hover over each plot to see what it represents, so you don’t need the splatter of unicorn colors. (The challenges of interactivity are discussed more deeply in Chapter 8.) To mitigate this issue, it is much easier to highlight just a single plot, or at worst a few key points, as shown previously in Figure 4-14.

Nondifferentiable color palettes

Scatterplots are so effective at showing two measures that you might be tempted to add a third, to demonstrate an additional relationship in the data. Figure 4-20 adds a third measure, average discount, to the plots used as the base for the quadrant chart in Figure 4-17.

Scatterplot with sequential color palette

No, it’s not your eyes—it’s just tough to distinguish the average discounts shown by the blue gradient in the sequential color palette. You can probably spot the highest average discount, but trying to separate the lower third of the points is difficult. This chart would be much better if Discount was added as a set of bands, to allow the user to draw clearer distinctions among the levels of discount (Figure 4-21).
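Banding a continuous measure like discount takes only a few lines. In this sketch the band edges (10%, 20%, and 30%) are assumptions for illustration; choose edges that make sense for your own data.

```python
def discount_band(discount, edges=(0.10, 0.20, 0.30)):
    """Return a band label for a discount given as a fraction (0.15 means 15%).

    `edges` are the assumed upper bounds of each band, in ascending order.
    """
    lower = 0.0
    for upper in edges:
        if discount < upper:
            return f"{lower:.0%} to <{upper:.0%}"
        lower = upper
    return f"{lower:.0%} or more"

# Made-up average discounts, for illustration only.
for discount in (0.05, 0.10, 0.27, 0.35):
    print(discount, discount_band(discount))
```

Each band can then be given one clearly distinct shade, so the reader only has to tell a handful of colors apart rather than a continuous gradient.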

When users have to pick out only a few shades of the same color, it is much easier for them to form a relationship between color and meaning. In addition, to clarify the relationship between the two metrics shown as the axes of the scatterplot, each axis should be the same length. Any distortion of their length can change how the relationships and correlations are perceived.

Again, don’t try to squeeze too much into a single chart. If you find yourself struggling to see the colors clearly, try creating a separate chart instead, or consider using interactive charting.

Chart Types: Maps

Maps grab readers’ attention. Children are taught how to read maps from an early age, so they’re usually a familiar form of data communication, which can make absorbing the message much simpler. This section presents a few key aspects of visualizing data with maps, including how to determine whether a map is your best option.

How to Read Maps

If you really think about it, maps are a form of scatterplot. Think of longitude and latitude as the x-axis and y-axis of a map, respectively.

Understanding this allows us to take advantage of a pre-attentive attribute we looked at in Chapter 1: grouping. A cluster of points on a map, such as incidences of natural events like meteor strikes, can show areas of activity; the absence of points then shows a lack of the same activity.

If your data shows human activity, though, you will frequently find data points clustering in population-dense areas, like major cities, as Figure 4-22 shows. In these cases, clustering can obscure the stories in your data.

Figure 4-22 is a symbol map: a symbol (in this case, a circle) is placed on the map to represent the data point for that location.

Symbol map showing sales by city from our bike stores across the United States

Size and shape

Data is visualized in a symbol map by sizing the shape to represent the values of the measure; the larger the shape, the higher the value. This makes it easy to see the largest values, but the lowest values, being small, often fade into the background. If you need to identify low values (such as markets with underperforming sales), this can be a problem. Symbol maps are great when you need to show the reader the range of values quickly, but since readers can’t measure the precise size of the shape, these maps aren’t good for showing exact differences.
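The way low values fade away is easy to see if you calculate the symbol sizes. This sketch assumes the circle's area, rather than its radius, is proportional to the value, which is a common convention but not one this chapter specifies; the function name and the maximum radius are hypothetical.

```python
from math import sqrt

def symbol_radius(value, max_value, max_radius=20.0):
    """Radius of a circle whose area is proportional to `value`.

    Assumes non-negative values; the largest value gets `max_radius`.
    """
    return max_radius * sqrt(value / max_value)

# Made-up city sales values, for illustration only.
for value in (100, 25, 1):
    print(value, round(symbol_radius(value, max_value=100), 1))
```

A city with 1% of the top city's sales gets a circle only a tenth as wide, which is why the smallest values are so easy to miss.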

Here’s another potential problem with symbol maps. The clusters in the top-right corner of the map in Figure 4-22 make it look like sales are especially high in the Northeastern US. In reality, many major cities are much closer together in that area than in other parts of the US, skewing the display.

Symbol maps can use any shape to represent the data point. With circles, the center of the shape often represents the location of the data point. However, Google’s inverted drip shape (discussed more in Chapter 5 with Figure 5-21) uses the point at the bottom of the shape to indicate a precise location. Make sure the shape you choose clearly demonstrates the location.

Choropleth maps and color

You can also use color on a symbol map, but I recommend giving it a different meaning from the shape. Using two forms of pre-attentive attributes for the same information, such as both color and size for the same aggregation of the same measure, is called double encoding. It can hide other stories within the data by exaggerating the main message and is best avoided.

Sequential or diverging palettes are frequently used with maps to show how a range of values corresponds with the shape of a geographical element. These maps are called choropleth maps. Figure 4-23 uses data that’s similar to the data in Figure 4-22, this time at the state level rather than city level, but the resulting effect is very different. Here, greater values are shown as more intensely colored. However, as with the symbol map, trying to distinguish between anything but the highest and lowest values in a choropleth map is challenging.

How to Optimize Maps

You might have noticed that the maps I have used so far all have a minimal background. Removing as much unnecessary detail as possible allows the data to stand out. Remember, your data visualization is the primary purpose of the map. Think carefully about adding and removing roads, rivers, or borders, based on the purpose of the visualization. If you strike the correct balance between background and data, your audience will get a clear view of the data points as well as of their geographical context.

As with the shapes on a symbol map, variation in the size of the marks on a choropleth map can affect how your message is perceived. Small locations, like the smaller states in Figure 4-23, are hard to see; large areas are likely to draw your audience’s attention even if they are not the intended focus. Take Figure 4-24, which shows bike saddle sales for each state east of the Mississippi.

Bike accessory sales shown by a choropleth map

Quick, which state sells the most saddles?

Could you tell that it’s Rhode Island (RI)? What, you mean you can’t? You’re not alone. I think most people would struggle to draw that conclusion from this map, since Rhode Island is so small. Your eyes are likely drawn to the larger states, since they are bigger blocks of the same color.

How can we fix this? Visualizing the same data as a symbol map instead makes even the smallest state stand out (Figure 4-25). The symbols in Figure 4-25 have to remain rather small so they don’t overlap each other and hide any smaller symbols behind. Tile maps might be a better approach in this situation.

Tile maps

Tile maps offer equal space for each entity (in this case, each state) but in a layout similar to a regular map: for example, Maine is still at the top near Vermont and New Hampshire. Figure 4-26 shows the profit for Allchains bike stores in each state.

Data thresholds

Choropleth maps can be more useful than symbol maps when you want to visualize data that crosses a threshold, like zero or a target. Being able to see what falls above and below the threshold is likely to be the key aspect of the visualization. The shapes of a symbol map are sized on a linear scale. When data goes past the threshold tipping point, like zero, it becomes difficult to make that linear scale make sense.

Take, for example, profit and loss for Allchains stores in each state. We have three ways to visualize profit and loss by using the size of symbols (Figure 4-27):

1. Small symbols represent the most negative values; large symbols represent the most positive values.
2. Large symbols represent the most negative values; small symbols represent the most positive values.
3. Large symbols represent the most negative values, tapering to small as the values cross the zero point; the symbols then become larger along with the positive values.

None of these three options is very effective.

Option 1 in Figure 4-27 could hide the largest negative values. The most profitable items would dominate the map, but the items making the largest loss wouldn’t be visible. This might be a great choice if you wanted to put a positive spin on the numbers, but it wouldn’t be a clear representation of the data.

If you reverse the sizing from largest to smallest as the values go from the largest negative number to the largest positive, as in option 2, you give the opposite impression. Neither helps the audience identify the biggest winners and losers to make a balanced judgment.

Option 3 creates that balance, but it’s completely confusing: here, size does not tell the reader whether a number is positive or negative. You could add color to show whether the value is positive or negative, but that would be double encoding.

A choropleth map would be much more effective at highlighting the largest positive and negative values. Figure 4-28 uses a diverging color palette to differentiate negative and positive values.

Choropleth map using a diverging color scale to represent state profit

The darker or more intense the color of each state, the more significant the profit or loss. In Figure 4-28, your eyes can easily find the highest profits (in black) or the largest losses (in red, to use the audience’s psychological schema for accounting colors), on the same chart. You can see that no states have losses to the same extent as others have profits.
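A diverging palette like this can be sketched as two blends that meet at the threshold. In the example below, the RGB values, the light neutral midpoint, and the symmetric scale (equal-sized profits and losses get equally intense colors) are all assumptions for illustration, not the exact settings used for Figure 4-28.

```python
def diverging_color(value, max_abs,
                    negative=(178, 24, 43), positive=(0, 0, 0),
                    neutral=(247, 247, 247)):
    """Hex color for `value` on a diverging scale centered on zero.

    Blends from `neutral` at zero toward `positive` (black, for profit) or
    `negative` (red, for loss) as the magnitude approaches `max_abs`.
    """
    t = min(abs(value) / max_abs, 1.0)
    end = positive if value >= 0 else negative
    rgb = tuple(round(n + (e - n) * t) for n, e in zip(neutral, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Made-up state profits, for illustration only.
for profit in (-100, -20, 0, 50, 100):
    print(profit, diverging_color(profit, max_abs=100))
```

Keeping the scale symmetric around zero is what lets the reader compare the extent of losses against the extent of profits on the same chart.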

Density and hex bin maps

As internet-connected devices and trackers create ever-larger geographical data sets, a common mapping challenge is to visualize many thousands of data points on the same map. This brings us back to the problem of overplotting, as discussed in “How to Read Scatterplots”.

Let’s look at taxi-journey data from New York City. If we were trying to work out where to open an Allchains store, we might look for places where we know lots of people are starting journeys and offer an alternative transportation option. But in Manhattan, taxis are so common that there are nearly 800,000 data points. On the map in Figure 4-29, even if each data point is shrunk to a dot, they cluster into a mass the shape of Manhattan.

Map of hundreds of thousands of taxi journey starting points in Manhattan

Two alternative map types can assist us in solving the dilemma of overplotting. The first is a density map, which accounts for plots that are close to or on top of each other. Density maps use a sequential color palette: the higher the number of plots, the lighter and brighter the color.

In Figure 4-30, the density map shows a higher level of activity in midtown Manhattan. Lower-value plots are blurred out almost entirely, as on the northern tip of the island. This data story was also present in Figure 4-29, but the style of map made it impossible to see.

Another alternative is a hex bin map. The same Manhattan taxi-journey data is shown as a hex bin map in Figure 4-31. This style of map counts the number of points found in a certain area. Those areas are often shown as hexagons that tessellate closely together, like a honeycomb. A sequential color palette shows the range of values captured in each area, with darker colors representing the highest values.

The density map and the hex bin map tell a similar story: they both suggest locating the store in Midtown, somewhere between 30th and 54th Streets. With the hex bin map, though, it’s a little easier to identify more precisely where the bike store location should be.

Lots of map styles are available to choose from, but depending on the message you are conveying, the amount of data you have, and the scale of the geographical areas, some styles are more useful than others.

When to Avoid Maps

Sometimes you should shy away from certain styles of maps, but other times maps just aren’t the answer. Let’s look at a few such situations.

If you are analyzing data that contains geographic fields, don’t assume that you necessarily need a map. Let’s go back to the Allchains accessories sales shown in Figure 4-25. What if the data is converted to a rank, with 1 indicating the highest sales? How would multiple ranks for different products be shown? The original data set has three values for each state, showing how each ranks in terms of three products.

Would three maps be the best way to show this data? Certainly not: that would take up a lot of space, unless you want to make each state tiny. This option would also require the audience to remember the rank of each state to compare the variances.

Instead, you could use a parallel coordinates plot to show the change in rank among the various measures (Figure 4-32).

In a parallel coordinates chart, the rank of a categorical member (in this case, state) determines where the mark is made against a vertical axis. The left-to-right flow of the chart in Figure 4-32 shows changes in rank for various products. (If the change is shown over time, the chart is called a bump chart.) In this example, I’ve added a highlight to show that Rhode Island is ranked first in two categories of accessories but not for pedals. The lines connecting the circles representing each state can show changes among the categories. A steep rise or fall is a strong indication of change in the rank, drawing your attention more than a change in color saturation on a map ever would.
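Preparing data for a parallel coordinates chart mostly means converting each measure to a rank. Here is a minimal sketch of that step; the product names and sales figures are made up for illustration (they are not the data behind Figure 4-32), and ties are not handled.

```python
def rank_by_product(sales):
    """Convert {product: {state: sales}} into {product: {state: rank}}.

    Rank 1 is the highest sales. Ties are broken arbitrarily in this sketch.
    """
    ranks = {}
    for product, by_state in sales.items():
        ordered = sorted(by_state, key=by_state.get, reverse=True)
        ranks[product] = {state: position
                          for position, state in enumerate(ordered, start=1)}
    return ranks

# Made-up accessory sales by state, for illustration only.
sales = {
    "Saddles": {"RI": 90, "NY": 70, "ME": 40},
    "Helmets": {"RI": 85, "NY": 60, "ME": 65},
    "Pedals": {"RI": 30, "NY": 80, "ME": 50},
}

ranks = rank_by_product(sales)
# The sequence of ranks a single state's line would follow, left to right.
rhode_island_path = [ranks[product]["RI"] for product in sales]
print(rhode_island_path)
```

Each state's list of ranks gives the vertical positions its line passes through, so a jump between neighboring values shows up as a steep rise or fall.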

When you have multiple measures or categories, it’s tempting to try to squeeze too much onto a single map. Figure 4-33 demonstrates how confusing multiple measures can be on a map.

A parallel coordinates chart as an alternative to a map

This map isn’t impossible to read, but it isn’t easy. Including two metrics forces us to use two mark types: profit as the choropleth and total sales as sized shapes. The message in Figure 4-33 is not clear. As an alternative, a scatterplot is a great method to communicate two measures split up by a category (Figure 4-34).

Scatterplot showing sales compared to profit for each state

Occasionally, you might need to use multiple categories as well as multiple measures. I’ve seen too many maps like Figure 4-35, with multiple chart types layered on top of the base map. This might seem extreme since the chart types used together are so different, but this kind of juxtaposition is common. Resist the temptation!

In Chapter 7, I will demonstrate why it is much easier to create multiple charts than to encode too much information into one chart.

Chart Types: Part-to-Whole

Anytime you visualize a total value, people will ask you how that value breaks down: what are its constituent parts? The breakdown of the value will be a categorical data field, which can be a challenge to visualize, especially in a static form. We have multiple part-to-whole chart types to choose from (including bar charts), but this section looks at two of the most common: pie charts and treemaps.

How to Read Part-to-Whole Charts

Like maps, pie charts are covered early in most schoolchildren’s education and are common in news media, so audiences find them familiar.

Sections

The circle, or pie, represents the total of the measure being analyzed. A category’s individual contribution to the overall measure is demonstrated by the colored-in section of the circle. In Figure 4-36, wheel sales at Allchains, represented by purple, make up a quarter of the overall amount, so a quarter of the circle is colored purple. All of the other categories have been combined to form the Everything Else group.

If you have more than two sections, the largest section should start at the top of the circle, unless that section is the grouping of all other categories. Assume that the reader’s eye will rotate clockwise.

Additional categories follow clockwise from the end of the initial section. In Figure 4-37, brake sales make up an eighth of the overall total, so the colored-in section covers 12.5% of the circle. The highlighted categories should be shown in order from highest to lowest value, for easier interpretation.

Figure 4-37. Basic pie chart with additional category
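To make the arithmetic behind these sections concrete, here is a minimal sketch in Python. It assumes the layout described above: highlighted categories ordered from highest to lowest, starting at the top of the circle and running clockwise, with the grouped remainder last. The function name `pie_sections` and the sales figures are hypothetical; the figures are chosen only so that wheels make up a quarter and brakes an eighth of the total.

```python
def pie_sections(highlighted, total, other_label="Everything Else"):
    """Return (label, start_deg, end_deg) tuples, measured clockwise from 12 o'clock."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Highlighted categories come first, ordered from highest to lowest value.
    ordered = sorted(highlighted.items(), key=lambda item: item[1], reverse=True)
    remainder = total - sum(value for _, value in ordered)
    if remainder < 0:
        raise ValueError("highlighted values exceed the total")
    if remainder > 0:
        # Everything not highlighted is grouped into one final section.
        ordered.append((other_label, remainder))
    sections = []
    start = 0.0
    for label, value in ordered:
        sweep = 360.0 * value / total  # each section's share of the full circle
        sections.append((label, start, start + sweep))
        start += sweep
    return sections


# Hypothetical sales: wheels are a quarter of the total, brakes an eighth.
sections = pie_sections({"Brakes": 125, "Wheels": 250}, total=1000)
for label, start, end in sections:
    print(f"{label}: {start:.0f} to {end:.0f} degrees")
# Wheels: 0 to 90 degrees
# Brakes: 90 to 135 degrees
# Everything Else: 135 to 360 degrees
```

A quarter of the total sweeps 90 degrees and an eighth sweeps 45 degrees, which is why the brake section covers 12.5% of the circle.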

Angles

Pie charts are all about angles—and you’ll notice angles don’t appear in the list of pre-attentive attributes. Size does appear in that list, however, and that is what we are comparing when we look at various sections of the pie chart. Humans aren’t great at assessing angles precisely, but that doesn’t make pie charts impossible to read. Learning to read analog clock faces from an early age helps. I’ve found people can visually determine a quarter, half, or three-quarters of a circle. Starting that section at the top point of the circle makes it easier still to recognize, as in Figure 4-38.

When those sections don’t start at the top of the circle and are offset by another category, they become much harder to interpret. For example, in Figure 4-39, the wheel sales section is the same size as in Figures 4-36 and 4-37 but in a different position. If I hadn’t told you it was the same size, would you have been sure?

Figure 4-39. Offset sections making pie charts harder to read

Labels

Labels appear more often on pie charts than on the other chart types we have featured so far. The labels can show the name of the category, the value, and/or the percentage of the total represented by the section (Figure 4-40).

Labels can help the user more precisely interpret the values being shown. However, take care to avoid having your audience see the chart as secondary to the label.

Donut charts

A variant of the pie chart, often seen in news media, is the donut chart, named for the hole in the middle (Figure 4-41).

Donut charts offer more whitespace, which you know is important when designing communications. However, the missing middle section can make it slightly harder to determine the angle of the data section, and therefore the value it represents.

Treemaps

Instead of using angles to represent values, treemaps use area, shown as a rectangle. The treemap in Figure 4-42 shows the same values as the first pie chart in this section (Figure 4-36).

Researchers debate which is easier to interpret, but personally I find areas drawn as squares or rectangles easier to read than the angles or circular sections of pie charts.
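As a rough illustration of how area can stand in for value, the sketch below splits a rectangle into side-by-side slices whose areas are proportional to the values. This is a deliberate simplification: real treemap tools typically use more elaborate layouts, and the function name `slice_layout`, the rectangle dimensions, and the sales figures are all hypothetical.

```python
def slice_layout(values, width, height):
    """Split a width-by-height rectangle into side-by-side slices.

    Each slice's area is proportional to its value. Returns a list of
    (label, x, y, slice_width, slice_height) tuples, largest value first.
    """
    total = sum(values.values())
    if total <= 0 or any(value < 0 for value in values.values()):
        raise ValueError("values must be non-negative with a positive total")
    # Largest value first, so it sits at the left edge of the rectangle.
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    rectangles = []
    x = 0.0
    for label, value in ordered:
        slice_width = width * value / total  # full height, so width carries the share
        rectangles.append((label, x, 0.0, slice_width, height))
        x += slice_width
    return rectangles


# Hypothetical values: wheels are a quarter of total sales.
rectangles = slice_layout({"Wheels": 250, "Everything Else": 750}, width=100, height=50)
for label, x, y, w, h in rectangles:
    print(f"{label}: x={x:.0f}, width={w:.0f}, area={w * h:.0f}")
```

Because every slice shares the same height, a quarter of the value takes a quarter of the width and therefore a quarter of the area.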

Labels are helpful for donut charts, especially if one section is being highlighted. You can get creative with the blank middle of the donut. You might use it to show the value of the highlighted section and any other information you’d like to share about it (Figure 4-43). You could add small percentage change indicators or even sparklines (covered in Chapter 3) to give additional context.

In a treemap, you can place labels on top of the sections representing each categorical member (Figure 4-44). If the area of the treemap section is too small for a label, that section probably doesn’t warrant the attention the label would draw.

Figure 4-44. Treemap with multiple sections and labels

When to Use Part-to-Whole Charts

Pie charts work well only when you have few categories to show. Two are ideal. When visualizing the sales of the road bike type, for example, I’ve chosen to group the other bike types’ sales to simplify the view for the reader (Figure 4-45).

The message is much clearer than it would be if I showed every bike type, even with labels (Figure 4-46). The other sections draw focus away from the message, which is about the percentage of road bike sales.
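Grouping everything except the highlighted category is a small data-preparation step. The sketch below shows one way it might be done before charting; the function name `group_others` and the sales values are hypothetical.

```python
def group_others(values, highlight, other_label="Everything Else"):
    """Keep one highlighted category and combine all the others into a single group."""
    if highlight not in values:
        raise KeyError(highlight)
    rest = sum(value for name, value in values.items() if name != highlight)
    return {highlight: values[highlight], other_label: rest}


# Hypothetical sales by bike type.
sales_by_type = {"Road": 40, "Mountain": 35, "Gravel": 25}
two_sections = group_others(sales_by_type, "Road")
print(two_sections)
# {'Road': 40, 'Everything Else': 60}
```

The grouped result still adds up to the original total, so the chart continues to show the whole.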

When using multiple segments, treemaps offer more space for labels and make it easier to compare sections with similar values (Figure 4-47).

I’ve also found treemaps particularly useful when showing long-tailed distributions of data, in which each category has lots of small contributions. For example, if you sell a large range of products, it can be useful to compare the value of sales from each of the products. In Figure 4-48, I’ve broken down each bike type by manufacturer, showing all of the brands sold through the bike store over time. This creates a lot of subdivisions, but you can still draw conclusions: for example, the top five manufacturers of gravel bikes make up about half the sales of that bike type.

Most business intelligence tools used to build treemaps will automatically present the largest value at the top left, so it is easier to rank the sales visually and see how many values it takes to make up significant proportions of the overall or segment value.

Figure 4-48. Treemap showing long-tailed distribution
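The "how many values does it take" question can also be checked directly against the data. The sketch below ranks the values from largest to smallest and counts how many are needed to reach a given share of the total. The function name `members_needed` and the manufacturer sales figures are hypothetical, not the data behind Figure 4-48.

```python
def members_needed(values, share=0.5):
    """Count how many of the largest values are needed to reach a share of the total."""
    ordered = sorted(values, reverse=True)  # rank from largest to smallest
    target = share * sum(ordered)
    running = 0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target:
            return count
    return len(ordered)


# Hypothetical long-tailed sales for 24 gravel bike manufacturers.
gravel_sales = [30, 25, 20, 15, 12, 10, 9, 8, 8, 7, 7, 6,
                6, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 2]
print(members_needed(gravel_sales))
# 5
```

In this made-up data, the top five manufacturers account for 102 of 202 units, just over half, while the remaining 19 share the rest.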

When to Avoid Part-to-Whole Charts

To use part-to-whole charts, you need a whole. If the chart doesn’t visualize the total amount of the value, this is not the right chart type to use. In Figure 4-49, the gravel bike type has been removed. Depending on the title, this chart might lead you to believe that the stores sell only two bike types.

Survey results are often displayed in pie charts, but this can get complicated. If the survey allows respondents to give multiple answers, for example, the answers are not separate, nonoverlapping parts of a whole, and their percentages don’t add up to 100%. A pie chart is likely to be misleading.

In addition, you can’t visualize the total amount in a pie chart or treemap—even if you include all the potential categories—if any members of the category have negative values. There is no clear way to visualize a negative contribution as a proportion of an area.
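These conditions can be checked before you choose the chart. The following is a minimal sketch, assuming you know what the whole should be; the function name `part_to_whole_problems` and all of the numbers are hypothetical.

```python
def part_to_whole_problems(parts, whole=None):
    """List reasons a part-to-whole chart would misrepresent these values."""
    problems = []
    negatives = sorted(name for name, value in parts.items() if value < 0)
    if negatives:
        problems.append("negative values: " + ", ".join(negatives))
    if whole is not None:
        total = sum(parts.values())
        if total < whole:
            problems.append("parts sum to less than the whole")
        elif total > whole:
            problems.append("parts sum to more than the whole")
    return problems


# A bike type is missing, so the chart would not show the whole.
print(part_to_whole_problems({"Road": 50, "Mountain": 30}, whole=100))
# 100 respondents could pick several answers, so the answers overlap.
print(part_to_whole_problems({"Email": 70, "Phone": 50, "Chat": 30}, whole=100))
# A loss cannot be drawn as a proportion of an area.
print(part_to_whole_problems({"Road": 40, "Mountain": -10, "Gravel": 20}))
# Nothing to flag here.
print(part_to_whole_problems({"Wheels": 25, "Everything Else": 75}, whole=100))
```

An empty list means none of these particular issues was found; it does not guarantee that a pie chart or treemap is the best choice.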

Avoid part-to-whole charts, too, when demonstrating change over time. If you want to show how proportions of bike sales change by type over time, and you’ve already made a pie chart for a single year, you might be tempted to repeat the pie chart for each year. In Figure 4-50, because the proportion for each bike type changes from pie to pie, it is challenging to see how the proportion of sales changes over time.

Figure 4-50. Pie charts demonstrating change over time

Using pie charts to show change over time can also hide the absolute change in the overall amount the pie chart represents. It takes a lot of labeling to make pie charts communicate this information clearly.

A line chart would be a much clearer way to communicate the change in percentage of total sales each bike type achieved each year. Figure 4-51 shows exactly the same relationship as the pie charts, but the changing patterns across the years are much easier to see. In the pie charts, the mountain bike type angle didn’t have a consistent starting point.
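The values plotted on such a line chart are each bike type’s percentage of that year’s total. Here is a minimal sketch of the calculation, using a hypothetical function name, `share_by_year`, and made-up sales figures.

```python
def share_by_year(sales):
    """Turn {year: {bike_type: sales}} into {bike_type: [(year, percent_of_total), ...]}."""
    shares = {}
    for year in sorted(sales):
        year_total = sum(sales[year].values())
        if year_total <= 0:
            raise ValueError(f"no sales recorded for {year}")
        for bike_type, value in sales[year].items():
            # One point per year on that bike type's line.
            shares.setdefault(bike_type, []).append((year, 100 * value / year_total))
    return shares


# Hypothetical sales: the total doubles between the two years.
sales = {
    2019: {"Road": 50, "Mountain": 30, "Gravel": 20},
    2020: {"Road": 60, "Mountain": 60, "Gravel": 80},
}
shares = share_by_year(sales)
print(shares["Road"])
# [(2019, 50.0), (2020, 30.0)]
```

In this example road bike sales rise from 50 to 60, yet their share falls from 50% to 30% because the total doubled. Percentages alone hide that absolute change, so it is worth stating the totals alongside the chart.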

Too many categories will make any pie chart difficult to read. The same detail that works well in the treemap in Figure 4-48 becomes unreadable in pie chart form, as seen in Figure 4-52.

Finally, you should not use part-to-whole charts to show any measure that may go beyond 100%, like progress toward (and hopefully beyond) a sales target.

Summary

In language, the more words you know, the more options you have when making your point. In data visualization, chart types are your vocabulary.

While less-common charts do gain the attention of the audience because of their unique aesthetics, they are also more challenging to interpret, since they’re less familiar and don’t always use pre-attentive attributes as effectively.

This chapter has covered just a small portion of the alternate chart types available. Once you’ve grasped the basics, you can explore even more. The primary reason to use alternate chart types is that they’re eye-catching. You’ve seen throughout this book that a big part of the battle is making your data visualizations stand out to audiences’ eyes and in their memories.

Figure 4-53 shows a visualization inspired by my colleague Joe Kernaghan that offers an alternate way to show a company’s income statement. This chart, called a Sankey chart, shows the various profit types included in Tesla’s 2020 financial statements.

The chart doesn’t offer precise information but does show how the various amounts fit together. It also educates readers about how these amounts form the company’s gross and operating profit. Its unusual shape also grabs people’s attention, so it works well as a chart.

When you go beyond basic bar charts and start exploring the wide variety of chart types, you’ll make active choices about how you communicate different types of information. The more experience you gain in making those choices, the better your visualizations will be.

Figure 4-53. Sankey chart for TSLA 2020 income statement (based on a template from the Flerlage Twins)

¹ Ryan Sleeper, Practical Tableau (Sebastopol, CA: O’Reilly, 2018), 495.

Chapter 4. Visualizing Data Differently

Chart Types: Scatterplots

How to Read Scatterplots

Figure 4-1. Scatterplot

Multiple axes

Figure 4-2. Multiple axes in a scatterplot

Figure 4-3. Correlation not equaling causation

Figure 4-4. Scatterplot with a positive correlation

Figure 4-5. Scatterplot with a negative correlation

Figure 4-6. Scatterplot with a strong correlation

Figure 4-7. Scatterplot with a weak correlation

Figure 4-8. Scatterplot with no correlation

Plots

Figure 4-9. An example of overplotting on a scatterplot

Figure 4-10. Increasing the transparency of the plots

Figure 4-11. Increased transparency with borders

Color

Figure 4-12. Colored scatterplot

Figure 4-13. Sequentially colored scatterplot

Figure 4-14. Color used to highlight

Shapes

Figure 4-15. Shape scatterplot

How to Optimize Scatterplots

Small multiple scatterplots

Figure 4-16. Small multiple scatterplots

Quadrant charts

Figure 4-17. Scatterplot to form quadrant chart

Figure 4-18. Quadrant chart

When to Avoid Scatterplots

Too many colors

Figure 4-19. Scatterplot with too many colors

Nondifferentiable color palettes

Figure 4-20. Scatterplot with sequential color palette

Figure 4-21. Scatterplot with banded color

Chart Types: Maps

How to Read Maps

Figure 4-22. Symbol map showing sales by city from our bike stores across the United States

Size and shape

Choropleth maps and color

Figure 4-23. Choropleth map

How to Optimize Maps

Figure 4-24. Bike accessory sales shown by a choropleth map

Figure 4-25. Better symbol map

Tile maps

Figure 4-26. Tile map of profit by state

Data thresholds

Figure 4-27. The effect of a scale crossing zero

Figure 4-28. Choropleth map using a diverging color scale to represent state profit

Density and hex bin maps

Figure 4-29. Map of hundreds of thousands of taxi-journey starting points in Manhattan

Figure 4-30. Density map using the same data as in Figure 4-29

Figure 4-31. Hex bin map using the same data as in Figure 4-29

When to Avoid Maps

Figure 4-32. A parallel coordinates chart as an alternative to a map

Figure 4-33. Map showing multiple measures

Figure 4-34. Scatterplot showing sales compared to profit for each state

Figure 4-35. Pie chart and choropleth map

Chart Types: Part-to-Whole

How to Read Part-to-Whole Charts

Sections

Figure 4-36. Basic pie chart sections

Figure 4-37. Basic pie chart with additional category

Angles

Figure 4-38. Reading pie chart angles

Figure 4-39. Offset sections making pie charts harder to read

Labels

Figure 4-40. Pie chart with labels

Donut charts

Figure 4-41. Donut chart

Treemaps

Figure 4-42. Basic treemap

Figure 4-43. Donut chart with labels

Figure 4-44. Treemap with multiple sections and labels

When to Use Part-to-Whole Charts

Figure 4-45. Simple donut chart example

Figure 4-46. Donut chart with multiple segments

Figure 4-47. Basic treemap with multiple segments

Figure 4-48. Treemap showing long-tailed distribution

When to Avoid Part-to-Whole Charts

Figure 4-49. Pie chart not showing the total sales