Comparative visualizations

Let's suppose we'd like to compare the distributions of electorate data between the UK and Russia. We've already seen in this chapter how to make use of CDFs and box plots, so let's investigate an alternative that's similar to a histogram.

We could try and plot both datasets on a histogram but this would be a bad idea. We wouldn't be able to interpret the results for two reasons:

  • The sizes of the voting districts, and therefore the means of the distributions, are very different
  • The number of voting districts overall is so different, so the histograms bars will have different heights

An alternative to the histogram that addresses both of these issues is the probability mass function (PMF).

Probability mass functions

The probability ...

Get Clojure for Data Science now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.