The importance of visualizations

Simple visualizations like those earlier are succinct ways of conveying a large quantity of information. They complement the summary statistics we calculated earlier in the chapter, and it's important that we use them. Statistics such as the mean and standard deviation necessarily conceal a lot of information as they reduce a sequence down to just a single number.

The statistician Francis Anscombe devised a collection of four scatter plots, known as Anscombe's Quartet, that have nearly identical statistical properties (including the mean, variance, and standard deviation). In spite of this, it's visually apparent that the distribution of xs and ys are all very different:

Datasets don't have to be contrived to reveal ...

Get Clojure for Data Science now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.