Using simple statistics to better understand our data

Now that we know how the data is structured and what the collection contains, we can understand it better by looking at some basic statistics.

To get us started, let's invoke the describe function:

julia> describe(iris)

The output is as follows:

This function summarizes the columns of the iris DataFrame, with one row of output per column. If a column contains numerical data (such as SepalLength), it computes the minimum, median, mean, and maximum. The numbers of missing and unique values are also included. The last column of the output reports the type of data stored in the corresponding iris column.
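To make these summary values concrete, here is a minimal sketch that computes the same kinds of statistics by hand for a single numeric column, using only Base Julia and the Statistics standard library. The vector sepal_length and its five values are illustrative stand-ins rather than the full SepalLength column, and the name col_stats is hypothetical. This is not how describe is implemented, and the field names it reports may differ between DataFrames versions.

```julia
using Statistics  # standard library: provides mean and median

# Illustrative sample values standing in for a numeric column (assumption,
# not the full iris data)
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0]

# Hypothetical named tuple collecting the per-column statistics described above
col_stats = (
    min      = minimum(sepal_length),
    median   = median(sepal_length),
    mean     = mean(sepal_length),
    max      = maximum(sepal_length),
    nunique  = length(unique(sepal_length)),    # number of distinct values
    nmissing = count(ismissing, sepal_length),  # number of missing entries
    eltype   = eltype(sepal_length),            # element type of the column
)

println(col_stats)
```

Working through a single column this way can help when checking what each figure in the describe output means.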

A few other stats are available, including ...
