May 2019
Beginner to intermediate
466 pages
10h 44m
English
Let's get a feel for the data, starting with the users:
julia> using DataFrames
julia> describe(users, stats = [:min, :max, :nmissing, :nunique, :eltype])
The output is as follows:

We chose a few key stats—the minimum and maximum values, the number of missing and unique values, and the type of data. Unsurprisingly, the User-ID column, which is the table's primary key, starts at 1 and goes all the way up to 278858 with no missing values. However, the Age column shows a clear sign of data errors—the maximum age is 244 years! Let's see what we have there by plotting the data with Gadfly:
julia> using Gadfly
julia> ...
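Alongside the plot, the suspicious ages flagged in the summary can also be counted directly. The following is only a minimal sketch, not the book's own code: `toy_users` is a hypothetical stand-in table with made-up values, `UserID` is a simplified column name, and the `MIN_AGE`/`MAX_AGE` bounds are assumptions you would tune for the real dataset.

```julia
using DataFrames

# Hypothetical stand-in for the users table; the values are invented for illustration
toy_users = DataFrame(UserID = 1:6, Age = [34, missing, 244, 0, 27, 118])

# Assumed plausibility bounds, not taken from the dataset
const MIN_AGE = 5
const MAX_AGE = 100

# Check for missing first: comparing `missing` inside `||` would throw an error
is_suspect(age) = !ismissing(age) && (age < MIN_AGE || age > MAX_AGE)

# Keep only the rows whose age falls outside the assumed bounds
suspect = filter(row -> is_suspect(row.Age), toy_users)

nrow(suspect)  # 3 rows: ages 244, 0 and 118
```

Missing ages are deliberately left out of `suspect` here, since `describe` already reports them separately through `:nmissing`.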