Profiling the data

To make sure that we create a model, or at least a process, that we understand, and that we can mentally check our results, we need to start every machine learning model building process with data profiling. We need to understand how each of our variables is distributed, along with its range and variability.

To do this, we will calculate the summary statistics that we discussed earlier in Chapter 2, Matrices, Probability, and Statistics. Here, we will use a method built into the github.com/kniren/gota/dataframe package to calculate summary statistics for all of the columns of our dataset in one operation:

    // Open the CSV file.
    advertFile, err := os.Open("Advertising.csv")
    if err != nil {
        log.Fatal(err)
    ...
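To make concrete what a per-column profile contains, here is a minimal sketch that uses only the Go standard library. It is not the dataframe package's implementation. The names `Summary`, `summarize`, and `profile` are hypothetical helpers introduced for illustration, and the inline `sample` data is a made-up placeholder standing in for Advertising.csv, not values from that file. The sketch assumes every column is numeric and reports the sample standard deviation (dividing by n-1).

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"math"
	"sort"
	"strconv"
	"strings"
)

// Summary holds basic descriptive statistics for one numeric column.
// (Hypothetical type, introduced for this sketch.)
type Summary struct {
	Mean, StdDev, Min, Median, Max float64
}

// summarize computes the statistics for a non-empty slice of values.
func summarize(vals []float64) Summary {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), vals...)
	sort.Float64s(sorted)
	n := len(sorted)

	var sum float64
	for _, v := range sorted {
		sum += v
	}
	mean := sum / float64(n)

	// Sample standard deviation (n-1 in the denominator).
	var ss float64
	for _, v := range sorted {
		ss += (v - mean) * (v - mean)
	}
	std := 0.0
	if n > 1 {
		std = math.Sqrt(ss / float64(n-1))
	}

	// Median: middle value, or the average of the two middle values.
	median := sorted[n/2]
	if n%2 == 0 {
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}
	return Summary{Mean: mean, StdDev: std, Min: sorted[0], Median: median, Max: sorted[n-1]}
}

// profile takes CSV records (header row first) and returns one Summary
// per column, keyed by column name. All data fields must parse as floats.
func profile(records [][]string) (map[string]Summary, error) {
	if len(records) < 2 {
		return nil, fmt.Errorf("need a header row and at least one data row")
	}
	header := records[0]
	cols := make([][]float64, len(header))
	for i, row := range records[1:] {
		if len(row) != len(header) {
			return nil, fmt.Errorf("row %d: got %d fields, want %d", i+1, len(row), len(header))
		}
		for j, field := range row {
			v, err := strconv.ParseFloat(field, 64)
			if err != nil {
				return nil, fmt.Errorf("row %d, column %q: %v", i+1, header[j], err)
			}
			cols[j] = append(cols[j], v)
		}
	}
	out := make(map[string]Summary, len(header))
	for j, name := range header {
		out[name] = summarize(cols[j])
	}
	return out, nil
}

// sample is made-up placeholder data, not the contents of Advertising.csv.
const sample = "a,b\n1,10\n2,20\n3,30\n4,40\n"

func main() {
	records, err := csv.NewReader(strings.NewReader(sample)).ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	stats, err := profile(records)
	if err != nil {
		log.Fatal(err)
	}
	// Print in header order so the output is stable.
	for _, name := range records[0] {
		s := stats[name]
		fmt.Printf("%-4s mean=%.2f stddev=%.2f min=%.2f median=%.2f max=%.2f\n",
			name, s.Mean, s.StdDev, s.Min, s.Median, s.Max)
	}
}
```

To profile a real file, the `strings.NewReader(sample)` argument could be swapped for a file handle returned by `os.Open`. Comparing each column's mean with its median, and its standard deviation with its range, gives a quick first impression of skew and spread before any model is fit.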
