O'Reilly logo

Machine Learning with Swift by Alexander Sosnovshchenko

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Exploratory data analysis

First, we want to see how many individuals of each class we have. This is important, because if the class distribution is very imbalanced (like 1 to 100, for example), we will have problems training our classification models. You can get data frame columns via the dot notation. For example, df.label will return you the label column as a new data frame. The data frame class has all kinds of useful methods for calculating the summary statistics. The value_counts() method returns the counts of each element type in the data frame:

In []: 
df.label.value_counts() 
Out[]: 
platyhog       520 
rabbosaurus    480 
Name: label, dtype: int64 

The class distribution looks okay for our purposes. Now let's explore the features.

We need to ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required