October 2017
Beginner to intermediate
270 pages
7h
English
This technique aims to give each feature of the dataset the location and scale of a standard normal distribution, that is, a mean of 0 and a standard deviation of 1.
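These properties are obtained by calculating the so-called z scores of the dataset samples, with the following formula:

$$z = \frac{x - \mu}{\sigma}$$

Here, x is a sample value, μ is the mean of the feature, and σ is its standard deviation.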
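As a quick check of the idea, here is a minimal sketch that computes z scores by hand, subtracting the mean from each value and dividing by the standard deviation. The helper name `z_scores` and the sample values are assumptions made up for illustration, and the population standard deviation (dividing by N) is an assumed choice.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Return the z score of each value: (x - mean) / standard deviation."""
    mu = fmean(values)
    # Population standard deviation (divides by N); an assumed choice.
    sigma = pstdev(values, mu)
    if sigma == 0:
        raise ValueError("z scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Made-up sample values, used only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_scores(sample)

print(standardized)          # mean of sample is 5, standard deviation is 2
print(fmean(standardized))   # close to 0
print(pstdev(standardized))  # close to 1
```

After the transformation, the values keep their relative ordering and spacing; only their location and scale change.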
Let's visualize and practice this new concept with the help of scikit-learn, reading a file from the MPG dataset, which records city-cycle fuel consumption in miles per gallon together with other car attributes. Its columns are: mpg, cylinders, displacement, horsepower, weight, acceleration, model year, origin, and car name.
from sklearn import preprocessing
import ...
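Separately from the listing above, which is truncated here, the following is a small sketch of how `preprocessing.scale` from scikit-learn standardizes each column of an array. The rows below are made-up values standing in for two numeric columns (mpg and weight); they are not read from the MPG file, and NumPy is assumed to be available.

```python
import numpy as np
from sklearn import preprocessing

# Made-up illustrative rows (columns: mpg, weight); NOT taken from the MPG file.
X = np.array([
    [18.0, 3500.0],
    [15.0, 3700.0],
    [26.0, 2200.0],
    [32.0, 2000.0],
])

# scale() standardizes each column to zero mean and unit variance.
X_scaled = preprocessing.scale(X)

col_means = X_scaled.mean(axis=0)  # approximately 0 for every column
col_stds = X_scaled.std(axis=0)    # approximately 1 for every column
print(col_means, col_stds)
```

When the same transformation has to be applied later to new data, `preprocessing.StandardScaler` is usually the more convenient choice, because it stores the mean and standard deviation learned from the training data.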