We deal with a lot of raw data in the real world. Machine learning algorithms expect data to be formatted in a certain way before training can start. To prepare the data for ingestion, we have to preprocess it and convert it into the right format. Let's see how to do it.
Create a new Python file and import the following packages:
import numpy as np
from sklearn import preprocessing
Let's define some sample data:
input_data = np.array([[5.1, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])
We will be talking about several different preprocessing techniques, starting with binarization.
Let's take a look at each technique, ...
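As a first taste of binarization, here is a minimal sketch that converts the sample data into 0s and 1s using the Binarizer class from sklearn.preprocessing. The threshold of 2.1 and the variable name data_binarized are assumptions chosen for illustration and are not prescribed by the text above.

```python
import numpy as np
from sklearn import preprocessing

# Same sample data as defined earlier
input_data = np.array([[5.1, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])

# Assumed threshold, picked only for this illustration
threshold = 2.1

# Values strictly greater than the threshold become 1, all others become 0
data_binarized = preprocessing.Binarizer(threshold=threshold).transform(input_data)

print("Binarized data:")
print(data_binarized)
```

Note that the comparison is strict, so the value 2.1 in the third row maps to 0 rather than 1. Picking a different threshold changes which entries end up as 1.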