October 2018
Intermediate to advanced
172 pages
4h 6m
English
One of the main constraints of scikit-learn is that its machine learning algorithms cannot be applied directly to columns that are categorical in nature. For example, the type column in our dataset has five categories: CASH-IN, CASH-OUT, DEBIT, PAYMENT, and TRANSFER.
These categories have to be encoded as numbers that scikit-learn can make sense of. We do this in a two-step process.
The first step is to convert each category into a number: CASH-IN = 0, CASH-OUT = 1, DEBIT = 2, PAYMENT = 3, TRANSFER = 4. We can do this by using the following code:
#Package Imports
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

#Converting the type column ...
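Because the listing above is cut off, here is a minimal sketch of how the first step might look. It assumes the data lives in a pandas DataFrame; the toy `df`, the `type_encoded` column, and the `mapping` dictionary are hypothetical names used only for illustration, not the book's own code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the dataset's type column
df = pd.DataFrame(
    {"type": ["PAYMENT", "TRANSFER", "CASH-OUT", "DEBIT", "CASH-IN", "PAYMENT"]}
)

# LabelEncoder assigns integer codes to the sorted unique labels
encoder = LabelEncoder()
df["type_encoded"] = encoder.fit_transform(df["type"])

# Recover the label-to-code mapping for inspection
mapping = {label: code for code, label in enumerate(encoder.classes_)}
print(mapping)
print(df)
```

Since `LabelEncoder` sorts the labels before numbering them, the resulting codes match the mapping described above (CASH-IN = 0 through TRANSFER = 4). The encoder only ever sees the labels present in the column, so a category missing from the data would shift the codes.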