January 2020
Beginner to intermediate
372 pages
10h
English
Let's first import the necessary Python libraries and get the dataset ready:
import pandas as pd
from sklearn.model_selection import train_test_split
from category_encoders import BinaryEncoder
data = pd.read_csv('creditApprovalUCI.csv')
X_train, X_test, y_train, y_test = train_test_split(
    data.drop(labels=['A16'], axis=1),
    data['A16'],
    test_size=0.3,
    random_state=0)
X_train['A7'].unique()
We can see in the output of the preceding code block that A7 has 10 different categories:
array(['v', 'ff', 'h', 'dd', 'z', 'bb', 'j', 'Missing', 'n', 'o'], dtype=object)
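To make the idea behind binary encoding concrete, here is a minimal hand-rolled sketch that uses only the 10 categories listed above. It assumes categories are numbered from 1 in order of appearance and that each number is then written as a fixed-width string of bits. The names `ordinal`, `n_bits`, and `binary_codes` are illustrative only, and the exact number and naming of columns produced by `BinaryEncoder` may differ between versions of category_encoders.

```python
from math import ceil, log2

# Categories of A7, copied from the output above
categories = ['v', 'ff', 'h', 'dd', 'z', 'bb', 'j', 'Missing', 'n', 'o']

# Step 1: map each category to an integer, starting at 1
# (assumption: order of appearance; a real encoder may order differently)
ordinal = {cat: i for i, cat in enumerate(categories, start=1)}

# Step 2: number of bits needed to write the largest integer in binary
n_bits = ceil(log2(len(categories) + 1))

# Step 3: write each integer as a fixed-width list of 0/1 values,
# one list element per would-be binary column
binary_codes = {
    cat: [int(bit) for bit in format(code, f'0{n_bits}b')]
    for cat, code in ordinal.items()
}

for cat, bits in binary_codes.items():
    print(cat, bits)
```

In this sketch, the 10 categories fit into 4 binary columns, whereas one-hot encoding would need one column per category.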