January 2020
Beginner to intermediate
372 pages
10h
English
Let's first import the necessary Python libraries and get the dataset ready:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from feature_engine.categorical_encoders import RareLabelCategoricalEncoder
data = pd.read_csv('creditApprovalUCI.csv')
X_train, X_test, y_train, y_test = train_test_split(
    data.drop(labels=['A16'], axis=1),
    data['A16'],
    test_size=0.3,
    random_state=0)
X_train['A7'].value_counts() / len(X_train)
We can see the percentage of observations per category of
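The frequency check above can be tried without the CSV file. The following is a minimal sketch on a made-up column: the values, the 20% threshold, and the 'Rare' label are assumptions chosen for illustration, not taken from the dataset. It uses plain pandas to show the idea of grouping infrequent categories and is not the implementation of RareLabelCategoricalEncoder.

```python
import pandas as pd

# Hypothetical toy column standing in for X_train['A7']; the values are invented.
toy = pd.Series(['v'] * 6 + ['h'] * 3 + ['z'], name='A7')

# Fraction of observations per category, mirroring value_counts() / len().
freq = toy.value_counts() / len(toy)

# Assumed threshold: categories present in fewer than 20% of rows count as rare.
tol = 0.2
rare_labels = freq[freq < tol].index

# Replace every rare category with a single placeholder label.
grouped = toy.where(~toy.isin(rare_labels), 'Rare')

print(freq)
print(grouped.value_counts())
```

Here only 'z' falls below the assumed threshold, so it is the only category replaced. On real data, the threshold would be chosen after inspecting the frequencies computed on the training set.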