January 2020
Beginner to intermediate
372 pages
10h
English
Let's begin by importing the necessary tools and loading and preparing the data:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from feature_engine.missing_data_imputers import ArbitraryNumberImputer
data = pd.read_csv('creditApprovalUCI.csv')
X_train, X_test, y_train, y_test = train_test_split(
    data.drop('A16', axis=1), data['A16'], test_size=0.3, random_state=0)
Normally, we select arbitrary values that are bigger than the maximum value of the distribution.
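As a rough illustration of choosing such a value, the sketch below inspects the maximum of each variable and then fills the missing entries with a number above it. The toy DataFrame, its values, and the choice of 99 are assumptions made for the example, not data or settings from the credit approval dataset, and plain pandas `fillna` is used here in place of the imputer classes imported earlier.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for numeric columns of the training set.
X_train = pd.DataFrame({
    "A2": [30.5, np.nan, 22.0, 58.7, np.nan],
    "A3": [1.5, 4.0, np.nan, 11.0, 0.5],
})

# Maximum observed value of each variable (NaN is skipped by default).
max_values = X_train.max()
print(max_values)

# Assumed arbitrary number; it must exceed every observed maximum.
arbitrary_value = 99
assert arbitrary_value > max_values.max()

# Replace the missing values with the arbitrary number.
X_train_imputed = X_train.fillna(arbitrary_value)
print(X_train_imputed)
```

In practice, the maximum would be checked on the training set only, and the same arbitrary number would then be applied to the test set.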