We will now look at how to use a random forest to train our model:
- We start by splitting our target and feature variables:
predictor = df_creditcarddata.iloc[:, df_creditcarddata.columns != 'default.payment.next.month']
target = df_creditcarddata.iloc[:, df_creditcarddata.columns == 'default.payment.next.month']
- We separate the numerical and non-numerical variables in our feature set:
# save all categorical columns in a list
categorical_columns = [col for col in predictor.columns.values if predictor[col].dtype == 'object']
# dataframe with categorical features
df_categorical = predictor[categorical_columns]
# dataframe with numerical features
df_numeric = predictor.drop(categorical_columns, axis=1)
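To see what the two steps so far produce, here is a minimal sketch on a tiny stand-in frame. The frame `df_toy`, its columns `amount` and `gender`, and all of its values are hypothetical and made up for illustration; only the target column name comes from the dataset used above. The text column is created with an explicit `object` dtype, since the check `dtype == 'object'` only catches columns stored that way.

```python
import pandas as pd

# Hypothetical stand-in for df_creditcarddata; the values are invented.
df_toy = pd.DataFrame({
    "amount": [20000, 120000, 90000],
    # Explicit object dtype so the dtype == 'object' test below applies.
    "gender": pd.Series(["female", "male", "female"], dtype="object"),
    "default.payment.next.month": [1, 0, 0],
})

target_name = "default.payment.next.month"

# Boolean masks over the column index select everything but the target,
# and then only the target.
predictor = df_toy.iloc[:, df_toy.columns != target_name]
target = df_toy.iloc[:, df_toy.columns == target_name]

# Split the features by dtype into categorical and numerical parts.
categorical_columns = [col for col in predictor.columns
                       if predictor[col].dtype == "object"]
df_categorical = predictor[categorical_columns]
df_numeric = predictor.drop(categorical_columns, axis=1)

print(list(df_categorical.columns))
print(list(df_numeric.columns))
```

Note that `target` is a one-column DataFrame here rather than a Series, because it was selected with a boolean mask over the columns.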
- We dummy code the categorical ...