How to do it... 

  1. Separate the feature variables from the target variable:
from sklearn.model_selection import train_test_split

X = df_creditdata.iloc[:, 0:23]
Y = df_creditdata['default.payment.next.month']
  2. Split the data into training, validation, and testing subsets:
# First, split the dataset into train and test subsets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, random_state=1)

# Then carve a validation set out of the train subset
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.2, random_state=1)
  3. Check the dimensions of each subset to ensure that our splits are correct:
# Dimensions for train subsets
print(X_train.shape)
print(Y_train.shape)
# Dimensions for validation ...
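
The two-stage split above first holds out 10% of the rows for testing and then takes 20% of the remaining 90% for validation, so the final proportions should be roughly 72% train, 18% validation, and 10% test. The following is a minimal, self-contained sketch of that arithmetic. It assumes a synthetic stand-in for `df_creditdata` (random NumPy arrays with 1,000 rows and 23 feature columns); the names `n_rows` and `split_sizes` are illustrative and do not come from the recipe.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the credit data: 1,000 rows, 23 features,
# and a binary target. The real recipe uses df_creditdata instead.
rng = np.random.default_rng(1)
n_rows = 1000
X = rng.normal(size=(n_rows, 23))
Y = rng.integers(0, 2, size=n_rows)

# Stage 1: hold out 10% of all rows as the test subset
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.1, random_state=1
)

# Stage 2: take 20% of the remaining rows as the validation subset
X_train, X_val, Y_train, Y_val = train_test_split(
    X_train, Y_train, test_size=0.2, random_state=1
)

# Report each subset's size and its share of the full dataset
split_sizes = {
    "train": X_train.shape[0],
    "validation": X_val.shape[0],
    "test": X_test.shape[0],
}
for name, size in split_sizes.items():
    print(f"{name}: {size} rows ({size / n_rows:.0%} of the data)")
```

Because the validation fraction is applied to the already reduced train subset, `test_size=0.2` in the second call corresponds to 0.9 × 0.2 = 18% of the full dataset, not 20%. Checking that the three sizes add up to the original row count is a quick way to confirm that no rows were lost or duplicated.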
