After exploring the dataset, it is time to build a classifier that can recognize heart disease from the results of clinical tests. Before proceeding, we prepare the data by splitting it into two sets: a training set and a test set. The training set is used to train the classification model, and the test set is used to evaluate the model's performance on data it has not seen.
To split the data, we use the scikit-learn library, specifically the sklearn.model_selection.train_test_split() function. This function randomly splits the data it receives into training and test subsets. Let's start by importing the function:
from sklearn.model_selection import train_test_split
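Before applying the function to our data, it may help to see how it behaves on a small example. The following is only a minimal sketch: the toy_features and toy_target objects, their column names, and their values are hypothetical and are not taken from the heart disease dataset, and the test_size and random_state values are assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data (NOT the heart disease dataset): 8 samples, 2 features
toy_features = pd.DataFrame({
    "feature_a": [63, 67, 41, 56, 62, 57, 44, 52],
    "feature_b": [233, 286, 204, 236, 268, 354, 263, 199],
})
toy_target = pd.Series([0, 1, 0, 0, 1, 0, 0, 1], name="label")

# Hold out 25% of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    toy_features, toy_target, test_size=0.25, random_state=0
)

# Features and labels are shuffled together, so their row indices stay aligned
print(X_train.shape, X_test.shape)
print(list(X_train.index) == list(y_train.index))
```

With 8 rows and test_size=0.25, the split yields 6 training rows and 2 test rows. Passing the features and the target in the same call keeps each row paired with its label in both subsets.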
Now, we have to split the two DataFrames: ...