Now that we've explored our dataset, let's look at how machine learning algorithms can help us predict whether a person has cancer.
The following steps walk through how the data is prepared for the machine learning algorithms:
- The first step is to split our dataset into X and Y datasets for training. We won't train on all of the available data, because we need to hold some back for our validation step. Evaluating on that held-out data tells us how well these algorithms generalize to new data, and not just how well they have memorized the training data.
- Our X data will contain all of the variables, except for the class column, and our Y data is going to be the class column, which is the classification of ...
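The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the exact code used for this dataset: the toy rows, the feature names `feature_a` and `feature_b`, the label values, the helper function names, the 20% validation fraction, and the seed are all assumptions made for the example. Only the `class` column name comes from the text.

```python
import random

# Hypothetical toy rows standing in for the real dataset.
# Only the "class" column name comes from the text; everything else is made up.
rows = [
    {"feature_a": i, "feature_b": i * 2, "class": i % 2}
    for i in range(10)
]


def split_features_and_labels(rows, label="class"):
    """Return X (every column except the label) and y (the label column)."""
    X = [[value for key, value in row.items() if key != label] for row in rows]
    y = [row[label] for row in rows]
    return X, y


def train_validation_split(X, y, validation_fraction=0.2, seed=8):
    """Shuffle the row indices and hold out a fraction of rows for validation."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_validation = max(1, int(round(len(X) * validation_fraction)))
    validation_idx = indices[:n_validation]
    train_idx = indices[n_validation:]
    X_train = [X[i] for i in train_idx]
    X_validation = [X[i] for i in validation_idx]
    y_train = [y[i] for i in train_idx]
    y_validation = [y[i] for i in validation_idx]
    return X_train, X_validation, y_train, y_validation


X, y = split_features_and_labels(rows)
X_train, X_validation, y_train, y_validation = train_validation_split(X, y)
print(len(X_train), "training rows,", len(X_validation), "validation rows")
```

In practice, a library helper such as scikit-learn's `train_test_split` is a common way to do the same thing. The hand-written version here just makes the shuffling and hold-out logic explicit.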