Summary
In this chapter, we imported data from the UCI repository. We named the columns (or features), and then put them into a pandas DataFrame. We preprocessed our data and removed the ID column. We also explored the data, so that we would know more about it. We used the describe function, which gave us features such as the mean, the maximum, the minimum, and the different quartiles. We also created some histograms (so that we could understand the distributions of the different features) and a scatterplot matrix (so that we could look for linear relationships between the variables).
We then split our dataset up into a training set and a testing validation set. We implemented some testing parameters, built a KNN classifier and an SVC, and ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access