The dataset

To begin with, let's open Command Prompt and execute the following commands:

cd tutorial
jupyter lab

The first command takes us to the tutorial folder, and the second opens JupyterLab from there. The folder is empty right now, but it is where we will be completing this tutorial.

The dataset we're going to use is the heart disease dataset from the UCI Machine Learning Repository. You can download it from archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/. It contains records for 303 patients collected at the Cleveland Clinic Foundation. The repository also includes data from other sites, but we are only going to look at the Cleveland data for now. If you go over to the Data folder, you'll see that we've got lots of different options.
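As a rough sketch of how one of those files might be read once it has been downloaded into the tutorial folder, the snippet below parses comma-separated records using only the Python standard library. Several things here are assumptions rather than details from the text above: the file name processed.cleveland.data, the 14 column names, and the use of "?" to mark missing values. Check each against the files and the accompanying description you actually download. The helper names read_heart_rows and load_heart_file are made up for illustration.

```python
import csv

# Assumed column names for the 14 attributes of the Cleveland data.
# Verify them against the description file shipped with the dataset.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]


def read_heart_rows(lines):
    """Parse comma-separated records into a list of dicts of floats.

    Missing values are assumed to be marked with "?" and become None.
    """
    rows = []
    for record in csv.reader(lines):
        if not record:  # skip blank lines
            continue
        if len(record) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(record)}"
            )
        values = [None if v.strip() == "?" else float(v) for v in record]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def load_heart_file(path="processed.cleveland.data"):
    # Assumed file name; change it to whichever file you saved locally.
    with open(path, newline="") as handle:
        return read_heart_rows(handle)
```

In a notebook cell, calling rows = load_heart_file() and then len(rows) is a quick way to confirm that the number of records matches what you expect before going any further.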
