The Cleveland Heart Disease database is a published dataset widely used by ML researchers. The processed file contains 14 fields, and experiments with the Cleveland database have concentrated on distinguishing presence (values 1, 2, 3, 4) from absence (value 0) of the disease, as recorded in the goal column (the 14th column).
The Cleveland Heart Disease dataset is available at http://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/processed.cleveland.data.
The dataset contains the following 14 attributes, which also appear as the header of the table below: age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, num.
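To make the column layout and the presence/absence distinction concrete, here is a minimal sketch in plain Python (not the recipe's own Spark code). It assumes the file is comma-separated with the 14 attributes in the order listed above. The two sample rows are illustrative stand-ins in that layout, and the helper name `parse_rows` and the derived `label` field are hypothetical names introduced for this example.

```python
import csv
import io

# Attribute names in the order listed above; "num" is the goal (14th) column.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

# Illustrative rows in the assumed comma-separated layout (not loaded from the URL).
SAMPLE = """\
63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0
67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2
"""


def parse_rows(text):
    """Parse comma-separated rows and add a binary label (hypothetical helper)."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, fields))
        # Collapse the goal column: 0 means absence, any nonzero value means presence.
        record["label"] = 1 if float(record["num"]) > 0 else 0
        rows.append(record)
    return rows


rows = parse_rows(SAMPLE)
for r in rows:
    print(r["age"], r["num"], r["label"])
```

Collapsing the goal column to a 0/1 label mirrors the presence-versus-absence framing described above; the remaining 13 columns would then serve as candidate features.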
How to do it...
For a detailed explanation on the individual attributes, refer ...