June 2017
Beginner to intermediate
576 pages
15h 22m
English
To give the Spark dataframe realistic characteristics, we will first take a small dataset, determine its basic statistical properties, and then build a Spark dataframe based on those properties.
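The approach described above can be sketched in plain Python. This is only an illustrative sketch under several assumptions: the column names `glucose` and `bmi`, the sample values, and the helper functions `summarize` and `synthesize` are hypothetical and not taken from the book, and drawing each column independently from a normal distribution is just one simple way to generate data from summary statistics.

```python
import random
import statistics

# Hypothetical sample rows: the values are made up for illustration and
# are NOT taken from the Pima Indians diabetes dataset.
SAMPLE = [
    {"glucose": 148.0, "bmi": 33.6},
    {"glucose": 85.0, "bmi": 26.6},
    {"glucose": 183.0, "bmi": 23.3},
    {"glucose": 89.0, "bmi": 28.1},
    {"glucose": 137.0, "bmi": 43.1},
]


def summarize(rows):
    """Return {column: (mean, sample standard deviation)} for each column."""
    stats = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        stats[col] = (statistics.mean(values), statistics.stdev(values))
    return stats


def synthesize(stats, n, seed=0):
    """Generate n synthetic rows, drawing each column independently from a
    normal distribution with that column's mean and standard deviation
    (an assumption of this sketch; correlations between columns are ignored)."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    return [
        {col: rng.gauss(mu, sigma) for col, (mu, sigma) in stats.items()}
        for _ in range(n)
    ]


stats = summarize(SAMPLE)
synthetic_rows = synthesize(stats, n=1000)

# Assuming an existing SparkSession named `spark` (not created here), the
# rows could then be loaded with, for example:
#     from pyspark.sql import Row
#     df = spark.createDataFrame([Row(**r) for r in synthetic_rows])
```

The statistics and row generation run without Spark, so the properties of the synthetic data can be checked before it is handed to a cluster.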
The Pima Indians diabetes dataset contains the following attributes:
This dataset is publicly available; in fact, several versions of it exist. We will ...