Practical Predictive Analytics by Ralph Winters

Extracting the Pima Indians diabetes dataset

Running the following code loads the PimaIndiansDiabetes data frame into R and then runs the usual str() and summary() functions on it. Note that the data ships inside the mlbench package, so that package must be installed first.

No Spark directives have been introduced at this point. Even though we are running in a Databricks environment, the code is pure R, and you can replicate it in your regular R environment as well.

# load the library
devtools::install_github("cran/mlbench")
library(mlbench)
data(PimaIndiansDiabetes)
str(PimaIndiansDiabetes)
summary(PimaIndiansDiabetes)
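If devtools is not available, or you would rather not reinstall the package on every run, the same data can be loaded with a guarded install. The following is only a minimal sketch, not the book's method: it assumes that installing mlbench from CRAN is acceptable in your environment and that the mirror URL shown is reachable. It then runs two quick checks to confirm that the data frame loaded as expected.

```r
# Assumption: mlbench is installed from CRAN rather than from GitHub,
# and the mirror below is reachable from your environment.
if (!requireNamespace("mlbench", quietly = TRUE)) {
  install.packages("mlbench", repos = "https://cloud.r-project.org")
}

# Load only the dataset we need from the package
data("PimaIndiansDiabetes", package = "mlbench")

# Quick sanity checks: dimensions and the class balance of the outcome
dim(PimaIndiansDiabetes)
table(PimaIndiansDiabetes$diabetes)
```

The requireNamespace() guard skips the install when the package is already present, which keeps repeated notebook runs fast. The table() call shows how many observations fall into each level of the diabetes outcome column.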
