Practical Predictive Analytics by Ralph Winters

Covariance and correlation functions

Correlations and covariances can also be computed directly from a Spark dataframe. In our example, the correlation between age and glucose is larger for non-diabetic patients than for diabetic patients:

First, the diabetic outcomes: the correlation is 0.113.

Next, the non-diabetic outcomes: the correlation is 0.22.

For the entire population, the correlation is 0.26, which is higher than in either subgroup.
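
The comparison above amounts to filtering the rows by outcome and computing the age/glucose correlation on each subset, then on all rows together. The sketch below shows that calculation in plain Python rather than Spark, so it runs without a cluster. The rows are hypothetical toy values, not the book's dataset, and the helper names (`covariance`, `correlation`, `corr_age_glucose`) are assumptions made for illustration. In Spark, the same idea would presumably be expressed by calling the dataframe correlation and covariance functions on each filtered dataframe.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical toy rows of (age, glucose, outcome); NOT the book's data.
# outcome is 1 for diabetic and 0 for non-diabetic.
rows = [
    (25, 90, 0), (31, 85, 0), (40, 100, 0), (52, 95, 0),
    (35, 140, 1), (45, 150, 1), (50, 135, 1), (60, 160, 1),
]


def corr_age_glucose(subset):
    """Correlation between the age and glucose columns of the given rows."""
    ages = [row[0] for row in subset]
    glucose = [row[1] for row in subset]
    return correlation(ages, glucose)


# Split by outcome, then compute one correlation per subset and one overall.
diabetic = [row for row in rows if row[2] == 1]
non_diabetic = [row for row in rows if row[2] == 0]

results = {
    "diabetic": corr_age_glucose(diabetic),
    "non-diabetic": corr_age_glucose(non_diabetic),
    "entire population": corr_age_glucose(rows),
}

for label, value in results.items():
    print(f"{label}: {value:.3f}")
```

Computing the statistic per subset as well as on the pooled rows matters because the pooled value is not an average of the subgroup values: when the two groups differ in their typical age and glucose levels, the between-group difference itself contributes to the overall correlation.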
