Practical Predictive Analytics by Ralph Winters

Performing some exploratory analysis on positives

Before we move on to exploring the entire Spark dataframe, we can look at some of the data already generated for positive cases. As you may recall from the prior chapter, this is stored in the Spark dataframe out_sd1.

We have generated some random sample bins specifically so that we can do some exploratory analysis.

We can use the filter command to extract random sample 1, and take the first 1,000 records:

  • filter is a SparkR function that subsets a Spark dataframe
  • display is a Databricks command equivalent to the View command we used previously. You can also use the head function to limit the number of rows displayed.
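The filter, head, and display steps described above can be sketched roughly as follows. This is a minimal illustration, not the book's own listing: the toy dataframe stands in for the real out_sd1 from the prior chapter, and the column names sample_bin and outcome, as well as the variable names bin1 and small_pos, are assumptions made for the example.

```r
library(SparkR)

# Start (or reuse) a Spark session; on Databricks one already exists.
sparkR.session()

# Hypothetical stand-in for out_sd1. The real dataframe comes from the
# prior chapter; the column names here are assumed for illustration.
local_df <- data.frame(
  id = 1:5000,
  outcome = 1L,
  sample_bin = rep(1:4, length.out = 5000)
)
out_sd1 <- createDataFrame(local_df)

# Subset the Spark dataframe to random sample 1.
# SparkR::filter is written out in full because stats::filter also exists.
bin1 <- SparkR::filter(out_sd1, out_sd1$sample_bin == 1)

# head() on a Spark dataframe returns at most 1,000 rows
# as a local R data.frame.
small_pos <- head(bin1, 1000)

# display() is only available in Databricks notebooks;
# fall back to printing the first few rows elsewhere.
if (exists("display")) {
  display(small_pos)
} else {
  print(utils::head(small_pos))
}
```

Because head collects rows to the driver, capping it at 1,000 keeps the local copy small enough to explore comfortably.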

This code ...