Practical Predictive Analytics by Ralph Winters

Concatenating the positive and negative cases into a single Spark dataframe

We now have two separate Spark dataframes, one holding the positive cases and one holding the negative cases. For certain types of analysis it makes sense to keep the outcomes separate; for illustration purposes, however, we will combine them into a single dataset using the unionAll() function.

out_sd <- unionAll(out_sd1, out_sd2)
nrow(out_sd)

The output from nrow() indicates a total of 768,000 rows. This number represents our original 768 rows multiplied by a factor of 1,000.
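As a quick sanity check on a concatenation like this, the row count of the combined dataframe should equal the sum of the row counts of its two inputs. The following is a minimal sketch of that check, assuming a local SparkR 2.x installation. The dataframes pos_df and neg_df and their columns are hypothetical stand-ins for out_sd1 and out_sd2, not the book's data. The sketch uses union(), which SparkR 2.0 and later provide as the replacement for the deprecated unionAll(); both combine rows by column position rather than by column name, so the two inputs need the same column order.

```r
library(SparkR)

# Start a local Spark session (assumption: SparkR 2.x is installed locally)
sparkR.session(master = "local[1]")

# Hypothetical stand-ins for out_sd1 (positive cases) and out_sd2 (negative cases)
pos_df <- createDataFrame(data.frame(glucose = c(148, 183, 137),
                                     outcome = c(1, 1, 1)))
neg_df <- createDataFrame(data.frame(glucose = c(85, 89),
                                     outcome = c(0, 0)))

# union() appends the rows of neg_df to pos_df, matching columns by position
combined <- union(pos_df, neg_df)

# Collect the row counts into plain R values before stopping the session
n_pos      <- nrow(pos_df)
n_neg      <- nrow(neg_df)
n_combined <- nrow(combined)

# The combined row count should equal the sum of the two inputs
stopifnot(n_combined == n_pos + n_neg)

# Per-class row counts in the combined dataframe, collected to a local data.frame
class_counts <- collect(count(groupBy(combined, "outcome")))
print(class_counts)

sparkR.session.stop()
```

Applied to the dataframes in the text, the same check would compare nrow(out_sd) against nrow(out_sd1) + nrow(out_sd2).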
