June 2017
We now have two separate Spark dataframes, one for the positive cases and one for the negative cases. For certain types of analysis it would make sense to keep the outcomes separate; for illustration purposes, however, we will combine them into a single dataset using the unionAll() function.
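The row-count arithmetic behind this kind of union can be checked locally without a Spark session. The following is a minimal base-R sketch, not the book's code: pos_df and neg_df are hypothetical stand-ins for the two Spark dataframes, and rbind() stands in for unionAll(). One difference worth noting is that rbind() matches data frame columns by name, whereas unionAll() follows SQL UNION ALL semantics and resolves columns by position, which is why the sketch checks the column layout first.

```r
# Hypothetical local stand-ins for the two Spark dataframes (not the book's data).
pos_df <- data.frame(id = 1:3, outcome = 1)
neg_df <- data.frame(id = 4:8, outcome = 0)

# Confirm both inputs have the same columns in the same order before stacking.
stopifnot(identical(names(pos_df), names(neg_df)))

# Stack the rows; duplicates are kept, as with a SQL UNION ALL.
combined <- rbind(pos_df, neg_df)

# The combined row count should equal the sum of the two input row counts.
stopifnot(nrow(combined) == nrow(pos_df) + nrow(neg_df))
nrow(combined)
```

On the Spark dataframes themselves, the equivalent call is: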
out_sd <- unionAll(out_sd1, out_sd2)
nrow(out_sd)
The output from nrow() indicates a total of 768,000 rows. This number represents our original 768 rows multiplied by a factor of 1,000:
