June 2017
576 pages
Before we move on to exploring the entire Spark dataframe, we can look at some of the data already generated for positive cases. As you may recall from the prior chapter, this is stored in the Spark dataframe out_sd1.
We generated the random sample bins specifically to support this kind of exploratory analysis.
We can use the filter command to extract random sample 1 and then take the first 1,000 records:
This code ...
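As a rough illustration of the filter-and-take step described above, here is a minimal PySpark-style sketch. It assumes that out_sd1 is a Spark dataframe and that the random sample bin is stored in a column called sample_bin; both that column name and the helper name first_records_of_sample are assumptions made for this example, not names confirmed by the text.

```python
def first_records_of_sample(df, sample_id=1, n=1000, bin_col="sample_bin"):
    """Return at most `n` rows of `df` whose `bin_col` equals `sample_id`.

    `df` is expected to be a Spark dataframe such as out_sd1.
    `bin_col` is a hypothetical column name for the random sample bin.
    """
    # filter() keeps only the rows that belong to the requested sample bin.
    # limit() caps the result at n rows and returns another dataframe,
    # so nothing is pulled back to the driver at this point.
    return df.filter(df[bin_col] == sample_id).limit(n)


# Hypothetical usage, assuming an active Spark session and the out_sd1 dataframe:
# small_pos = first_records_of_sample(out_sd1, sample_id=1, n=1000)
# small_pos.show(5)
```

Note that Spark does not guarantee row order unless the dataframe is explicitly sorted, so "the first 1,000 records" here means whichever 1,000 matching rows Spark returns first.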