Next, we will create a slightly more complex streaming program. In Chapter 1, Getting Up and Running with Spark, we calculated a few metrics on our dataset of product purchases. These included the total number of purchases, the number of unique users, the total revenue, and the most popular product (together with its number of purchases and total revenue).
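In this example, we will compute the same metrics on our stream of purchase events. The key difference is that the metrics are computed separately for each batch of the stream, rather than once over the whole dataset, and the results are printed as each batch is processed.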
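Before turning to the streaming version, it may help to see the per-batch computation on its own. The following is a minimal sketch that uses a plain Scala collection to stand in for the contents of one batch; it does not use Spark. The `Purchase` and `BatchStats` types, their field names, and the `BatchStatsSketch` object are hypothetical names introduced for illustration, and the (user, product, price) record layout is an assumption about the event format.

```scala
// Hypothetical record type: field names and types are assumptions for illustration.
final case class Purchase(user: String, product: String, price: Double)

// Hypothetical container for the metrics described in the text.
final case class BatchStats(
  numPurchases: Int,
  uniqueUsers: Int,
  totalRevenue: Double,
  // (product, number of purchases, revenue); None when the batch is empty
  mostPopular: Option[(String, Int, Double)]
)

object BatchStatsSketch {
  // Computes the metrics for the events of a single batch.
  def compute(batch: Seq[Purchase]): BatchStats = {
    // Per-product purchase count and revenue.
    val byProduct = batch.groupBy(_.product).map { case (product, ps) =>
      (product, ps.size, ps.map(_.price).sum)
    }
    // Highest purchase count wins; ties are broken by product name
    // so that the result is deterministic.
    val mostPopular =
      if (byProduct.isEmpty) None
      else Some(byProduct.minBy { case (product, count, _) => (-count, product) })

    BatchStats(
      numPurchases = batch.size,
      uniqueUsers = batch.map(_.user).distinct.size,
      totalRevenue = batch.map(_.price).sum,
      mostPopular = mostPopular
    )
  }

  def main(args: Array[String]): Unit = {
    // A made-up batch of three events.
    val batch = Seq(
      Purchase("alice", "phone", 10.0),
      Purchase("bob", "phone", 10.0),
      Purchase("alice", "case", 2.5)
    )
    println(compute(batch))
  }
}
```

The most popular product is wrapped in an `Option` because a batch in a stream can be empty, in which case there is no product to report. In a streaming application the same logic would be applied to each batch as it arrives, rather than to an in-memory `Seq`.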
We will define our streaming application code here:
/**
 * A more complex Streaming app, which computes statistics and prints the results for each batch in a DStream
 */
object StreamingAnalyticsApp ...