Window operations

Spark Streaming provides windowed computations, which allow you to apply transformations over a sliding window of data. The window slides over the source DStream at a specified interval; each time it slides, the source RDDs that fall within the window are combined and operated on to produce the RDDs of the windowed DStream. Two parameters must be specified for the window:

  • Window length: the duration of the window, that is, how much past data is included in each computation
  • Sliding interval: the interval at which the window operation is performed
Both the window length and the sliding interval must be multiples of the batch interval of the source DStream.
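The following is a minimal sketch of a windowed word count using `reduceByKeyAndWindow`. The batch interval, window length, sliding interval, and the socket source on `localhost:9999` are illustrative choices, not values from the text; note that the window length (60 seconds) and the sliding interval (20 seconds) are both multiples of the batch interval (10 seconds).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("WindowedWordCount")
    // Batch interval of 10 seconds; the window length and sliding interval
    // used below must be multiples of this value.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Assumed source: a text stream from a socket (for example, `nc -lk 9999`)
    val lines = ssc.socketTextStream("localhost", 9999)
    val words = lines.flatMap(_.split(" ")).map(word => (word, 1))

    // Count words over the last 60 seconds of data, recomputed every 20 seconds
    // (window length = 60s, sliding interval = 20s)
    val windowedCounts = words.reduceByKeyAndWindow(_ + _, Seconds(60), Seconds(20))
    windowedCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```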

The following illustration shows a DStream with a sliding window applied to it.
