Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Structured Streaming

Spark 2.0 introduced Structured Streaming. Its core guarantee is that, at any point in time, the output of the application is equivalent to executing a batch job on a prefix of the data. Structured Streaming handles consistency and reliability both within the engine and in its interactions with external systems. It is exposed through the simple DataFrame and Dataset API.
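The prefix guarantee can be illustrated without Spark at all. The following is a minimal sketch in plain Python, not Spark's implementation: `batch_word_count`, `IncrementalWordCount`, and `check_prefix_consistency` are hypothetical names introduced here, and a running word count stands in for an arbitrary query. After each micro-batch, the incrementally maintained result is compared with what a batch job computes over all the data received so far.

```python
from collections import Counter


def batch_word_count(lines):
    """Batch job: count words over a complete, finite list of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


class IncrementalWordCount:
    """Toy streaming query (hypothetical): keeps running counts as state."""

    def __init__(self):
        self.state = Counter()

    def process(self, micro_batch):
        # Fold only the newly arrived lines into the existing state.
        for line in micro_batch:
            self.state.update(line.split())
        # Return a copy so callers cannot mutate the internal state.
        return Counter(self.state)


def check_prefix_consistency(micro_batches):
    """Return True if, after every micro-batch, the incremental result
    equals a batch job run on the prefix of data seen so far."""
    query = IncrementalWordCount()
    prefix = []
    for micro_batch in micro_batches:
        prefix.extend(micro_batch)
        if query.process(micro_batch) != batch_word_count(prefix):
            return False
    return True


# Three micro-batches, the last one empty (no new data arrived).
stream = [["spark streaming", "spark sql"], ["structured streaming"], []]
print(check_prefix_consistency(stream))
```

The incremental query never rereads old data, yet its answer at every step matches the batch answer on the same prefix. That equivalence is the property the text describes.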

Users provide the query they want to run along with the input and output locations. The system then executes the query incrementally, maintaining enough state to recover from failure, keeping the results consistent in external storage, and so on.
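The division of labor described above (the user supplies a query, an input, and an output; the system handles incremental execution and recovery) can be sketched as a toy model in plain Python. This is an illustration under stated assumptions, not how Spark is implemented: `run_incrementally`, `count_by_key`, the JSON checkpoint file, and the `stop_after` parameter used to simulate a failure are all hypothetical.

```python
import json
import os
import tempfile


def count_by_key(state, record):
    """User-supplied query (hypothetical): running count per distinct record."""
    new_state = dict(state)
    new_state[record] = new_state.get(record, 0) + 1
    return new_state


def run_incrementally(query, source, sink, checkpoint_path, stop_after=None):
    """Toy engine: apply `query` to records not yet processed.

    The read offset and the query state are checkpointed together after
    each record, so a restart resumes where the previous run stopped.
    `stop_after` simulates a failure after that many records.
    """
    # Recover the offset and state from the last checkpoint, if any.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            checkpoint = json.load(f)
    else:
        checkpoint = {"offset": 0, "state": {}}
    offset, state = checkpoint["offset"], checkpoint["state"]

    # Bring the sink in line with the recovered state before continuing.
    sink.clear()
    sink.update(state)

    processed = 0
    while offset < len(source):
        if stop_after is not None and processed >= stop_after:
            break  # simulated failure
        state = query(state, source[offset])
        offset += 1
        processed += 1
        # Write to a temporary file, then rename, so the checkpoint
        # is never left half-written.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"offset": offset, "state": state}, f)
        os.replace(tmp_path, checkpoint_path)
        # Publish the latest result to the output location.
        sink.clear()
        sink.update(state)
    return state


events = ["click", "view", "click", "buy", "click"]
with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_file = os.path.join(tmp_dir, "checkpoint.json")
    output = {}
    # First run fails after two records.
    run_incrementally(count_by_key, events, output, checkpoint_file, stop_after=2)
    partial = dict(output)
    # Second run recovers from the checkpoint and finishes the input.
    run_incrementally(count_by_key, events, output, checkpoint_file)
    final = dict(output)

print(partial)
print(final)
```

The user code is limited to `count_by_key` and the choice of input and output. The offset tracking, state persistence, and restart logic all sit in the engine function, which is the split of responsibilities the text attributes to Structured Streaming.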

Structured Streaming promises a much simpler model for building real-time applications, built on the features that work best in Spark Streaming. ...
