Artificial Intelligence for Big Data
by Anand Deshpande, Manish Kumar, Albenzo Coletta, Giancarlo Zaccone
Spark Streaming
Spark is a general-purpose, in-memory, distributed computation engine. The Spark Streaming API is an extension of the core Spark library, designed for scalable, high-throughput, fault-tolerant processing of streaming (unbounded) data. Spark Streaming integrates with a variety of data sources, such as TCP network sockets, HTTP server logs, Kafka producers, and social media streams.
Streams and complex events are processed with generic operations such as map, reduce, join, and windowing. Data in motion can be analysed, aggregated, filtered, and sent to downstream applications, persistent storage, or live dashboards. Machine learning and graph processing algorithms and APIs can be applied to ...
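To make the idea of map, reduce, and windowing over unbounded data a little more concrete, here is a minimal sketch in plain Python. It does not use the Spark API. It only imitates the micro-batch idea: the stream arrives as a sequence of small batches, each batch is mapped to words and reduced to counts, and the counts are combined over a sliding window of recent batches. The function name `windowed_word_counts` and its parameters `window_size` and `slide` are hypothetical, chosen for this illustration.

```python
from collections import Counter, deque


def windowed_word_counts(batches, window_size=3, slide=1):
    """Yield word counts over a sliding window of micro-batches.

    batches     -- iterable of batches; each batch is a list of text lines
    window_size -- number of most recent batches covered by each result
    slide       -- emit a result every `slide` batches
    (All names here are illustrative assumptions, not Spark identifiers.)
    """
    # deque with maxlen drops the oldest batch automatically
    window = deque(maxlen=window_size)
    for i, batch in enumerate(batches, start=1):
        # "map" step: split lines into words; "reduce" step: count them
        counts = Counter(word for line in batch for word in line.split())
        window.append(counts)
        if i % slide == 0:
            # combine the per-batch counts currently inside the window
            total = Counter()
            for batch_counts in window:
                total.update(batch_counts)
            yield dict(total)


if __name__ == "__main__":
    # A finite list stands in for an unbounded stream of micro-batches.
    stream = [["spark streaming"], ["spark kafka"], ["kafka kafka"]]
    for result in windowed_word_counts(stream, window_size=2):
        print(result)
```

Because the function is a generator, it works on any iterable of batches, including one that never ends, and it keeps only `window_size` batches in memory at a time.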