Analyze data in real time using the Apache Spark Streaming API
About This Video
Understand the concepts and problems behind stream processing.
Get to know the Apache Spark Streaming API and create jobs that analyze data in near real time.
Use Spark Streaming to analyze traffic on a web page in real time while users are browsing it.
Spark performs big data processing in the MapReduce paradigm very rapidly because it processes data in memory and avoids extensive I/O operations.
Recently, the streaming approach to processing events in near real time has become more widely adopted and more necessary. In this course, you will learn how to handle large volumes of unbounded streams of data, analyze that data, and draw conclusions from it. We will also look at common problems in processing event streams: sorting, watermarks, deduplication, and keeping state (for example, user sessions). Finally, you will implement stream processing using Spark Streaming and analyze traffic on a web page in real time.
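To give a feel for two of the problems named above, watermarks and deduplication, here is a minimal sketch in plain Python. It does not use the Spark Streaming API and is not taken from the course; the names `Event`, `deduplicate`, and `allowed_lateness` are hypothetical and chosen only for illustration. It assumes each event carries an identifier and an event time in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str    # hypothetical unique identifier of the event
    event_time: int  # event time in seconds (assumed unit)


def deduplicate(events, allowed_lateness):
    """Drop duplicate and too-late events from a stream, in arrival order.

    The watermark is the largest event time seen so far minus
    allowed_lateness. Events older than the watermark are discarded,
    and remembered ids older than the watermark are evicted so that
    state stays bounded on an unbounded stream.
    """
    max_event_time = None
    seen = {}      # event_id -> event_time, the deduplication state
    accepted = []
    for event in events:
        if max_event_time is None or event.event_time > max_event_time:
            max_event_time = event.event_time
        watermark = max_event_time - allowed_lateness

        # Evict state that the watermark has already passed.
        seen = {eid: t for eid, t in seen.items() if t >= watermark}

        if event.event_time < watermark:
            continue  # too late: arrived after the watermark passed it
        if event.event_id in seen:
            continue  # duplicate within the lateness window
        seen[event.event_id] = event.event_time
        accepted.append(event)
    return accepted


sample = [
    Event("a", 10),
    Event("b", 12),
    Event("a", 10),  # duplicate of the first event
    Event("c", 30),  # advances the watermark to 25
    Event("b", 12),  # older than the watermark, dropped as late
    Event("d", 29),  # out of order, but still within the lateness window
]
print([e.event_id for e in deduplicate(sample, allowed_lateness=5)])
# prints ['a', 'b', 'c', 'd']
```

The trade-off in this sketch is the one any streaming engine faces: a larger `allowed_lateness` tolerates more out-of-order data but keeps more state in memory.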
Table of contents
- Chapter 1 : Understanding Spark Streaming
- Chapter 2 : Implementing Stream Processing
- Chapter 3 : Implementing Transformations and Processing Logic
- Title: Real Time Streaming using Apache Spark Streaming
- Release date: June 2017
- Publisher(s): Packt Publishing
- ISBN: 9781788391528