on-demand course

Data Stream Development with Apache Spark, Kafka, and Spring Boot

with Anghel Leonard

November 2018

Beginner to intermediate

7h 51m

English

Packt Publishing

Closed Captioning available in English

Course outline

The Course Overview
6m 22s
Discovering the Data Streaming Pipeline Blueprint Architecture
17m 37s
Analyzing Meetup RSVPs in Real-Time
5m 59s
Running the Collection Tier (Part I – Collecting Data)
20m 40s
Collecting Data Via the Stream Pattern and Spring WebSocketClient API
6m 51s
Explaining the Message Queuing Tier Role
6m 19s
Introducing Our Message Queuing Tier – Apache Kafka
24m 58s
Running the Collection Tier (Part II – Sending Data)
14m 15s
Dissecting the
18m 18s
Introducing Our Data Access Tier – MongoDB
11m 13s
Exploring Spring Reactive
24m 48s
Exposing the Data Access Tier in Browser
9m 46s
Diving into the Analysis Tier
19m 9s
Streaming Algorithms For Data Analysis
29m 14s
Introducing Our Analysis Tier – Apache Spark
18m 19s
Plug-in Spark Analysis Tier to Our Pipeline
9m 48s
Brief Overview of Spark RDDs
25m 7s
Spark Streaming
28m 37s
DataFrames, Datasets and Spark SQL
22m 14s
Spark Structured Streaming
32m 37s
Machine Learning in 7 Steps
20m 51s
MLlib (Spark ML)
25m 18s
Spark ML and Structured Streaming
23m 46s
Spark GraphX
6m 41s
Fault Tolerance (HML)
28m 0s
Kafka Connect
4m 19s
Securing Communication between Tiers
10m 19s

Overview

In this course of nearly eight hours (7h 51m), you will learn to develop a robust data streaming pipeline using Apache Spark, Kafka, and Spring Boot. Through hands-on coding sessions, you will learn how to integrate these technologies to process real-time data.

What I will be able to do after this course

Develop a complete blueprint for end-to-end data streaming pipelines.
Implement robust data collection and queuing systems for real-time processing.
Integrate data access tiers with MongoDB for powerful storage solutions.
Utilize Apache Spark for efficient and scalable data analyses.
Build fault-tolerant systems that minimize data loss in streaming applications.

Course Instructor(s)

Anghel Leonard, a seasoned software engineer and author, has years of expertise in Java development and real-time systems. His teaching style emphasizes clarity and hands-on application, making advanced concepts accessible for learners. With a passion for software optimization, Anghel delivers deep insights and practical skills in this course.

Who is it for?

This course is ideal for Java developers and software architects looking to expand their expertise into data streaming technologies. If you are keen to enhance your skills in working with real-time data and aim to design efficient streaming pipelines, this course is well-suited for you. Prior experience with the Spring Framework is beneficial but not mandatory, as the course provides clear guidance throughout.

Publisher Resources

ISBN: 9781789539585
