Video description
Today, organizations struggle to work with large numbers of datasets, and data often has to be processed and analyzed in real time to yield insights. This is where data streaming comes in. Big data is no longer a niche topic, so the ability to architect and develop robust data streaming pipelines is a must for developers. They also need to think about the entire pipeline, including the trade-offs for every tier.
This course starts by explaining the blueprint architecture for developing a completely functional data streaming pipeline and installing the technologies used. With the help of live coding sessions, you will get hands-on with architecting every tier of the pipeline. You will also handle specific issues encountered working with streaming data. You will input a live data stream of Meetup RSVPs that will be analyzed and displayed via Google Maps.
By the end of the course, you will have built an efficient data streaming pipeline and will be able to analyze its various tiers, ensuring a continuous flow of data.
Please note: The meetup.com link is no longer functional. Here are a few alternatives for free WebSocket streaming endpoints:
Finnhub offers a free WebSocket endpoint for streaming real-time stock market data. https://finnhub.io/docs/api/websocket-trades
EOD Historical Data also offers a free WebSocket endpoint for streaming real-time stock market data. https://github.com/EodHistoricalData
PieSocket offers a free tier of WebSockets that allows you to connect up to 100 clients. https://www.piesocket.com/blog/tag/realtime
Socket.IO is an open-source library for real-time, bidirectional communication that you can host yourself. https://socket.io/
They can be used by adapting the course code. The principles are the same.
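As a rough illustration of what such an adaptation might look like, here is a minimal sketch of a collection-tier client built on the JDK's `java.net.http.WebSocket` API (Java 11+). It is not taken from the course code: the `StreamCollector` class, its `accept` and `connect` methods, and the `sink` callback are hypothetical names chosen for this example. The listener reassembles text frames into complete messages and hands each one to the sink, which in a pipeline like the one described here could forward it to the message queuing tier.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Consumer;

// Hypothetical collection-tier client; an illustration, not the course's code.
class StreamCollector implements WebSocket.Listener {

    private final StringBuilder buffer = new StringBuilder();
    private final Consumer<String> sink;

    StreamCollector(Consumer<String> sink) {
        this.sink = sink;
    }

    // Appends one text fragment; when it is the last fragment of a message,
    // passes the complete message to the sink and resets the buffer.
    void accept(CharSequence data, boolean last) {
        buffer.append(data);
        if (last) {
            String message = buffer.toString();
            buffer.setLength(0);
            sink.accept(message);
        }
    }

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        accept(data, last);
        webSocket.request(1); // ask the connection to deliver the next frame
        return null;          // the fragment has been copied, so it can be released
    }

    // Opens the connection and sends a single subscription message.
    // Both the URL and the message format depend on the provider you choose.
    static CompletableFuture<WebSocket> connect(String url, String subscribeJson,
                                                Consumer<String> sink) {
        return HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(URI.create(url), new StreamCollector(sink))
                .thenCompose(ws -> ws.sendText(subscribeJson, true));
    }
}
```

With Finnhub, for instance, the call might look like `StreamCollector.connect("wss://ws.finnhub.io?token=YOUR_TOKEN", "{\"type\":\"subscribe\",\"symbol\":\"AAPL\"}", System.out::println)`. The URL and subscription message here are assumptions based on Finnhub's public documentation and should be checked against the provider's current docs before use.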
What You Will Learn
- Attain a solid foundation in the most powerful and versatile technologies involved in data streaming: Apache Spark and Apache Kafka
- Form a robust and clean architecture for a data streaming pipeline
- Implement the correct tools to bring your data streaming architecture to life
- Isolate the most problematic trade-off for each tier involved in a data streaming pipeline
- Query, analyze, and apply machine learning algorithms to collected data
- Display analyzed pipeline data via Google Maps on your web browser
- Discover and resolve difficulties in scaling and securing data streaming applications
Audience
This course is perfect for Java developers and architects who want to design and write data streaming pipelines. Having knowledge of the Spring framework will be an added benefit.
About The Author
Anghel Leonard is a Chief Technology Strategist and independent consultant with 20+ years of experience in the Java ecosystem. In his daily work, he focuses on architecting and developing Java distributed applications that empower robust architectures, clean code, and high performance. He is also passionate about coaching, mentoring, and technical leadership. He is the author of several books, videos, and dozens of articles related to Java technologies.
Table of contents
- Chapter 1: Introducing Data Streaming Architecture
- Chapter 2: Deployment of Collection and Message Queuing Tiers
- Chapter 3: Proceeding to the Data Access Tier
- Chapter 4: Implementing the Analysis Tier
  - Diving into the Analysis Tier
  - Streaming Algorithms For Data Analysis
  - Introducing Our Analysis Tier – Apache Spark
  - Plug-in Spark Analysis Tier to Our Pipeline
  - Brief Overview of Spark RDDs
  - Spark Streaming
  - DataFrames, Datasets and Spark SQL
  - Spark Structured Streaming
  - Machine Learning in 7 Steps
  - MLlib (Spark ML)
  - Spark ML and Structured Streaming
  - Spark GraphX
- Chapter 5: Mitigate Data Loss between Collection, Analysis and Message Queuing Tiers
Product information
- Title: Data Stream Development with Apache Spark, Kafka, and Spring Boot
- Author(s): Anghel Leonard
- Release date: November 2018
- Publisher(s): Packt Publishing
- ISBN: 9781789539585