Table of contents
- Foreword
- Preface
- I. Fundamentals of Stream Processing with Apache Spark
- 1. Introducing Stream Processing
- 2. Stream-Processing Model
- 3. Streaming Architectures
- 4. Apache Spark as a Stream-Processing Engine
- 5. Spark's Distributed Processing Model
- Running Apache Spark with a Cluster Manager
- Spark's Own Cluster Manager
- Understanding Resilience and Fault Tolerance in a Distributed System
- Data Delivery Semantics
- Microbatching and One-Element-at-a-Time
- Bringing Microbatch and One-Record-at-a-Time Closer Together
- Dynamic Batch Interval
- Structured Streaming Processing Model
- 6. Spark's Resilience Model
- A. References for Part I
- II. Structured Streaming
- 7. Introducing Structured Streaming
- 8. The Structured Streaming Programming Model
- 9. Structured Streaming in Action
- 10. Structured Streaming Sources
- 11. Structured Streaming Sinks
- 12. Event Time–Based Stream Processing
- 13. Advanced Stateful Operations
- 14. Monitoring Structured Streaming Applications
- 15. Experimental Areas: Continuous Processing and Machine Learning
- B. References for Part II
- III. Spark Streaming
- 16. Introducing Spark Streaming
- 17. The Spark Streaming Programming Model
- 18. The Spark Streaming Execution Model
- 19. Spark Streaming Sources
- 20. Spark Streaming Sinks
- 21. Time-Based Stream Processing
- 22. Arbitrary Stateful Streaming Computation
- 23. Working with Spark SQL
- 24. Checkpointing
- 25. Monitoring Spark Streaming
- 26. Performance Tuning
- C. References for Part III
- IV. Advanced Spark Streaming Techniques
- 27. Streaming Approximation and Sampling Algorithms
- Exactness, Real Time, and Big Data
- The Exactness, Real-Time, and Big Data triangle
- Approximation Algorithms
- Hashing and Sketching: An Introduction
- Counting Distinct Elements: HyperLogLog
- Counting Element Frequency: Count Min Sketches
- Ranks and Quantiles: T-Digest
- Reducing the Number of Elements: Sampling
- 28. Real-Time Machine Learning
- D. References for Part IV
- V. Beyond Apache Spark
- 29. Other Distributed Real-Time Stream Processing Systems
- 30. Looking Ahead
- E. References for Part V
- Index
Product information
- Title: Stream Processing with Apache Spark
- Author(s):
- Release date:
- Publisher(s): O'Reilly Media, Inc.
- ISBN: None