Learning Apache Apex
by Ananth Gundabattula, Thomas Weise, Munagala V. Ramanath, David Yan, Kenneth Knowles
State management
Managing the state of a streaming pipeline in a way that scales (including dynamically) and guarantees accurate results (even in the event of failures) is one of the key challenges in building a streaming platform. Chapter 5, Fault Tolerance and Reliability, discussed how Apex manages state, and Chapter 4, Scalability, Low Latency, and Performance, covered how it also supports elasticity (dynamic scaling). Apex has been a pioneer in this area. Apache Flink takes a quite similar approach to checkpointing and has recently built interesting capabilities that use its distributed state snapshots for pipeline upgrades (for a good primer on state management in Flink, refer to http://www.vldb.org/pvldb/vol10/p1718-carbone.pdf ...