Learning Apache Apex
by Ananth Gundabattula, Thomas Weise, Munagala V. Ramanath, David Yan, Kenneth Knowles
Processing guarantees
By default, the Apex engine guarantees that data is processed at-least-once and that state updates within the DAG occur exactly-once. For state mutation through interaction with external systems, the results depend on the connector (refer to Chapter 3, The Apex Library). Connectors that support exactly-once results include Files, Kafka, JDBC, Cassandra, and all others where the write operations are, or can be made, idempotent. We will look at an example application in the next section.
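To make the role of idempotent writes concrete, here is a minimal sketch in plain Java. It does not use the Apex API: `KeyedStore`, `Event`, and `writeWindow` are hypothetical names introduced only for illustration, and the in-memory map stands in for an external system. It assumes each record carries a deterministic id, so a record replayed after recovery overwrites the row it wrote before instead of adding a new one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IdempotentSinkSketch {

    /** Hypothetical stand-in for an external store, such as a keyed table. */
    static final class KeyedStore {
        private final Map<String, Long> rows = new HashMap<>();
        private int writeCalls = 0;

        /** Upsert by record id: repeating the same write leaves the store unchanged. */
        void upsert(String recordId, long value) {
            writeCalls++;
            rows.put(recordId, value);
        }

        int size() {
            return rows.size();
        }

        long sum() {
            long total = 0;
            for (long v : rows.values()) {
                total += v;
            }
            return total;
        }

        int writeCalls() {
            return writeCalls;
        }
    }

    /** Hypothetical record type with a deterministic id (assumed, not an Apex class). */
    static final class Event {
        final String id;
        final long value;

        Event(String id, long value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Writes every event of one window to the store. */
    static void writeWindow(KeyedStore store, List<Event> window) {
        for (Event e : window) {
            store.upsert(e.id, e.value);
        }
    }

    public static void main(String[] args) {
        KeyedStore store = new KeyedStore();
        List<Event> window = List.of(new Event("w7-0", 10L), new Event("w7-1", 20L));

        writeWindow(store, window); // first attempt
        writeWindow(store, window); // same window replayed after a simulated failure

        // The data was processed twice (4 write calls), but the stored result
        // is the same as if it had been written once.
        System.out.println(store.size() + " rows, sum=" + store.sum()
                + ", writes=" + store.writeCalls());
        // prints "2 rows, sum=30, writes=4"
    }
}
```

The design choice is that the sink, not the engine, absorbs duplicates: at-least-once delivery combined with a write that is safe to repeat yields an exactly-once result in the store. An append-only write with no key would instead record the replayed window twice.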
In distributed systems, a guarantee of exactly-once processing is not really possible, since nodes may go down at any time, and when they are restored, some reprocessing of prior data, however minimal, must occur in order ...