Chapter 17. Micro-batch stream processing: Illustration

This chapter covers

  • Trident, Apache Storm’s micro-batch-processing API
  • Integrating Kafka, Trident, and Cassandra
  • Fault-tolerant task local state

In the last chapter you learned the core concepts of micro-batch processing. By processing tuples in a series of small batches, you can achieve exactly-once processing semantics. By maintaining a strong ordering on the processing of batches and storing the batch ID with your state, you can tell whether a batch has already been processed. This lets you avoid applying any update more than once, which is what gives you exactly-once semantics.
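
To make the batch-ID idea concrete, here is a minimal sketch in plain Java. It is an illustration under stated assumptions, not Trident's API: the class name ExactlyOnceCounter and its methods are hypothetical, an in-memory map stands in for an external database, and batches are assumed to arrive in strictly increasing batch-ID order, with a replayed batch reusing its original ID and contents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of storing a batch ID alongside each value.
// Not part of Storm or Trident; the in-memory map stands in for a database.
class ExactlyOnceCounter {

    // A stored count together with the ID of the last batch that updated it.
    private static final class Entry {
        long count;
        long lastBatchId;

        Entry(long count, long lastBatchId) {
            this.count = count;
            this.lastBatchId = lastBatchId;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Adds one batch's partial counts to the stored counts.
    // Assumption: batch IDs only increase, and a replay reuses the same ID.
    void applyBatch(long batchId, Map<String, Long> partialCounts) {
        for (Map.Entry<String, Long> update : partialCounts.entrySet()) {
            Entry entry = store.get(update.getKey());
            if (entry == null) {
                // First time this key is seen: record value and batch ID together.
                store.put(update.getKey(), new Entry(update.getValue(), batchId));
            } else if (entry.lastBatchId < batchId) {
                // A newer batch: apply the update and remember which batch did it.
                entry.count += update.getValue();
                entry.lastBatchId = batchId;
            }
            // Otherwise this batch already updated the key, so the update is skipped.
        }
    }

    // Returns the stored count for a key, or 0 if the key has never been updated.
    long get(String key) {
        Entry entry = store.get(key);
        return entry == null ? 0L : entry.count;
    }

    public static void main(String[] args) {
        ExactlyOnceCounter counter = new ExactlyOnceCounter();
        Map<String, Long> batchOne = new HashMap<>();
        batchOne.put("a", 2L);
        batchOne.put("b", 1L);

        counter.applyBatch(1L, batchOne);
        counter.applyBatch(1L, batchOne); // replay of batch 1 is ignored

        Map<String, Long> batchTwo = new HashMap<>();
        batchTwo.put("a", 3L);
        counter.applyBatch(2L, batchTwo);

        System.out.println("a=" + counter.get("a") + ", b=" + counter.get("b"));
    }
}
```

Because the batch ID is kept per key, a batch that failed partway through can be replayed safely: keys it already updated are skipped, and the remaining keys are brought up to date. In a real store, the value and its batch ID would need to be written together atomically for this to hold.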

You saw how with some minor extensions pipe diagrams could be used to represent ...
