Big Data
by James Warren, Nathan Marz
April 2015
Content level: Beginner to intermediate
328 pages
11h 1m
English
Manning Publications

Chapter 17. Micro-batch stream processing: Illustration

This chapter covers

  • Trident, Apache Storm’s micro-batch-processing API
  • Integrating Kafka, Trident, and Cassandra
  • Fault-tolerant task local state

In the last chapter you learned the core concepts of micro-batch processing. By processing tuples in a series of small batches, you can achieve exactly-once processing semantics. If you maintain a strong ordering on the processing of batches and store each batch's ID alongside your state, you can tell whether a given batch has already been applied. That lets you skip a replayed batch rather than apply its updates a second time.
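
The batch-ID check described above can be sketched in plain Java with no Storm dependency. This is only an illustration under stated assumptions: the class name ExactlyOnceCounts, its methods, and the in-memory HashMap standing in for a database are all hypothetical, and Trident's own state API is not shown here. The sketch also assumes batches arrive in strict order, so the only batch that can be replayed is the most recent one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of batch-ID bookkeeping; not part of the Trident API. */
class ExactlyOnceCounts {

    /** A count stored together with the ID of the last batch applied to it. */
    private static final class Versioned {
        long count = 0;
        long lastBatchId = -1; // -1 means no batch has been applied yet
    }

    // Stand-in for an external store such as a database (assumption).
    private final Map<String, Versioned> store = new HashMap<>();

    /** Applies one batch of keys, skipping any key this batch already updated. */
    void applyBatch(long batchId, List<String> keys) {
        // Aggregate within the batch first so each key is written once per batch.
        Map<String, Long> partial = new HashMap<>();
        for (String key : keys) {
            partial.merge(key, 1L, Long::sum);
        }
        for (Map.Entry<String, Long> e : partial.entrySet()) {
            Versioned v = store.computeIfAbsent(e.getKey(), k -> new Versioned());
            if (v.lastBatchId == batchId) {
                continue; // stored ID matches: this batch was already applied
            }
            v.count += e.getValue();
            v.lastBatchId = batchId; // record the batch ID with the state
        }
    }

    /** Returns the current count for a key, or 0 if the key was never seen. */
    long count(String key) {
        Versioned v = store.get(key);
        return v == null ? 0 : v.count;
    }

    public static void main(String[] args) {
        ExactlyOnceCounts counts = new ExactlyOnceCounts();
        counts.applyBatch(1, Arrays.asList("a", "b", "a"));
        counts.applyBatch(1, Arrays.asList("a", "b", "a")); // replay after a failure
        counts.applyBatch(2, Arrays.asList("a"));
        System.out.println("a=" + counts.count("a") + ", b=" + counts.count("b"));
    }
}
```

Running main applies batch 1, replays it, and then applies batch 2; the replay leaves the counts unchanged, so the program prints a=3, b=1. Comparing for equality is enough only because of the strict ordering assumption; a store that could see older batches again would need a stronger check.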

You saw how, with some minor extensions, pipe diagrams could be used to represent ...


ISBN: 9781617290343