O'Reilly logo

Storm Blueprints: Patterns for Distributed Real-time Computation by Brian O'Neill, P. Taylor Goetz

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Understanding stream groupings

Based on the previous example, you may wonder why we did not bother increasing the parallelism of ReportBolt. The answer is that it does not make any sense to do so. To understand why, you need to understand the concept of stream groupings in Storm.

A stream grouping defines how a stream's tuples are distributed among bolt tasks in a topology. For example, in the parallelized version of the word count topology, the SplitSentenceBolt class was assigned four tasks in the topology. The stream grouping determines which one of those tasks will receive a given tuple.

Storm defines seven built-in stream groupings:

  • Shuffle grouping: This randomly distributes tuples across the target bolt's tasks such that each bolt receives ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required