Configuring the parallelism of a topology

A Storm topology is made up of several components. Its throughput (processing speed) depends on how many instances of each component run in parallel; this is known as the parallelism of the topology. Let's first look at the processes and components that provide parallelism in a Storm cluster.

The worker process

A Storm topology is executed across multiple nodes in the Storm cluster. Each node in the cluster can run one or more JVMs, called worker processes, each of which is responsible for processing a part of the topology.
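To make the relationship between nodes and worker processes concrete, here is a minimal sketch in plain Java. It is a toy model, not Storm's scheduler and not part of the Storm API: the class name WorkerPlacementSketch, the method assignWorkers, the node names, and the round-robin placement policy are all assumptions made for illustration. It only shows the idea that one topology's worker processes are spread over the nodes of the cluster, and that a single node may end up running more than one of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Storm's scheduler or API.
class WorkerPlacementSketch {

    // Spreads `numWorkers` worker processes of one topology across the given
    // nodes in round-robin order. Round-robin is an assumed policy, chosen
    // for simplicity; a real cluster may place workers differently.
    static Map<String, List<String>> assignWorkers(
            String topology, int numWorkers, List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        // LinkedHashMap keeps the nodes in the order they were supplied.
        Map<String, List<String>> placement = new LinkedHashMap<>();
        for (String node : nodes) {
            placement.put(node, new ArrayList<>());
        }
        for (int i = 0; i < numWorkers; i++) {
            String node = nodes.get(i % nodes.size());
            // Each worker is labeled with the topology it processes a part of.
            placement.get(node).add(topology + "-worker-" + i);
        }
        return placement;
    }

    public static void main(String[] args) {
        // Three worker processes for one topology on a two-node cluster.
        Map<String, List<String>> placement =
                assignWorkers("word-count", 3, Arrays.asList("node-1", "node-2"));
        placement.forEach((node, workers) ->
                System.out.println(node + " runs " + workers));
    }
}
```

With three workers and two nodes, the first node in this model hosts two worker processes and the second hosts one, which mirrors the statement that a node can run one or more worker JVMs.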

A Storm cluster can run multiple topologies at the same time. A worker process is bound to one of these topologies and can execute multiple ...
