Creating your first topology

Now, we'll create a Storm topology that breaks sentences into words and then counts the number of occurrences of each word. Implementing this topology in Storm requires the following components:

  • Sentence spout (randomsentence.py): A topology always begins with a spout; that's how data gets into Storm. The sentence spout will emit an infinite stream of sentences.
  • Splitter bolt (splitsentence.py): This receives sentences and splits them into words.
  • Word count bolt (wordcount.py): This receives words and counts the occurrences. For each word processed, it emits the word along with its running count.
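
The three components above can be sketched in plain Python before any Storm code is written. This is only a simulation of the data flow using generators, not the contents of randomsentence.py, splitsentence.py, or wordcount.py, and it uses no Storm API. The sample sentences and the names sentence_spout, splitter_bolt, word_count_bolt, and run_topology are assumptions made for illustration.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical sample data; the real spout may use different sentences.
SENTENCES = [
    "the cow jumped over the moon",
    "an apple a day keeps the doctor away",
]


def sentence_spout(rng):
    """Emit an endless stream of sentences, standing in for the spout."""
    while True:
        yield rng.choice(SENTENCES)


def splitter_bolt(sentences):
    """Receive sentences and emit one word at a time."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def word_count_bolt(words):
    """Receive words and emit (word, running count) for each one."""
    counts = defaultdict(int)
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def run_topology(limit, seed=0):
    """Chain spout -> splitter -> counter and take the first `limit` outputs."""
    rng = random.Random(seed)
    stream = word_count_bolt(splitter_bolt(sentence_spout(rng)))
    return list(itertools.islice(stream, limit))


if __name__ == "__main__":
    for word, count in run_topology(10):
        print(word, count)
```

Because the spout never stops, run_topology cuts the stream off after a fixed number of outputs. In a real topology, Storm runs each component as a separate process and moves tuples between them, so the chaining done here by nested generators is handled by the framework instead.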

The following figure shows how data flows through the topology:

Word count topology

Now that we've seen the basic data flow, ...


O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.