Chapter 5. Bolts
As you have seen, bolts are key components in a Storm cluster. In this chapter, you’ll look at a bolt’s life cycle, some strategies for bolt design, and some examples of how to implement them.
A bolt is a component that takes tuples as input and produces tuples
as output. When writing a bolt, you will usually implement the
IRichBolt interface. Bolts are created on the
client machine, serialized into the topology, and submitted to the master
machine of the cluster. The cluster launches workers that deserialize the
bolt, call prepare on it, and then
start processing tuples.
To customize a bolt, you should set parameters in its constructor and save them as instance variables so they will be serialized when submitting the bolt to the cluster.
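The effect of serialization on constructor parameters can be tried without a cluster. The following is a minimal sketch, not Storm code: it assumes plain Java serialization as a stand-in for the client-to-worker hand-off, and the names PrefixBolt and BoltSerializationSketch are hypothetical. A real bolt would implement IRichBolt and receive Storm's own arguments in prepare.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of Storm: it imitates the client-to-worker
// hand-off with plain Java serialization.
class BoltSerializationSketch {
    // Serialize the object to bytes and read it back, roughly as a worker would.
    static <T extends Serializable> T roundTrip(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        PrefixBolt onClient = new PrefixBolt("word:"); // configured on the client
        PrefixBolt onWorker = roundTrip(onClient);     // stand-in for submission
        onWorker.prepare();                            // worker-side initialization
        System.out.println(onWorker.execute("storm")); // prints "word:storm"
    }
}

// Hypothetical bolt-like class (assumption: simplified method signatures).
class PrefixBolt implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String prefix;            // set in the constructor, serialized
    private transient StringBuilder buffer; // runtime state, rebuilt in prepare()

    PrefixBolt(String prefix) {
        this.prefix = prefix;
    }

    // Called once after deserialization, before any input is processed.
    void prepare() {
        buffer = new StringBuilder();
    }

    // Processes one input value using the configured prefix.
    String execute(String input) {
        buffer.setLength(0);
        return buffer.append(prefix).append(input).toString();
    }
}
```

The prefix field survives the round trip because it is an ordinary instance variable. The buffer field is marked transient, so it arrives as null and has to be created in prepare. The same split between configuration set in the constructor and runtime state created in prepare is a reasonable pattern for real bolts.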
Bolts have the following methods:
declareOutputFields(OutputFieldsDeclarer declarer)
Declare the output schema for this bolt
prepare(java.util.Map stormConf, TopologyContext context, OutputCollector collector)
Called just before the bolt starts processing tuples
execute(Tuple input)
Process a single tuple of input
cleanup()
Called when a bolt is going to shut down
Take a look at an example of a bolt that will split sentences into words: