Chapter 5. Bolts
As you have seen, bolts are key components in a Storm cluster. In this chapter, you’ll look at a bolt’s life cycle, some strategies for bolt design, and some examples of how to implement them.
Bolt Lifecycle
A bolt is a component that takes tuples as input and produces tuples as output. When writing a bolt, you will usually implement the IRichBolt interface. Bolts are created on the client machine, serialized into the topology, and submitted to the master machine of the cluster. The cluster launches workers that deserialize the bolt, call prepare on it, and then start processing tuples.
Tip
To customize a bolt, set its parameters in the constructor and save them as instance variables, so that they are serialized along with the bolt when it is submitted to the cluster.
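The effect of this tip can be illustrated without a cluster. The following is a minimal sketch, assuming plain Java serialization and a hypothetical class named PrefixBoltSketch that does not depend on Storm: the constructor parameter survives the round trip, while state built in prepare must be rebuilt on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical bolt-like class for illustration; it is not a Storm type.
class PrefixBoltSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Set in the constructor on the client, so it is part of the serialized state.
    private final String prefix;

    // Built in prepare() on the worker; transient, so it is not serialized.
    private transient StringBuilder buffer;

    PrefixBoltSketch(String prefix) {
        this.prefix = prefix;
    }

    // Stand-in for a bolt's prepare(): create per-worker state here.
    void prepare() {
        buffer = new StringBuilder();
    }

    // Stand-in for execute(): uses the configured prefix and the worker state.
    String process(String input) {
        buffer.setLength(0);
        return buffer.append(prefix).append(input).toString();
    }

    boolean isPrepared() {
        return buffer != null;
    }

    // Serializes the object to bytes and reads it back, mimicking the
    // client-to-worker hand-off.
    static PrefixBoltSketch roundTrip(PrefixBoltSketch bolt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bolt);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PrefixBoltSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

After roundTrip, the copy still holds the prefix passed to the constructor, but isPrepared() returns false until prepare() is called. This is why resources that cannot or should not be serialized are usually created in prepare rather than in the constructor.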
Bolt Structure
Bolts have the following methods:
declareOutputFields(OutputFieldsDeclarer declarer)
Declares the output schema for this bolt.
prepare(java.util.Map stormConf, TopologyContext context, OutputCollector collector)
Called just before the bolt starts processing tuples.
execute(Tuple input)
Processes a single input tuple.
cleanup()
Called when the bolt is about to shut down.
Take a look at an example of a bolt that will split sentences into words:
class SplitSentence implements IRichBolt {
    private OutputCollector collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void execute(Tuple tuple) {
        String sentence = tuple.getString(0);
        for (String
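The SplitSentence listing breaks off inside its for loop. The following is a minimal sketch of how the splitting logic might be completed, not the book's own code. To compile without the Storm jar it uses two hypothetical stand-in interfaces, SketchTuple and SketchCollector, in place of Storm's Tuple and OutputCollector; splitting on whitespace and emitting one value per word are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Storm's Tuple; only the one method used here.
interface SketchTuple {
    String getString(int index);
}

// Hypothetical stand-in for Storm's OutputCollector.
interface SketchCollector {
    void emit(List<Object> values);
}

class SplitSentenceSketch {
    private SketchCollector collector;

    // Mirrors prepare(): keep the collector so execute() can emit with it.
    public void prepare(SketchCollector collector) {
        this.collector = collector;
    }

    // Mirrors execute(): read field 0 as the sentence and emit one
    // single-value tuple per word.
    public void execute(SketchTuple tuple) {
        String sentence = tuple.getString(0);
        // Assumption: words are separated by runs of whitespace.
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                List<Object> values = new ArrayList<>();
                values.add(word);
                collector.emit(values);
            }
        }
    }
}
```

In a real Storm bolt, the emit would typically be written as collector.emit(new Values(word)), and declareOutputFields would declare a matching single output field; the name of that field is not visible in the truncated listing.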