Implementing the design
Let's first examine the real-time portion of the system, from the spout through to persistence in Druid. The topology is straightforward and mimics the topologies we wrote in previous chapters.
The following are the critical lines of the topology:
TwitterSpout spout = new TwitterSpout();
Stream inputStream = topology.newStream("nlp", spout);
try {
    inputStream.each(new Fields("tweet"), new TweetSplitterFunction(), new Fields("word"))
        .each(new Fields("searchphrase", "tweet", "word"), new WordFrequencyFunction(), new Fields("baseline"))
        .each(new Fields("searchphrase", "tweet", "word", "baseline"), new PersistenceFunction(), new Fields())
        .partitionPersist(new DruidStateFactory(), new Fields("searchphrase", "tweet", ...
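To make the field flow in the first each call easier to follow, here is a minimal plain-Java sketch of how a Trident function extends a tuple: the function reads the selected input field ("tweet") and, for every value it emits, the output field ("word") is appended to a copy of the incoming tuple. It uses no Storm classes. The class EachSemanticsSketch, its methods, and the whitespace-based splitting are hypothetical stand-ins, not the book's TweetSplitterFunction, and the assumption that the spout emits "searchphrase" and "tweet" is inferred from the field lists in the topology.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Storm classes are involved.
class EachSemanticsSketch {

    // Hypothetical stand-in for TweetSplitterFunction:
    // yields one "word" value per whitespace-separated token (an assumption).
    static List<String> splitTweet(String tweet) {
        List<String> words = new ArrayList<>();
        for (String token : tweet.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Mimics each(new Fields("tweet"), splitter, new Fields("word")):
    // each emitted value is appended to a copy of the input tuple,
    // so one input tuple can produce zero or more output tuples.
    static List<Map<String, Object>> eachSplit(Map<String, Object> tuple) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (String word : splitTweet((String) tuple.get("tweet"))) {
            Map<String, Object> extended = new LinkedHashMap<>(tuple);
            extended.put("word", word);
            out.add(extended);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("searchphrase", "storm");            // assumed spout field
        tuple.put("tweet", "Storm makes streams easy"); // assumed spout field
        for (Map<String, Object> t : eachSplit(tuple)) {
            System.out.println(t);
        }
    }
}
```

Because the emitted field is appended rather than substituted, the later each calls in the topology can still select "searchphrase" and "tweet" alongside "word" and "baseline".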