Integrating the batch and real-time views

Most of the work for this final step of the big data architecture is already done, and what remains is surprisingly simple, as is the case with all good functional-style designs.

How to do it…

This recipe simply extends the existing TF-IDF DRPC query that we defined in Chapter 4, Distributed Remote Procedure Calls. We need three new state sources that represent the D, DF, and TF values computed in the batch layer. We will combine the values from these states with the existing state before performing the final TF-IDF calculation.
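The combination described above can be sketched in plain Java, independent of Storm. This is a minimal illustration rather than the recipe's actual code. It assumes that the batch and real-time values are additive counts, that a missing value on either side counts as zero, and that the final score uses the common smoothed form tf * ln(D / (1 + df)). The class name TfIdfCombineSketch and its methods are hypothetical.

```java
// Illustrative sketch only: the class and method names are hypothetical
// and are not taken from the recipe.
final class TfIdfCombineSketch {

    // Assumption: batch and real-time values are additive counts, so the
    // merged value is a null-safe sum (a missing view contributes zero).
    static long combine(Long batchValue, Long realTimeValue) {
        long batch = (batchValue == null) ? 0L : batchValue;
        long realTime = (realTimeValue == null) ? 0L : realTimeValue;
        return batch + realTime;
    }

    // Assumption: tf * ln(D / (1 + df)), one common smoothed TF-IDF variant.
    static double tfIdf(long tf, long d, long df) {
        return tf * Math.log((double) d / (1.0 + df));
    }

    public static void main(String[] args) {
        long d = combine(1000L, 24L);  // total documents: batch + real-time
        long df = combine(90L, null);  // term not yet seen by the real-time view
        long tf = combine(3L, 2L);     // term occurrences in the document
        System.out.println(tfIdf(tf, d, df));
    }
}
```

Keeping the merge as a pure function of two inputs means that the same logic can be applied to each of the D, DF, and TF pairs before the score is computed.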

  1. Start from the inside out by creating the combination function called BatchCombiner within the storm.cookbook.tfidf.function package and implementing the logic to combine two versions of ...
