Integrating the batch and real-time views
The final step in completing the big data architecture is already largely done and is surprisingly simple, as is the case with all good functional-style designs.
How to do it…
This recipe simply extends the existing TF-IDF DRPC query that we defined in Chapter 4, Distributed Remote Procedure Calls. We need three new state sources that represent the D, DF, and TF values computed in the batch layer. We will combine the values from these states with the existing real-time state before performing the final TF-IDF calculation.
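The idea of merging the two views before the final calculation can be sketched in plain Java, without any Storm dependencies. This is only an illustration under stated assumptions: the class and method names (`CombinedTfIdf`, `combine`, `tfIdf`) are hypothetical, the batch and real-time counts are assumed to cover disjoint time windows so that they can simply be added, and the formula `tf * log(d / (1 + df))` is one common TF-IDF variant that may differ from the one used in the earlier chapter.

```java
// Hypothetical sketch: not the recipe's actual Trident function.
final class CombinedTfIdf {
    private CombinedTfIdf() {}

    // Merge a batch-layer count with a real-time count.
    // Assumption: the two views cover disjoint time windows, so they add up.
    // A view that has no value for the key (null) contributes zero.
    static long combine(Long batchValue, Long realtimeValue) {
        long batch = (batchValue == null) ? 0L : batchValue;
        long realtime = (realtimeValue == null) ? 0L : realtimeValue;
        return batch + realtime;
    }

    // Assumed TF-IDF variant: tf * ln(d / (1 + df)).
    static double tfIdf(long tf, long d, long df) {
        return tf * Math.log((double) d / (1.0 + df));
    }

    public static void main(String[] args) {
        // Illustrative numbers only; they do not come from the recipe.
        long d = combine(1000L, 24L);   // total documents seen
        long df = combine(40L, null);   // documents containing the term
        long tf = combine(7L, 3L);      // occurrences of the term in one document
        System.out.println(tfIdf(tf, d, df));
    }
}
```

In the topology itself, the same merge would run per term inside the DRPC stream, after each of the D, DF, and TF values has been read from both its batch state and its real-time state.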
- Start from the inside out by creating the combination function in the storm.cookbook.tfidf.function package and implement the logic to combine two versions of ...