Using DRPC to complete the required processing

A classic design consideration in data systems is striking the right balance between precomputation and on-the-fly computation. Precomputation is often preferable, but it isn't always possible, either because the volume of potential data is impractically large or because the final result depends on a point-in-time view of the data that cannot be computed in advance.

In the previous chapter, we emitted a constant stream of TF-IDF values based on the documents received from Twitter and the Internet. The TF-IDF value is perfectly correct at the time when it is emitted; however, as time passes the value that was emitted is potentially invalidated because it is coupled ...
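To make the staleness problem concrete, here is a minimal sketch of how a TF-IDF value computed at one moment can drift as more documents arrive. It assumes the common textbook form tf × log(N / df), which may differ from the exact formula used in the previous chapter. The function name `tf_idf` and all counts are hypothetical, chosen only for illustration.

```python
import math


def tf_idf(term_count, doc_count, docs_with_term):
    """Textbook TF-IDF: tf * log(N / df).

    Assumption: this is the common formulation, not necessarily
    the exact variant used by the topology in the previous chapter.
    """
    return term_count * math.log(doc_count / docs_with_term)


# Value emitted when the corpus held 100 documents (hypothetical counts),
# 10 of which contained the term.
emitted = tf_idf(term_count=3, doc_count=100, docs_with_term=10)

# Later the corpus has grown to 1,000 documents, but still only 10
# contain the term, so the same term now scores higher.
current = tf_idf(term_count=3, doc_count=1000, docs_with_term=10)

print(f"emitted earlier: {emitted:.3f}")
print(f"computed now:    {current:.3f}")
```

The term count for the document has not changed, yet the score has, because the corpus-wide counts moved. A query that needs the value as of "now" therefore has to be computed at request time rather than read from a previously emitted result.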

