Introducing the log analysis topology

With the means to write our log data to Kafka, we're ready to turn our attention to the implementation of a Trident topology to perform the analytical computation. The topology will perform the following operations:

  1. Receive and parse the raw JSON log event data.
  2. Extract and emit necessary fields.
  3. Update an exponentially-weighted moving average function.
  4. Determine if the moving average has crossed a specified threshold.
  5. Filter out events that do not represent a state change (a state change occurs when, for example, the rate moves above or below the threshold).
  6. Send an instant message (XMPP) notification.
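
To make steps 3 through 5 more concrete before we look at the Trident code, the following is a minimal, Storm-independent sketch in plain Java. It is not the topology's actual implementation: the class names (ThresholdSketch, Ewma, ThresholdChangeFilter), the smoothing factor alpha, and the choice to smooth sample values directly are all assumptions made for illustration. It also assumes that the stream starts in the "below threshold" state.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdSketch {

    /** Hypothetical helper: exponentially weighted moving average of raw samples. */
    static final class Ewma {
        private final double alpha;   // weight of the newest sample, 0 < alpha <= 1 (assumed)
        private double average;
        private boolean initialized;

        Ewma(double alpha) {
            if (alpha <= 0.0 || alpha > 1.0) {
                throw new IllegalArgumentException("alpha must be in (0, 1]");
            }
            this.alpha = alpha;
        }

        /** Folds one sample into the average and returns the updated value. */
        double update(double sample) {
            if (!initialized) {
                average = sample;     // the first sample seeds the average
                initialized = true;
            } else {
                average = alpha * sample + (1.0 - alpha) * average;
            }
            return average;
        }
    }

    /** Hypothetical helper: reports only transitions across the threshold. */
    static final class ThresholdChangeFilter {
        private final double threshold;
        private boolean above = false;    // assumption: we start below the threshold

        ThresholdChangeFilter(double threshold) {
            this.threshold = threshold;
        }

        /** Returns true only when the average has moved to the other side of the threshold. */
        boolean isStateChange(double average) {
            boolean nowAbove = average > threshold;
            boolean changed = nowAbove != above;
            above = nowAbove;
            return changed;
        }

        boolean isAbove() {
            return above;
        }
    }

    /**
     * Runs the samples through the average and the filter, and returns one
     * entry per state change in the form "ABOVE@index" or "BELOW@index".
     * In the real topology, each entry would trigger a notification.
     */
    static List<String> detect(double[] samples, double alpha, double threshold) {
        Ewma ewma = new Ewma(alpha);
        ThresholdChangeFilter filter = new ThresholdChangeFilter(threshold);
        List<String> changes = new ArrayList<>();
        for (int i = 0; i < samples.length; i++) {
            double average = ewma.update(samples[i]);
            if (filter.isStateChange(average)) {
                changes.add((filter.isAbove() ? "ABOVE" : "BELOW") + "@" + i);
            }
        }
        return changes;
    }
}
```

For example, calling ThresholdSketch.detect(new double[]{4, 8, 20, 20, 2, 2}, 0.5, 10.0) produces the averages 4, 6, 13, 16.5, 9.25, and 5.625, so only two of the six samples yield a state change: ABOVE@2 and BELOW@4. The remaining events are dropped, which is what keeps the notification step from firing on every log message.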

The topology is depicted in the following diagram with the Trident stream operations at the top and stream processing components at the bottom:

Kafka spout ...
