Chapter 6. Interceptors, ETL, and Routing

The final piece of functionality necessary in your data processing pipeline is the ability to inspect and transform events in flight. This can be accomplished using interceptors. Interceptors, as we discussed in Chapter 1, Overview and Architecture, can be inserted after a source or before a sink.


An interceptor's functionality can be summed up by this method:

public Event intercept(Event event);

It is passed a Flume event and it returns a Flume event. It may do nothing, in which case the same unaltered event is returned. Often, it alters the event in some useful way. If null is returned, the event is dropped.
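
The three outcomes just described (pass through, alter, or drop) can be sketched in a few lines. This is a minimal, self-contained illustration rather than Flume's actual API: SimpleEvent is a hypothetical stand-in for Flume's Event (headers plus a byte-array body), and TaggingInterceptor and its "length" header are invented for this example. A real interceptor would implement Flume's Interceptor interface, which has additional lifecycle methods not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Flume event: string headers plus a byte[] body.
class SimpleEvent {
    private final Map<String, String> headers = new HashMap<>();
    private final byte[] body;

    SimpleEvent(String body) {
        this.body = body.getBytes(StandardCharsets.UTF_8);
    }

    Map<String, String> getHeaders() { return headers; }

    byte[] getBody() { return body; }
}

// Hypothetical interceptor: drops empty events and tags all others.
class TaggingInterceptor {
    SimpleEvent intercept(SimpleEvent event) {
        if (event.getBody().length == 0) {
            // Returning null tells the caller to drop the event.
            return null;
        }
        // Alter the event in flight by adding a header (the name is an assumption).
        event.getHeaders().put("length", Integer.toString(event.getBody().length));
        // Return the same event instance, now carrying the extra header.
        return event;
    }
}
```

Calling intercept on an event with the body "hello" returns that same event with a length header of "5", while an event with an empty body yields null and is dropped.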

To add interceptors to a source, simply add the interceptors property to the named source. ...
