Chapter 4. Sinks and Sink Processors

By now you should have a pretty good idea of where the sink fits into the Flume architecture. In this chapter we will learn about the sink most commonly used with Hadoop, the HDFS sink. The general architecture of Flume supports many other sinks, and we won't have space to cover all of them in this book. Some come bundled with Flume and can write to HBase, IRC, and ElasticSearch, as well as the log4j sink and the file sink we saw in Chapter 2, Flume Quick Start. Other sinks are available on the Internet and can be used to write data to MongoDB, Cassandra, RabbitMQ, Redis, and just about any other data store you can think of. If you can't find a sink that suits your needs, you could write one easily by extending the org.apache.flume.sink.AbstractSink ...
