© Deepak Vohra 2016

Deepak Vohra, Practical Hadoop Ecosystem, 10.1007/978-1-4842-2199-0_6

6. Apache Flume

Deepak Vohra

(1)Apt 105, White Rock, British Columbia, Canada

Apache Flume is a framework based on streaming data flows for collecting, aggregating, and transferring large quantities of data. Flume is an efficient and reliable distributed service. A unit of data flow in Flume is called an event. The main components in Flume architecture are Flume source, Flume channel, and Flume sink, all of which are hosted by a Flume agent. A Flume source consumes events from an external source such as a log file or a web server. A Flume source stores the events it receives in a passive data store called a Flume channel. Examples of Flume channel types are ...

Get Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.