Apache Flume is an efficient, reliable distributed service for collecting, aggregating, and transferring large quantities of data, built on streaming data flows. A unit of data flow in Flume is called an event. The main components of the Flume architecture are the Flume source, Flume channel, and Flume sink, all of which are hosted by a Flume agent. A Flume source consumes events from an external source, such as a log file or a web server, and stores the events it receives in a passive data store called a Flume channel. Examples of Flume channel types are ...
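The relationship between source, channel, and sink can be sketched as a toy model in Python. This is not Flume's API: Flume itself is written in Java and wired together through agent configuration. The names `Event`, `Channel`, `Source`, `Sink`, and `run_agent` are hypothetical, and an in-memory queue merely stands in for a channel. The sketch assumes only what the overview states, plus the fact that a sink takes events out of the channel and delivers them onward.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for a Flume event: a byte payload plus headers."""
    body: bytes
    headers: dict = field(default_factory=dict)


class Channel:
    """Passive store: it only holds events until a sink takes them."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def take(self):
        # Return the oldest event, or None when the channel is empty.
        return self._queue.popleft() if self._queue else None


class Source:
    """Consumes data from an external producer and stores it in the channel."""

    def __init__(self, channel):
        self.channel = channel

    def receive(self, line):
        self.channel.put(Event(body=line.encode("utf-8")))


class Sink:
    """Takes events out of the channel and delivers them to a destination."""

    def __init__(self, channel):
        self.channel = channel
        self.delivered = []  # stands in for an external destination

    def drain(self):
        while (event := self.channel.take()) is not None:
            self.delivered.append(event.body.decode("utf-8"))


def run_agent(lines):
    """Host one source, one channel, and one sink, as an agent would."""
    channel = Channel()
    source = Source(channel)
    sink = Sink(channel)
    for line in lines:
        source.receive(line)
    sink.drain()
    return sink.delivered
```

Calling `run_agent(["GET /index.html", "GET /about.html"])` passes both log lines through the channel and returns them in arrival order. The channel never acts on its own here, which is the sense in which it is passive: the source pushes into it and the sink pulls from it.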
© Deepak Vohra 2016
Deepak Vohra, Practical Hadoop Ecosystem, 10.1007/978-1-4842-2199-0_6
6. Apache Flume
Deepak Vohra (1)
(1) Apt 105, White Rock, British Columbia, Canada