Skip to Content
Data Lake for Enterprises
book

Data Lake for Enterprises

by Vivek Mishra, Tomcy John, Pankaj Misra
May 2017
Beginner to intermediate
596 pages
15h 2m
English
Packt Publishing
Content preview from Data Lake for Enterprises

Flume Channel

A channel is a mechanism used by the Flume agent to transfer data from source to sink. The events are persisted in the channel and until it is delivered/taken away by a sink, they reside in the channel. This persistence in channel allows sink to retry for each event in case there is a failure while persisting data to the real store (HDFS).

Channels can be broadly categorized into two:

  1. In-memory: The events are available until the channel component is alive:
    • Queue: In-memory queues in the channel. This has the lowest latency time for processing because the events are persisted in memory.
  2. Durable: Even after the component is dead, the event persisted is available, and when the component becomes online, these events will be ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

The Enterprise Big Data Lake

The Enterprise Big Data Lake

Alex Gorelik
Operationalizing the Data Lake

Operationalizing the Data Lake

Holden Ackerman, Jon King
Data Lakes

Data Lakes

Anne Laurent, Dominique Laurent, Cédrine Madera

Publisher Resources

ISBN: 9781787281349Supplemental Content