Using Flume
by Hari Shreedharan
September 2014
Content level: Intermediate to advanced
238 pages
6h 17m
English
O'Reilly Media, Inc.

Chapter 7. Getting Data into Flume

So far, we’ve discussed the internals of Flume agents and how to configure the components that make up an agent. In this chapter, we will look at the ways a client application can send data to one or more Flume agents. Flume offers two programmatic routes: the Flume SDK and the Embedded Agent API. Flume also comes bundled with log4j appenders that applications can use to send data to Flume agents.

Building Flume Events

Before we discuss the API used to send data to Flume agents, let’s look at how Flume events are created. As we discussed in Chapter 2, the event is the basic unit of data in Flume. Each Flume event contains a map of headers and a body, which is the payload represented as a byte array. The Event interface is shown in Example 7-1.

Example 7-1. Event interface
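package org.apache.flume;

import java.util.Map;

public interface Event {
  // Headers: string key-value metadata carried alongside the payload.
  public Map<String, String> getHeaders();
  public void setHeaders(Map<String, String> headers);
  // Body: the payload, represented as a byte array.
  public byte[] getBody();
  public void setBody(byte[] body);
}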

As the interface suggests, implementations of Event are free to represent data internally however they like, as long as they expose the headers and body in the format the interface specifies. In general, most applications build events using Flume’s EventBuilder API, which provides a few static methods to build events. In all cases, ...
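
To make the point about differing internal representations concrete, here is a minimal sketch. It assumes a local copy of the interface from Example 7-1 so that it compiles without the Flume SDK on the classpath. StringBackedEvent and EventSketch are hypothetical names invented for this illustration and are not part of Flume. The class stores its payload as a String and converts to bytes only when asked, which would suit UTF-8 text payloads but not arbitrary binary data.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Local copy of the interface from Example 7-1, declared here only so the
// sketch is self-contained. Real code would import org.apache.flume.Event.
interface Event {
  Map<String, String> getHeaders();
  void setHeaders(Map<String, String> headers);
  byte[] getBody();
  void setBody(byte[] body);
}

// Hypothetical implementation (not part of Flume): keeps the payload as a
// String internally and exposes it as a byte array through the interface.
// Assumption: payloads are UTF-8 text; other bytes would not round-trip.
class StringBackedEvent implements Event {
  private Map<String, String> headers = new HashMap<>();
  private String text;

  StringBackedEvent(String text) {
    this.text = text;
  }

  @Override
  public Map<String, String> getHeaders() {
    return headers;
  }

  @Override
  public void setHeaders(Map<String, String> headers) {
    // Copy defensively so later changes by the caller do not leak in.
    this.headers = new HashMap<>(headers);
  }

  @Override
  public byte[] getBody() {
    return text.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public void setBody(byte[] body) {
    this.text = new String(body, StandardCharsets.UTF_8);
  }
}

class EventSketch {
  public static void main(String[] args) {
    Event event = new StringBackedEvent("hello flume");

    Map<String, String> headers = new HashMap<>();
    headers.put("host", "web01"); // illustrative header, not a Flume convention
    event.setHeaders(headers);

    // Callers see only headers and a byte[] body, whatever is stored inside.
    System.out.println(event.getHeaders().get("host"));
    System.out.println(event.getBody().length);
  }
}
```

Code that consumes the event depends only on the four interface methods, so an implementation like this one could be swapped for another without changing its callers.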

ISBN: 9781491905326