Apache Flume

Apache Flume is an open-source system that was developed primarily to address the following use case:

How to efficiently and reliably collect large amounts of log-related data from different systems, normalize it, and store it in a reliable store.
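As a rough illustration of that use case, the collect, normalize, and store steps can be sketched in a few lines of single-process Python. This is not Flume's API; the function names (`collect`, `normalize`, `store`), the two input log formats, and the in-memory list standing in for the reliable store are all assumptions made for the example.

```python
import json
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumed plain-text format for illustration: "LEVEL message", e.g. "ERROR disk full".
PLAIN = re.compile(r"^(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def collect(sources: Dict[str, Iterable[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (source_name, raw_line) pairs from every source in turn."""
    for name, lines in sources.items():
        for line in lines:
            yield name, line.rstrip("\n")


def normalize(source: str, raw: str) -> Dict[str, str]:
    """Map a raw line (JSON or plain text) onto one common record shape."""
    raw = raw.strip()
    if raw.startswith("{"):
        # Assumed JSON format for illustration: {"level": ..., "msg": ...}
        data = json.loads(raw)
        level = str(data.get("level", "INFO")).upper()
        return {"source": source, "level": level, "message": str(data.get("msg", ""))}
    match = PLAIN.match(raw)
    if match:
        return {"source": source, "level": match.group("level"), "message": match.group("msg")}
    # Unrecognized lines are kept rather than dropped.
    return {"source": source, "level": "UNKNOWN", "message": raw}


def store(records: Iterable[Dict[str, str]], sink: List[str]) -> int:
    """Append each record to the sink as one JSON line; return the count written."""
    count = 0
    for record in records:
        sink.append(json.dumps(record, sort_keys=True))
        count += 1
    return count


# Hypothetical sources: a web server writing plain text and an API writing JSON.
sources = {
    "web": ["ERROR disk full\n"],
    "api": ['{"level": "warn", "msg": "slow response"}'],
}
sink: List[str] = []
written = store((normalize(name, line) for name, line in collect(sources)), sink)
```

On one machine with in-memory inputs this is trivial; the sketch deliberately leaves out everything that makes the problem hard once the sources live on many machines, such as delivery guarantees, buffering, and failure recovery.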

At first glance, the use case seems simple enough to question the need to develop an entire system around it. But when you are building a distributed, reliable, and fault-tolerant system that spans multiple machines running in different regions, the seemingly simple task of aggregating logs from different machines and application instances suddenly seems enormous.

You must keep a lot of things in mind. For example:

All the systems that deploy your distributed application should have ...
