Apache Flume is an open source system that was developed primarily to address the following use case:
How to efficiently and reliably collect large amounts of log-related data from different systems, normalize it, and store it in a reliable store.
At first glance, the use case seems simple enough to question the need for developing an entire system around it. But when you are developing a distributed, reliable, and fault-tolerant system that spans multiple machines running in different regions, the simple use case of aggregating logs from different machines and different application instances suddenly seems humongous.
You must keep a lot of things in mind. For example:
- All the systems that deploy your distributed application should have ...