Understanding the CDH components
As mentioned earlier, several top-level Apache open source projects are part of CDH. Let's discuss these components in detail.
CDH comes with Apache Hadoop, the system introduced earlier for high-volume storage and computing. The subcomponents that are part of Hadoop are HDFS, Fuse-DFS, MapReduce, and MapReduce 2 (YARN). Fuse-DFS is a module that mounts HDFS in user space through FUSE (Filesystem in Userspace). Once mounted, HDFS is accessible like any other traditional filesystem.
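Because a Fuse-DFS mount exposes HDFS through ordinary filesystem calls, a program can read it without any Hadoop client library. The following is a minimal sketch under that assumption: it presumes HDFS has already been mounted at a hypothetical mount point such as /mnt/hdfs, and the helper names list_files and read_head are illustrative rather than part of CDH. Only the Python standard library is used.

```python
import os
from pathlib import Path


def list_files(mount_root):
    """Return the sorted relative paths of all regular files under mount_root."""
    root = Path(mount_root)
    return sorted(
        str(Path(dirpath, name).relative_to(root))
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    )


def read_head(mount_root, relative_path, max_bytes=1024):
    """Read up to max_bytes from the start of a file under the mount."""
    with open(Path(mount_root) / relative_path, "rb") as handle:
        return handle.read(max_bytes)


if __name__ == "__main__":
    # Hypothetical mount point (an assumption); use wherever HDFS is mounted.
    mount = "/mnt/hdfs"
    if os.path.isdir(mount):
        for path in list_files(mount):
            print(path)
```

Nothing in the sketch is specific to HDFS, which is the point: the same functions work against any local directory, so they can be tried out before a mount exists.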
Apache Flume NG
Apache Flume NG Version 1.x is a distributed framework that handles the collection and aggregation of large amounts of log data. This project was primarily built to handle streaming data. Flume ...