7 Data Analytics
After reading this chapter, you should be able to
- Describe the data ingestion process
- Transfer large data sets
- Apply methods to make data meaningful
- Reason about detecting anomalies
- Explain how to visualize data
Big Data analytics explores vast amounts of data to uncover patterns, insights, and correlations. It gives organizations new opportunities and visibility into their own operations. It can drive new business ideas, cost reduction, customer engagement, and better decision‐making. Thus, Big Data analytics is an essential part of a modern Big Data platform. It involves many steps, such as data ingestion, data cleansing, data transfer, data transformation, data consolidation, data scheduling, dependency management, and anomaly detection. In this chapter, we visit each of these topics.
7.1 Log Collection
One of the fundamental sources for data analytics is log collection. Many systems, such as web servers, IoT devices, and applications, generate log files. Log files can take any form, so they need to be filtered, cleansed, and prepared for consumption. Before the era of microservices and containers, logging was rather straightforward. Web servers wrote log files to a directory and rotated them. An external component collected the logs from that directory using a variety of tools. Microservices and containers, however, have made things a bit more complicated. One of the key differences is the ...