Overview
When data-driven applications fail, identifying the cause is both challenging and time-consuming—especially as data pipelines grow increasingly complex. Hunting for the root cause of an application failure in messy, raw, and distributed logs is difficult for performance experts and a nightmare for data operations teams. This report examines DataOps processes and tools that enable you to manage modern data pipelines efficiently.
Author Ted Malaska describes a data operations framework and demonstrates how testing and monitoring help you plan, rebuild, automate, and then manage robust data pipelines—whether they run in the cloud, on premises, or in a hybrid configuration. You’ll also learn ways to apply performance monitoring software and AI to your data pipelines in order to keep your applications running reliably.
You’ll learn:
- How performance management software can reduce the risk of running modern data applications
- Methods for applying AI to deliver insights, recommendations, and automation for operationalizing big data systems and data applications
- How to plan, migrate, and operate big data workloads and data pipelines in the cloud and in hybrid deployment models