Datadog Cloud Monitoring Quick Start Guide

Book description

A comprehensive guide to rolling out Datadog to monitor infrastructure and applications running in both cloud and data center environments

Key Features

  • Learn Datadog to proactively monitor your infrastructure and cloud services
  • Use Datadog as a platform for aggregating monitoring efforts in your organization
  • Leverage Datadog's alerting service to implement on-call and site reliability engineering (SRE) processes

Book Description

Datadog is a cloud monitoring and operational analytics tool that enables the monitoring of servers, virtual machines, containers, databases, third-party tools, and application services. IT and DevOps teams can use Datadog to monitor infrastructure and cloud services, and this book will show you how.

The book starts by describing basic monitoring concepts and the types of monitoring that are rolled out in a large-scale IT production engineering environment. Moving on, the book covers how standard monitoring features are implemented on the Datadog platform and how they can be rolled out in a real-world production environment. As you advance, you'll discover how Datadog integrates with popular software components that are used to build cloud platforms. The book also provides details on how to use monitoring standards such as Java Management Extensions (JMX) and StatsD to extend the Datadog platform. Finally, you'll get to grips with monitoring fundamentals, learn how proactive monitoring can be rolled out using Datadog, and find out how to extend and customize the Datadog platform.

By the end of this Datadog book, you will have gained the skills needed to monitor your cloud infrastructure and the software applications running on it using Datadog.

What you will learn

  • Understand monitoring fundamentals, including metrics, monitors, alerts, and thresholds
  • Implement core monitoring requirements using Datadog features
  • Explore Datadog's integration with cloud platforms and tools
  • Extend Datadog using custom scripting and standards such as JMX and StatsD
  • Discover how proactive monitoring can be rolled out using various Datadog features
  • Understand how Datadog can be used to monitor microservices in both Docker and Kubernetes environments
  • Get to grips with advanced Datadog features such as APM and Security Monitoring

Who this book is for

This book is for DevOps engineers, site reliability engineers (SREs), IT production engineers, software developers and architects, cloud engineers, system administrators, and anyone looking to monitor and visualize their infrastructure and applications with Datadog. Basic working knowledge of cloud and infrastructure is useful. Working experience with a Linux distribution and some scripting knowledge are required to take full advantage of the material provided in the book.

Table of contents

  1. Datadog Cloud Monitoring Quick Start Guide
  2. Contributors
  3. About the author
  4. About the reviewer
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: Getting Started with Datadog
  7. Chapter 1: Introduction to Monitoring
    1. Technical requirements
    2. Why monitoring?
    3. Proactive monitoring
      1. Implementing a comprehensive monitoring solution
      2. Setting up alerts to warn of impending issues
      3. Having a feedback loop
    4. Monitoring use cases
      1. All in a data center
      2. Application in a data center with cloud monitoring
      3. All in the cloud
    5. Monitoring terminology and processes
      1. Host
      2. Agent
      3. Metrics
      4. Up/down status
      5. Check
      6. Threshold
      7. Monitor
      8. Alert
      9. Alert recipient
      10. Severity level
      11. Notification
      12. Downtime
      13. Event
      14. Incident
      15. On call
      16. Runbook
    6. Types of monitoring
      1. Infrastructure monitoring
      2. Platform monitoring
      3. Application monitoring
      4. Business monitoring
      5. Last-mile monitoring
      6. Log aggregation
      7. Meta-monitoring
      8. Noncore monitoring
    7. Overview of monitoring tools
      1. On-premises tools
      2. SaaS solutions
      3. Cloud-native tools
    8. Summary
  8. Chapter 2: Deploying the Datadog Agent
    1. Technical requirements
    2. Installing the Datadog Agent
      1. Runtime configurations
      2. Steps for installing the agent
    3. Agent components
    4. Agent as a container
    5. Deploying the agent – use cases
      1. All on the hosts
      2. Agent on the host monitoring containers
      3. Agent running as a container
    6. Advanced agent configuration
    7. Best practices
    8. Summary
  9. Chapter 3: The Datadog Dashboard
    1. Technical requirements
    2. Infrastructure List
    3. Events
    4. Metrics Explorer
    5. Dashboards
    6. The main Integrations menu
      1. Integrations
      2. APIs
      3. Agent
      4. Embeds
    7. Monitors
      1. Creating a new metric monitor
    8. Advanced features
    9. Summary
  10. Chapter 4: Account Management
    1. Technical requirements
    2. Managing users
    3. Granting custom access using roles
    4. Setting up organizations
    5. Implementing Single Sign-On
    6. Managing API and application keys
    7. Tracking usage
    8. Best practices
    9. Summary
  11. Chapter 5: Metrics, Events, and Tags
    1. Technical requirements
    2. Understanding metrics in Datadog
      1. Metric data
      2. Flush time interval
      3. Metric type
      4. Metric unit
      5. Query
    3. Tagging Datadog resources
      1. Defining tags
      2. Tagging methods
      3. Customizing host tag
      4. Tagging integration metrics
      5. Tags from microservices
      6. Filtering using tags
    4. Defining custom metrics
    5. Monitoring event streams
    6. Searching events
    7. Notifications for events
    8. Generating events
    9. Best practices
    10. Summary
  12. Chapter 6: Monitoring Infrastructure
    1. Technical requirements
    2. Inventorying the hosts
      1. CPU usage
      2. Load averages
      3. Available swap
      4. Disk latency
      5. Memory breakdown
      6. Disk usage
      7. Network traffic
    3. Listing containers
    4. Viewing system processes
    5. Monitoring serverless computing resources
    6. Best practices
    7. Summary
  13. Chapter 7: Monitors and Alerts
    1. Technical requirements
    2. Setting up monitors
    3. Managing monitors
    4. Distributing notifications
    5. Configuring downtime
    6. Best practices
    7. Summary
  14. Section 2: Extending Datadog
  15. Chapter 8: Integrating with Platform Components
    1. Technical requirements
    2. Configuring an integration
    3. Tagging an integration
    4. Reviewing supported integrations
    5. Implementing custom checks
    6. Best practices
    7. Summary
  16. Chapter 9: Using the Datadog REST API
    1. Technical requirements
    2. Scripting Datadog
      1. curl
      2. Python
    3. Reviewing Datadog APIs
      1. Public cloud integration
      2. Dashboards
      3. Downtime
      4. Events
      5. Hosts
      6. Metrics
      7. Monitors
      8. Host tags
    4. Programming with Datadog APIs
      1. The problem
      2. Posting metric data and an event
      3. Creating a monitor
      4. Querying the events stream
    5. Best practices
    6. Summary
  17. Chapter 10: Working with Monitoring Standards
    1. Technical requirements
    2. Monitoring networks using SNMP
    3. Consuming application metrics using JMX
      1. Cassandra as a Java application
      2. Using Cassandra integration
      3. Accessing the Cassandra JMX interface
    4. Working with the DogStatsD interface
      1. Publishing metrics
      2. Posting events
    5. Best practices
    6. Summary
  18. Chapter 11: Integrating with Datadog
    1. Technical requirements
    2. Using client libraries
      1. REST API-based client libraries
      2. DogStatsD client libraries
    3. Evaluating community projects
      1. dog-watcher by Brightcove
      2. kennel
      3. Managing monitors using Terraform
      4. Ansible modules and integration
    4. Developing integrations
      1. Prerequisites
      2. Setting up the tooling
      3. Creating an integration folder
      4. Running tests
      5. Building a configuration file
      6. Building a package
      7. Deploying an integration
    5. Best practices
    6. Summary
  19. Section 3: Advanced Monitoring
  20. Chapter 12: Monitoring Containers
    1. Technical requirements
    2. Collecting Docker logs
    3. Monitoring Kubernetes
      1. Installing the Datadog Agent
    4. Using Live Containers
    5. Viewing logs using Live Tail
    6. Searching container data
    7. Best practices
    8. Summary
  21. Chapter 13: Managing Logs Using Datadog
    1. Technical requirements
    2. Collecting logs
      1. Collecting logs from public cloud services
      2. Shipping logs from containers
      3. Shipping logs from hosts
      4. Filtering logs
      5. Scrubbing sensitive data from logs
    3. Processing logs
    4. Archiving logs
    5. Searching logs
    6. Best practices
    7. Summary
  22. Chapter 14: Miscellaneous Monitoring Topics
    1. Technical requirements
    2. Application Performance Monitoring (APM)
      1. Sending traces to Datadog
      2. Profiling an application
      3. Service Map
    3. Implementing observability
    4. Synthetic monitoring
    5. Security monitoring
      1. Sourcing the logs
      2. Defining security rules
      3. Monitoring security signals
    6. Best practices
    7. Summary
    8. Why subscribe?
  23. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think

Product information

  • Title: Datadog Cloud Monitoring Quick Start Guide
  • Author(s): Thomas Kurian Theakanath
  • Release date: June 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781800568730