Chapter 8. Monitoring Performance: Challenges and Solutions

Introduction

System design and tuning aren’t the only aspects of multi-tenant distributed systems that require different treatment than traditional single-node systems or data centers of machines working independently. Monitoring (detection and diagnosis of problems) is also fundamentally different for distributed systems, especially multi-tenant systems for which the nature of the workload can change dramatically over time.

Traditional system administration relies on a variety of tools for understanding performance and debugging problems on a single node, such as the following on Linux:

top: Displays a regularly updated page of current information about hardware use, both for the node as a whole and per-process, focusing on CPU and memory usage.
iotop: Similar to top but reports on disk I/O.
iostat: Generates a report on CPU statistics and input/output statistics for devices, partitions, and network file systems.
ss and ip: Report on network information for a node, such as sockets, connections, routing, and devices.
sar: Regularly collects and reports on a wide variety of system metrics for the node overall.
The /proc file system: A virtual file system that provides a convenient and structured way to access process data stored in the kernel’s internal data structures.
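
As a small illustration of the last item, the following Python sketch reads two commonly available /proc files, /proc/meminfo and /proc/loadavg, and turns them into structured values. It is a minimal example rather than a monitoring tool: the function names parse_meminfo and parse_loadavg are hypothetical, and the code assumes the usual Linux layout of those two files (a "Key: value kB" line per entry in meminfo, and three load averages at the start of loadavg).

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; a few (such as HugePages_Total)
    are plain counts. The unit suffix, if any, is dropped.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def parse_loadavg(text):
    """Return the 1-, 5-, and 15-minute load averages as floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


if __name__ == "__main__":
    proc = Path("/proc")
    meminfo_path = proc / "meminfo"
    loadavg_path = proc / "loadavg"
    # Only attempt to read the files on a system that exposes them.
    if meminfo_path.exists() and loadavg_path.exists():
        mem = parse_meminfo(meminfo_path.read_text())
        load = parse_loadavg(loadavg_path.read_text())
        print("MemTotal (kB):", mem.get("MemTotal"))
        print("MemAvailable (kB):", mem.get("MemAvailable"))
        print("Load averages (1, 5, 15 min):", load)
```

Keeping the parsing separate from the file reads means the same functions can be applied to text gathered from another machine, although this sketch itself only inspects the local node.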

These tools, along with log files from a machine, are generally used after an operator has identified a particular machine as having slow ...

Effective Multi-Tenant Distributed Systems by Chad Carson, Sean Suchter