Chapter 9. Scaling Graphite

A quick Google search for the phrase “how to scale Graphite” returns over 21 million results. I’d love to tell you that you’ll be welcomed with an endless stream of blog posts and technical articles describing the “three-step process” for limitless scalability of Graphite’s components. In reality, only a handful of these are comprehensive and direct enough to be worth your consideration (especially those from Jamie Alquiza).

What Makes It “Hard” to Scale Graphite?

Why do so many people find it challenging to grow their Graphite system from a modest trickle of metrics to a full-blown metrics storage cluster capable of receiving millions of datapoints per second? Circumstances differ for everyone, but I think it boils down to three major points.

First, scaling Graphite requires a working knowledge of systems performance, particularly I/O and CPU. It helps to have some experience managing Linux/UNIX systems: diagnosing disk and filesystem performance, inspecting userspace processes, and so on. None of the material we covered in the previous chapter will do you any good without at least a basic understanding of systems administration.

Second, Graphite evolved from a simple round-robin time-series database system (RRD-based) to a complex round-robin time-series database system (built-in caching, horizontally scalable, discrete components). It grew organically over time to meet the changing needs of first its corporate founder Orbitz ...
