Monitoring with Graphite

by Jason Dixon

March 2017

Intermediate to advanced

290 pages

7h 30m

English

O'Reilly Media, Inc.

Who Should Read This Book; Why I Wrote This Book; A Word on Monitoring Today; Navigating This Book; Conventions Used in This Book; O’Reilly Safari; How to Contact Us; Acknowledgments
What Is Time-Series Data?; Time-Series Databases; Storage Considerations; Prioritizing Workloads; What Is the History of Graphite?; What Makes Graphite Unique?; Simple Metrics Format; Graphing API; Rapid Prototyping; Rich Statistical Library; Chained Functions; Case Studies: Who Uses Graphite in Production?; Booking.com; GitHub; Etsy; Electronic Arts; Why Should I Use Graphite?
Three Tenets of Monitoring; Fault Detection; Alerting; Capacity Planning; Rethinking the Poll/Pull Model; Pull Model; Push Model; Where Does Graphite Fit into the Picture?; Composable Monitoring Systems; Telemetry; Metrics Router; Aggregation; State Engine; Notification Routers; Storage Engine; Visualization; Conclusion
Carbon; carbon-cache; carbon-relay; carbon-aggregator; Filtering Metrics; Internal Statistics; Network Security Considerations; Whisper; How Do Whisper Files Get Created?; Retention Policies and Archives; The Laws of Whisper Archives; Calculating Whisper File Sizes; Deconstructing a Whisper File; Which Archive Handles My Query?; Aggregation Methods; xFilesFactor; Planning Your Namespaces; Performance Considerations; Graphite-Web; Django Framework; Web Server; Database; Memcached; Events; Storage Backends; Putting It All Together; Basic Setup; Vertical Scaling; Horizontal Scaling; Multisite Replication; A Final Thought
Quick Start with Synthesize; Where Does Graphite Store All My Files?; Are Packages Available for My Distro?; What Installation Methods Are Available?; Should I Use Virtualenv?; Using sudo Effectively; Dependencies; Installing from Source; Preparing Your Web Database; Configuring Carbon; carbon.conf; storage-schemas.conf; storage-aggregation.conf; Some Final Preparations; Starting Your Carbon Daemons; Configuring Graphite-Web; local_settings.py; Setting Up Apache; Verifying Your Graphite Installation; Carbon Statistics; Feeding New Data to Carbon; Building Your First Graph
Finding Metrics; Navigating the Tree; Using the Search Feature; Working Smarter with the Auto-Completer; Wildcards; The Graphite Composer Window; The Embedded Chart; The Toolbar; Selecting Recent Data; Refreshing the Graph; Selecting a Date Range; Exporting a Short URL; Loading a Graph from URL; Saving to My Graphs; Deleting from My Graphs; The Graph Options Menu; Adding a Graph Title; Overriding the Graph Legend; Toggling Axes and the Grid; Applying a Graph Template; Line Chart Modes; Area and Stacked Graphs; Tweaking the Y-Axis; The Graph Data Dialog; What Are Targets Anyway?; Building a Carbon Performance Graph; Sharing Your Work
Working with Functions; Starting with the Basics; Math and Statistical Transforms; Filtering and Sorting; Grouping on Wildcards; Data Smoothing and Forecasting; Adjusting Metric Labels; Alternate Output Formats
Why Do I Need a Dashboard?; Graphite Dashboard; Third-Party Dashboards; Grafana; Tasseo; Dusk; Do-It-Yourself; Dashing; Rickshaw and D3.js; Conclusion
First, the Basics; The Troubleshooting Toolbelt; Generating Metrics and Benchmarking; CPU Utilization; Disk Performance (I/O); Networking; Inspecting Metrics; Configuration Settings; Carbon; Graphite-Web; Logging; Carbon; Graphite-Web; Kernel messages; Failure Scenarios; The Full Disk; CPU Saturation; Rendering Problems; Taking It to the Next Level
What Makes It “Hard” to Scale Graphite?; Peter’s Graphite Story; The Beginning; The Pains of Popularity; Clearing the Next Hurdle; Try, Try Again; Maximizing Resources; Avoiding Outages; Shared Web Database; Scaling in Both Directions; Some Final Thoughts; Summary

Carbon; Common to All Carbon types; carbon-cache; carbon-relay; carbon-aggregator; Graphite-Web

Content preview from Monitoring with Graphite

Chapter 8. Troubleshooting Graphite Performance

Success depends upon previous preparation, and without such preparation there is sure to be failure.

Confucius

People ask me about troubleshooting Graphite performance as if it’s some dark art practiced by only the most arcane operators in the bleakest corner of the largest data centers. Fortunately, nothing could be further from the truth.

Many of Graphite’s newer competitors attempt to hide their operational costs with “cluster in a box” designs, leaning heavily on NoSQL-type backends. Sadly, many of these projects are immature and have left a trail of data corruption and data loss in their wake. In many cases, the user is unprepared for the consequences, with little choice but to “blow it all away” and start over again.

By contrast, Graphite embraces traditional UNIX systems to store and retrieve time-series data. It may not be as sexy as the latest data store appearing on Hacker News, but the use cases, tooling, and failure scenarios are well documented and widely understood. The average system administrator understands files and filesystems, and how to repair them when something goes sideways. To paraphrase Dan McKinley, boring technology just gets the job done.

In this chapter, we’re going to build a foundation containing the skills and tools you’ll need to respond to just about any Graphite troubleshooting or scaling challenge. We’ll investigate what happens when you tweak a variety of performance-impacting configuration ...