Mastering Distributed Tracing

Name: Mastering Distributed Tracing
Author: Yuri Shkuro
ISBN: 9781788628464

February 2019

Intermediate to advanced

444 pages

11h 36m

English

Packt Publishing

Mastering Distributed Tracing
Table of Contents
Mastering Distributed Tracing
Why subscribe?
Packt.com
Contributors
About the author
About the reviewer
About the illustrator
Packt is searching for authors like you
Preface
Who this book is for
What this book covers

To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
I. Introduction
1. Why Distributed Tracing?
Microservices and cloud-native applications
What is observability?
The observability challenge of microservices
Traditional monitoring tools
Metrics
Logs
Distributed tracing
My experience with tracing
Why this book?
Summary
References
2. Take Tracing for a HotROD Ride
Prerequisites
Running from prepackaged binaries
Running from Docker images
Running from the source code
Go language development environment
Jaeger source code
Start Jaeger
Meet the HotROD
The architecture
The data flow
Contextualized logs
Span tags versus logs
Identifying sources of latency
Resource usage attribution
Summary
References
3. Distributed Tracing Fundamentals
The idea
Request correlation
Black-box inference
Schema-based
Metadata propagation
Anatomy of distributed tracing
Sampling
Preserving causality
Inter-request causality
Trace models
Event model
Span model
Clock skew adjustment
Trace analysis
Summary
References
II. Data Gathering Problem
4. Instrumentation Basics with OpenTracing
Prerequisites
Project source code
Go development environment
Java development environment
Python development environment
MySQL database
Query tools (curl or wget)
Tracing backend (Jaeger)
OpenTracing
Exercise 1 – the Hello application
Hello application in Go
Hello application in Java
Hello application in Python
Exercise summary
Exercise 2 – the first trace
Step 1 – create a tracer instance
Create a tracer in Go
Create a tracer in Java
Create a tracer in Python
Step 2 – start a span
Start a span in Go
Start a span in Java
Start a span in Python
Step 3 – annotate the span
Annotate the span in Go
Annotate the span in Java
Annotate the span in Python
Exercise summary
Exercise 3 – tracing functions and passing context
Step 1 – trace individual functions
Trace individual functions in Go
Trace individual functions in Java
Trace individual functions in Python
Step 2 – combine multiple spans into a single trace
Combine multiple spans into a single trace in Go
Combine multiple spans into a single trace in Java
Combine multiple spans into a single trace in Python
Step 3 – propagate the in-process context
In-process context propagation in Python
In-process context propagation in Java
In-process context propagation in Go
Exercise summary
Exercise 4 – tracing RPC requests
Step 1 – break up the monolith
Microservices in Go
Microservices in Java
Microservices in Python
Step 2 – pass the context between processes
Passing context between processes in Go
Passing context between processes in Java
Passing context between processes in Python
Step 3 – apply OpenTracing-recommended tags
Standard tags in Go
Standard tags in Java
Standard tags in Python
Exercise summary
Exercise 5 – using baggage
Using baggage in Go
Using baggage in Java
Using baggage in Python
Exercise summary
Exercise 6 – auto-instrumentation
Open source instrumentation in Go
Auto-instrumentation in Java
Auto-instrumentation in Python
Exercise 7 – extra credit
Summary
References
5. Instrumentation of Asynchronous Applications
Prerequisites
Project source code
Java development environment
Kafka, Zookeeper, Redis, and Jaeger
The Tracing Talk chat application
Implementation
The lib module
AppId
Message
KafkaConfig and KafkaService
RedisConfig and RedisService
GiphyService
The chat-api service
The storage-service microservice
The giphy-service microservice
Running the application
Observing traces
Instrumenting with OpenTracing
Spring instrumentation
Tracer resolver
Redis instrumentation
Kafka instrumentation
Producing messages
Consuming messages
Instrumenting asynchronous code
Summary
References
6. Tracing Standards and Ecosystem
Styles of instrumentation
Anatomy of tracing deployment and interoperability
Five shades of tracing
Know your audience
The ecosystem
Tracing systems
Zipkin and OpenZipkin
Jaeger
SkyWalking
X-Ray, Stackdriver, and more
Standards projects
W3C Trace Context
W3C "Data Interchange Format"
OpenCensus
OpenTracing
Summary
References
7. Tracing with Service Meshes
Service meshes
Observability via a service mesh
Prerequisites
Project source code
Java development environment
Kubernetes
Istio
The Hello application
Distributed tracing with Istio
Using Istio to generate a service graph
Distributed context and routing
Summary
References
8. All About Sampling
Head-based consistent sampling
Probabilistic sampling
Rate limiting sampling
Guaranteed-throughput probabilistic sampling
Adaptive sampling
Local adaptive sampling
Global adaptive sampling
Goals
Theory
Architecture
Calculating sampling probability
Implications of adaptive sampling
Extensions
Context-sensitive sampling
Ad-hoc or debug sampling
How to deal with oversampling
Post-collection down-sampling
Throttling
Tail-based consistent sampling
Partial sampling
Summary
References
III. Getting Value from Tracing
9. Turning the Lights On
Tracing as a knowledge base
Service graphs
Deep, path-aware service graphs
Detecting architectural problems
Performance analysis
Critical path analysis
Recognizing trace patterns
Look for error markers
Look for the longest span on the critical path
Look out for missing details
Avoid sequential execution or "staircase"
Be wary when things finish at exactly the same time
Exemplars
Latency histograms
Long-term profiling
Summary
References
10. Distributed Context Propagation
Brown Tracing Plane
Pivot tracing
Chaos engineering
Traffic labeling
Testing in production
Debugging in production
Developing in production
Summary
References
11. Integration with Metrics and Logs
Three pillars of observability
Prerequisites
Project source code
Java development environment
Running the servers in Docker
Declaring index pattern in Kibana
Running the clients
The Hello application
Integration with metrics
Standard metrics via tracing instrumentation
Adding context to metrics
Context-aware metrics APIs
Integration with logs
Structured logging
Correlating logs with trace context
Context-aware logging APIs
Capturing logs in the tracing system
Do we need separate logging and tracing backends?
Summary
References
12. Gathering Insights with Data Mining
Feature extraction
Components of a data mining pipeline
Tracing backend
Trace completion trigger
Feature extractor
Aggregator
Feature extraction exercise
Prerequisites
Project source code
Running the servers in Docker
Defining index mapping in Elasticsearch
Java development environment
Microservices simulator
Running as a Docker image
Running from source
Verify
Define an index pattern in Kibana
The Span Count job
Trace completion trigger
Feature extractor
Observing trends
Beware of extrapolations
Historical analysis
Ad hoc analysis
Summary
References
IV. Deploying and Operating Tracing Infrastructure
13. Implementing Tracing in Large Organizations
Why is it hard to deploy tracing instrumentation?
Reduce the barrier to adoption
Standard frameworks
In-house adapter libraries
Tracing enabled by default
Monorepos
Integration with existing infrastructure
Where to start
Building the culture
Explaining the value
Integrating with developer workflows
Tracing Quality Metrics
Troubleshooting guide
Don't be on the critical path
Summary
References
14. Under the Hood of a Distributed Tracing System
Why host your own?
Customizations and integrations
Bandwidth cost
Own the data
Bet on emerging standards
Architecture and deployment modes
Basic architecture: agent + collector + query service
Client
Agent
Collector
Query service and UI
Data mining jobs
Streaming architecture
Multi-tenancy
Cost accounting
Complete isolation
Granular access controls
Security
Running in multiple DCs
Capturing origin zone
Cross-zone federation
Monitoring and troubleshooting
Resiliency
Over-sampling
Debug traces
Traffic spikes due to DC failover
Perpetual traces
Very long traces
Summary
References
15. Afterword
References
Other Books You May Enjoy
Leave a review - let other readers know what you think
Index

Content preview from Mastering Distributed Tracing

Components of a data mining pipeline

There are likely many ways to build near-real-time data mining for traces. In Canopy, feature extraction is built directly into the tracing backend, whereas in Jaeger it can be done via post-processing add-ons, as we will do in this chapter's code exercise. The major required components are shown in Figure 12.1:

Tracing backend, or tracing infrastructure in general, collects tracing data from the microservices of the distributed application
Trace completion trigger makes a judgement call that all spans of the trace have been received and it is ready for processing
Feature extractor performs the actual calculations on each trace
An optional Aggregator combines features from individual ...
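
As a rough illustration of how these components might fit together, here is a minimal in-memory sketch in Python. It is not the book's implementation, and nothing in it is Jaeger or Canopy API: the `Span` record, the `CompletionTrigger` class, the quiet-period heuristic for deciding that a trace is complete, the span-count feature, and the sample service names are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record as received by a tracing backend."""
    trace_id: str
    service: str
    operation: str
    arrival_time: float  # seconds; when the backend received the span


class CompletionTrigger:
    """Buffers spans per trace and treats a trace as complete once no new
    span has arrived for `quiet_period` seconds (an assumed heuristic)."""

    def __init__(self, quiet_period: float) -> None:
        self.quiet_period = quiet_period
        self._spans: Dict[str, List[Span]] = defaultdict(list)
        self._last_seen: Dict[str, float] = {}

    def add(self, span: Span) -> None:
        self._spans[span.trace_id].append(span)
        previous = self._last_seen.get(span.trace_id, span.arrival_time)
        self._last_seen[span.trace_id] = max(previous, span.arrival_time)

    def pop_completed(self, now: float) -> List[List[Span]]:
        """Remove and return every buffered trace that has gone quiet."""
        done = [tid for tid, seen in self._last_seen.items()
                if now - seen >= self.quiet_period]
        completed = []
        for tid in done:
            completed.append(self._spans.pop(tid))
            del self._last_seen[tid]
        return completed


def extract_span_counts(trace: List[Span]) -> Counter:
    """Feature extractor: spans per (service, operation) within one trace."""
    return Counter((s.service, s.operation) for s in trace)


def aggregate(features: List[Counter]) -> Counter:
    """Optional aggregator: sums per-trace features across traces."""
    total: Counter = Counter()
    for feature in features:
        total.update(feature)
    return total


# Usage: three spans from two traces arrive; only trace t1 has gone quiet.
trigger = CompletionTrigger(quiet_period=30.0)
for span in [
    Span("t1", "frontend", "GET /dispatch", 0.0),
    Span("t1", "driver", "FindNearest", 1.0),
    Span("t2", "frontend", "GET /dispatch", 20.0),
]:
    trigger.add(span)

completed = trigger.pop_completed(now=35.0)
totals = aggregate([extract_span_counts(t) for t in completed])
```

A quiet-period trigger is only one possible judgement call. Spans that arrive after a trace has been handed off would start a new buffer under the same trace ID, so a real pipeline would need a policy for such late arrivals.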
