SRE with Java Microservices

Name: SRE with Java Microservices
Author: Jonathan Schneider
ISBN: 9781492073925

by Jonathan Schneider

September 2020

Intermediate to advanced

314 pages

8h 22m

English

O'Reilly Media, Inc.

Foreword
Preface
My Journey; Conventions Used in This Book; O’Reilly Online Learning; How to Contact Us; Acknowledgments
1. The Application Platform
Platform Engineering Culture; Monitoring; Monitoring for Availability; Monitoring as a Debugging Tool; Learning to Expect Failure; Effective Monitoring Builds Trust; Delivery; Traffic Management; Capabilities Not Covered; Testing Automation; Chaos Engineering and Continuous Verification; Configuration as Code; Encapsulating Capabilities; Service Mesh; Summary
2. Application Metrics
Black Box Versus White Box Monitoring; Dimensional Metrics; Hierarchical Metrics; Micrometer Meter Registries; Creating Meters; Naming Metrics; Common Tags; Classes of Meters; Gauges; Counters; Timers; “Count” Means “Throughput”; “Count” and “Sum” Together Mean “Aggregable Average”; Maximum Is a Decaying Signal That Isn’t Aligned to the Push Interval; The Sum of Sum Over an Interval; The Base Unit of Time; Using Timers; Common Features of Latency Distributions; Percentiles/Quantiles; Histograms; Service Level Objective Boundaries; Distribution Summaries; Long Task Timers; Choosing the Right Meter Type; Controlling Cost; Coordinated Omission; Load Testing; Meter Filters; Deny/Accept Meters; Transforming Metrics; Configuring Distribution Statistics; Separating Platform and Application Metrics; Partitioning Metrics by Monitoring System; Meter Binders; Summary
3. Debugging with Observability
The Three Pillars of Observability…or Is It Two?; Logs; Distributed Tracing; Metrics; Which Telemetry Is Appropriate?; Components of a Distributed Trace; Types of Distributed Tracing Instrumentation; Manual Tracing; Agent Tracing; Framework Tracing; Service Mesh Tracing; Blended Tracing; Sampling; No Sampling; Rate-Limiting Samplers; Probabilistic Samplers; Boundary Sampling; Impact of Sampling on Anomaly Detection; Distributed Tracing and Monoliths; Correlation of Telemetry; Metric to Trace Correlation; Using Trace Context for Failure Injection and Experimentation; Summary
4. Charting and Alerting
Differences in Monitoring Systems; Effective Visualizations of Service Level Indicators; Styles for Line Width and Shading; Errors Versus Successes; “Top k” Visualizations; Prometheus Rate Interval Selection; Gauges; Counters; Timers; When to Stop Creating Dashboards; Service Level Indicators for Every Java Microservice; Errors; Latency; Garbage Collection Pause Times; Heap Utilization; CPU Utilization; File Descriptors; Suspicious Traffic; Batch Runs or Other Long-Running Tasks; Building Alerts Using Forecasting Methods; Naive Method; Single-Exponential Smoothing; Universal Scalability Law; Summary
5. Safe, Multicloud Continuous Delivery
Types of Platforms; Resource Types; Delivery Pipelines; Packaging for the Cloud; Packaging for IaaS Platforms; Packaging for Container Schedulers; The Delete + None Deployment; The Highlander; Blue/Green Deployment; Automated Canary Analysis; Spinnaker with Kayenta; General-Purpose Canary Metrics for Every Microservice; Summary
6. Source Code Observability
The Stateful Asset Inventory; Release Versioning; Maven Repositories; Build Tools for Release Versioning; Capturing Resolved Dependencies in Metadata; Capturing Method-Level Utilization of the Source Code; Structured Code Search with OpenRewrite; Dependency Management; Version Misalignments; Dynamic Version Constraints; Unused Dependencies; Undeclared Explicitly Used Dependencies; Summary
7. Traffic Management
Microservices Offer More Potential Failure Points; Concurrency of Systems; Platform Load Balancing; Gateway Load Balancing; Join the Shortest Queue; Instance-Reported Availability and Utilization; Health Checks; Choice of Two; Instance Probation; Knock-On Effects of Smarter Load Balancing; Client-Side Load Balancing; Hedge Requests; Call Resiliency Patterns; Retries; Rate Limiters; Bulkheads; Circuit Breakers; Adaptive Concurrency Limits; Choosing the Right Call Resiliency Pattern; Implementation in Service Mesh; Implementation in RSocket; Summary
Index

Content preview from SRE with Java Microservices

Chapter 2. Application Metrics

The complexity of distributed systems composed of many communicating microservices makes it especially important to be able to observe the state of the system. The rate of change is high: new code releases, independent scaling events under changing load, changes to infrastructure (cloud provider changes), and dynamic configuration changes propagating through the system. In this chapter, we will focus on how to measure and alert on the performance of the distributed system and on some industry best practices to adopt.

An organization must commit to at least one monitoring solution. There is a wide range of choices, including open source, commercial on-premises, and SaaS offerings, with a broad spectrum of capabilities. The market is mature enough that an organization of any size and complexity can find a solution that fits its requirements.

The choice of monitoring system matters if you want to preserve the fixed-cost characteristic of metrics data. The StatsD protocol, for example, requires the application to emit to a StatsD agent on a per-event basis. Even if this agent runs as a sidecar process on the same host, the application still pays the allocation cost of creating a payload for every event, so the protocol forfeits this particular advantage of metrics telemetry. This isn’t always (or even commonly) catastrophic, but be aware of the cost.
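To make the cost difference concrete, here is a minimal sketch using only the JDK. It is not the API of Micrometer or of any real StatsD client: the class names `PerEventEmitter` and `AggregatingCounter`, the metric name `http.requests`, and the in-memory lists standing in for network sends are all assumptions made for illustration. The per-event emitter builds a StatsD-style counter line (`name:1|c`) for every event, while the aggregating counter only bumps an in-process value and builds one payload per publish.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

public class EmissionCostSketch {

    /** Per-event emission: each increment allocates a new StatsD-style counter line. */
    static final class PerEventEmitter {
        // Stands in for datagrams sent to a StatsD agent; no network I/O happens here.
        final List<byte[]> sent = new ArrayList<>();

        void increment(String name) {
            sent.add((name + ":1|c").getBytes(StandardCharsets.UTF_8));
        }
    }

    /** In-process aggregation: increments touch only a counter; one payload per publish. */
    static final class AggregatingCounter {
        final String name;
        final LongAdder count = new LongAdder();
        // Stands in for payloads shipped to a monitoring system (hypothetical).
        final List<String> published = new ArrayList<>();

        AggregatingCounter(String name) {
            this.name = name;
        }

        void increment() {
            count.increment();
        }

        // Called once per publish interval, regardless of how many events occurred.
        void publish() {
            published.add(name + ":" + count.sumThenReset() + "|c");
        }
    }

    public static void main(String[] args) {
        int events = 10_000;
        PerEventEmitter perEvent = new PerEventEmitter();
        AggregatingCounter aggregated = new AggregatingCounter("http.requests");

        for (int i = 0; i < events; i++) {
            perEvent.increment("http.requests");
            aggregated.increment();
        }
        aggregated.publish();

        System.out.println("per-event payloads: " + perEvent.sent.size());
        System.out.println("aggregated payloads: " + aggregated.published.size());
        System.out.println("aggregated value: " + aggregated.published.get(0));
    }
}
```

In this sketch the number of per-event payloads grows with throughput, while the aggregated path produces one payload per publish no matter how many events occurred in the interval.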

Black Box Versus White Box Monitoring

Approaches to metrics collection ...


Publisher Resources

ISBN: 9781492073918
