Architecting for Scale

Name: Architecting for Scale
Author: Lee Atchison
ISBN: 9781491943397

July 2016

Intermediate to advanced

228 pages

5h 1m

English

O'Reilly Media, Inc.

Foreword
Preface
Who Should Read This Book · Why I Wrote This Book · A Word on Scale Today · Navigating This Book · Part I, “Availability” · Part II, “Risk Management” · Part III, “Services and Microservices” · Part IV, “Scaling Applications” · Part V, “Cloud Services” · Part VI, “Conclusion” · Online Resources · Conventions Used in This Book · Safari® Books Online · How to Contact Us · Acknowledgments
I. Availability
1. What Is Availability?
Availability Versus Reliability · What Causes Poor Availability?
2. Five Focuses to Improve Application Availability
Focus #1: Build with Failure in Mind · Focus #2: Always Think About Scaling · Focus #3: Mitigate Risk · Focus #4: Monitor Availability · Focus #5: Respond to Availability Issues in a Predictable and Defined Way · Being Prepared
3. Measuring Availability
The Nines · What’s Reasonable? · Don’t Be Fooled · Availability by the Numbers
4. Improving Your Availability When It Slips
Measure and Track Your Current Availability · Automate Your Manual Processes · Automated Deploys · Configuration Management · Change Experiments and High Frequency Changes · Automated Change Sanity Testing · Improve Your Systems · Your Changing and Growing Application · Keeping on Top of Availability
II. Risk Management
5. What Is Risk Management?
Managing Risk · Identify Risk · Remove Worst Offenders · Mitigate · Review Regularly · Managing Risk Summary
6. Likelihood Versus Severity
The Top 10 List: Low Likelihood, Low Severity Risk · The Order Database: Low Likelihood, High Severity Risk · Custom Fonts: High Likelihood, Low Severity Risk · T-Shirt Photos: High Likelihood, High Severity Risk
7. The Risk Matrix
Scope of the Risk Matrix · Creating the Risk Matrix · Brainstorming the List · Set the Likelihood and Severity Fields · Risk Item Details · Mitigation Plan · Triggered Plan · Using the Risk Matrix for Planning · Maintaining the Risk Matrix
8. Risk Mitigation
Recovery Plans · Disaster Recovery Plans · Improving Our Risk Situation
9. Game Days
Staging Versus Production Environments · Concerns with Running Game Days in Production · Game Day Testing
10. Building Systems with Reduced Risk
Redundancy · Examples of Idempotent Interfaces · Redundancy Improvements That Increase Complexity · Independence · Security · Simplicity · Self-Repair · Operational Processes
III. Services and Microservices
11. Why Use Services?
The Monolith Application · The Service-Based Application · The Ownership Benefit · The Scaling Benefit
12. Using Microservices
What Should Be a Service? · Dividing into Services · Guideline #1: Specific Business Requirements · Guideline #2: Distinct and Separable Team Ownership · Guideline #3: Naturally Separable Data · Guideline #4: Shared Capabilities/Data · Mixed Reasons · Going Too Far · The Right Balance
13. Dealing with Service Failures
Cascading Service Failures · Responding to a Service Failure · Predictable Response · Understandable Response · Reasonable Response · Determining Failures · Appropriate Action · Graceful Degradation · Graceful Backoff · Fail as Early as Possible · Customer-Caused Problems
IV. Scaling Applications
14. Two Mistakes High
What Is “Two Mistakes High”? · “Two Mistakes High” in Practice · Losing a Node · Problems During Upgrades · Data Center Resiliency · Hidden Shared Failure Types · Failure Loops · Managing Your Applications · The Space Shuttle
15. Service Ownership
Single Team Owned Service Architecture · Advantages of a STOSA Application and Organization · What Does it Mean to Be a Service Owner?
16. Service Tiers
Application Complexity · What Are Service Tiers? · Assigning Service Tier Labels to Services · Tier 1 · Tier 2 · Tier 3 · Tier 4 · Example: Online Store · What’s Next?
17. Using Service Tiers
Expectations · Responsiveness · Dependencies · Critical Dependency · Noncritical Dependency · Summary
18. Service-Level Agreements
What are Service-Level Agreements? · External Versus Internal SLAs · Why Are Internal SLAs Important? · SLAs as Trust · SLAs for Problem Diagnosis · Performance Measurements for SLAs · Limit SLAs · Top Percentile SLAs · Latency Groups · How Many and Which Internal SLAs? · Additional Comments on SLAs
19. Continuous Improvement
Examine Your Application Regularly · Microservices · Service Ownership · Stateless Services · Where’s the Data? · Data Partitioning · The Importance of Continuous Improvement
V. Cloud Services
20. Change and the Cloud
What Has Changed in the Cloud? · Acceptance of Microservice-Based Architectures · Smaller, More Specialized Services · Greater Focus on the Application · The Micro Startup · Security and Compliance Has Matured · Change Continues
21. Distributing the Cloud
AWS Architecture · AWS Region · AWS Availability Zone · Data Center · Architecture Overview · Availability Zones Are Not Data Centers · Maintaining Location Diversity for Availability Reasons
22. Managed Infrastructure
Structure of Cloud-Based Services · Raw Resource · Managed Resource (Server-Based) · Managed Resource (Non-server-based) · Implications of Using Managed Resources · Implications of Using Non-Managed Resources · Monitoring and CloudWatch
23. Cloud Resource Allocation
Allocated-Capacity Resource Allocation · Changing Allocations · Reserved Capacity · Usage-Based Resource Allocation · The “Magic” of Usage-Based Resource Allocation · The Pros and Cons of Resource Allocation Techniques
24. Scalable Computing Options
Cloud-Based Servers · Advantages · Disadvantages · Optimized Use Cases · Compute Slices · Advantages · Disadvantages · Optimized Use Cases · Dynamic Containers · Advantages · Disadvantages · Optimized Use Cases · Microcompute · Advantages · Disadvantages · Optimized Use Cases · Now What?
25. AWS Lambda
Using Lambda · Event Processing · Mobile Backend · Internet of Things Data Intake · Advantages and Disadvantages of Lambda
VI. Conclusion
26. Putting It All Together
Availability · Risk Management · Services · Scaling · Cloud · Architecting for Scale
Index

Overview

Every day, companies struggle to scale critical applications. As traffic volume and data demands increase, these applications become more complicated and brittle, exposing risks and compromising availability. This practical guide shows IT, DevOps, and system reliability managers how to prevent an application from becoming slow, inconsistent, or downright unavailable as it grows.

Scaling isn’t just about handling more users; it’s also about managing risk and ensuring availability. Author Lee Atchison provides basic techniques for building applications that can handle huge quantities of traffic, data, and demand without affecting the quality your customers expect.

In five parts, this book explores:

Availability: learn techniques for building highly available applications, and for tracking and improving availability going forward
Risk management: identify, mitigate, and manage risks in your application, test your recovery/disaster plans, and build out systems that contain fewer risks
Services and microservices: understand the value of services for building complicated applications that need to operate at higher scale
Scaling applications: assign services to specific teams, label the criticalness of each service, and devise failure scenarios and recovery plans
Cloud services: understand the structure of cloud-based services, resource allocation, and service distribution

Publisher Resources

ISBN: 9781491943380
