Enabling Microservice Success

Book description

Microservices can be a very effective approach for delivering value to your organization and to your customers. If you get them right, microservices help you to move fast, making changes to small parts of your system hundreds of times a day. But get them wrong and microservices just make everything more complicated.

In this book, technical strategist Sarah Wells provides practical, in-depth advice for moving to microservices. Having built her first microservices architecture in 2013 for the Financial Times, Sarah discusses the approaches you need to take from the start and explains the traps most likely to trip you up. You'll also learn how to maintain the architecture as your systems mature while minimizing the time you spend on support and maintenance.

With this book, you will:

  • Learn the impact of microservices on software development patterns and practices
  • Identify the organizational changes you need to make to successfully build and operate this architecture
  • Determine the steps you must take before you move to microservices
  • Understand the traps to avoid when you create a microservices architecture—and learn how to recover if you fall into one

Table of contents

  1. Preface
    1. Why I Wrote This Book
    2. Who Should Read This Book
    3. Navigating This Book
      1. Part I: Context
      2. Part II: Organizational Structure and Culture
      3. Part III: Building and Operating
      4. Appendices
    4. Conventions Used in This Book
    5. O’Reilly Online Learning
    6. How to Contact Us
    7. Acknowledgments
  2. I. Context
  3. 1. Understanding Microservices
    1. Defining the Microservice Architectural Style
      1. A Suite of Services
      2. Each Running In Its Own Process
      3. Communicating With Lightweight Mechanisms
      4. Built Around Business Capabilities
      5. Independently Deployable
      6. “Small”
      7. With a Bare Minimum of Centralized Management
      8. Heterogeneous
    2. Forerunners and Alternatives
      1. The Monolith
      2. Modular Monoliths
      3. Service-Oriented Architecture
    3. The Microservices Ecosystem
      1. Infrastructure as Code
      2. Continuous Delivery
      3. The Public Cloud
      4. New Deployment Options
      5. DevOps
      6. Observability
    4. Advantages of Microservices
      1. Independently Scalable
      2. Robust
      3. Easy to Release Small Changes Frequently
      4. Support Flexible Technology Choices
    5. Challenges of Microservices
      1. Latency
      2. Estate Complexity
      3. Operational Complexity
      4. Data Consistency
      5. Security
      6. Finding the Right Level of Granularity
      7. Handling Change
      8. Require Organizational Change
      9. Change the Developer Experience
    6. In Summary
  4. 2. Effective Software Delivery
    1. Regularly Delivering Business Value
      1. High Deployment Frequency
      2. Short Lead Time for Changes
      3. Running Experiments
      4. Separating Deploying Code from Releasing Functionality
      5. Handling Work That Goes Across Team Boundaries
    2. Adapting to Changing Priorities
    3. Maintaining Appropriate Service Levels
      1. When a Release Goes Wrong
      2. Knowing When Something Important Is Broken
      3. Restore Some Level of Service Quickly
      4. Avoid Failure Cascades
    4. Spending Most of Your Time on Meaningful Work
    5. Not Having to Start Again
    6. Keeping Risk at an Acceptable Level
    7. How Microservices Measure Up
    8. In Summary
  5. 3. Are Microservices Right for You?
    1. Reasons to Choose Microservices
      1. Scaling the Organization
      2. Developer Experience
      3. Separating Out Areas with Compliance and Security Requirements
      4. Scaling For Load
      5. Increasing Robustness
      6. Increasing Flexibility
    2. Conditions for Success
      1. Domain Understanding
      2. Products Not Projects
      3. Leadership Support
      4. Teams That Want Autonomy
      5. Processes That Enable Autonomy
      6. Technical Maturity
    3. Managing Change
    4. Sticking with a Monolithic Architecture
      1. Enable Zero-Downtime Deployments
      2. Build a Modular Monolith
    5. Everything is Distributed Now
      1. The Rise of Cloud Native
      2. SaaS Makes Sense
    6. Recommendations
      1. Starting from Scratch
      2. Replacing an Existing Monolith
      3. Measuring Success
    7. In Summary
  6. II. Organizational Structure and Culture
  7. 4. Conway’s Law and Finding the Right Boundaries
    1. Conway’s Law
      1. The Inverse Conway Maneuver
    2. Possible Boundaries
      1. Business Domains
      2. Locations
      3. Technologies
      4. Compliance
      5. Tolerance for Failure
      6. Frequency of Changes
      7. Recommendations
    3. Identifying When Boundaries are Wrong
    4. Case Study: Finding Our Boundaries at The FT
    5. In Summary
  8. 5. Building Effective Teams
    1. Organizational Culture
      1. Open
      2. Learning
      3. Empowering
      4. Optimized for Change
      5. The Westrum Model
    2. Effective Teams
      1. Motivated through Autonomy, Mastery and Purpose
      2. Aligned to Business Domain
      3. Appropriately Sized
      4. Cross Functional and T-shaped
      5. Strong Ownership
      6. Long Lived
      7. Sustainable Cognitive Load
      8. High Trust and High Psychological Safety
      9. Part of a Group
    3. Optimizing for Flow
      1. Stream-Aligned
      2. Enabling
      3. Complicated Subsystem
      4. Platform
    4. Case Study: The Evolution of the Organizational Structure at The FT
      1. Changing the Organizational Culture
      2. Changing Team Setup
      3. Key Changes To Enable Microservices
    5. In Summary
  9. 6. Enabling Autonomy
    1. What Is Autonomy?
    2. Why Does Autonomy Matter?
      1. Limits to Autonomy
    3. The Right Amount of Communication
    4. Interaction Styles
      1. Collaboration
      2. X-as-a-Service
      3. Facilitating
    5. Ways of Working that Support Autonomy
      1. Aligning on Outcomes
      2. Light Touch Governance
      3. Trust But Verify
      4. Agreeing and Aligning on Technology
      5. The Role of the Individual Contributor
      6. Minimum Viable Competencies
      7. Making Space For Learning
    6. Responsibilities of Autonomous Teams
      1. Active Ownership
      2. Communication and Cooperation
      3. Compliance with Standards
      4. Maintaining a Team Page
    7. In Summary
  10. 7. Engineering Enablement and Paving the Road
    1. What’s in a Name?
    2. Building a Platform
      1. Platform Services
      2. Organization-Level Concerns
      3. Building the Thinnest Viable Platform
      4. Build For the Needs of the Majority
      5. Platform as a Product
    3. Beyond the Platform
      1. Vendor Engineering
      2. APIs, Templates, Libraries, and Examples
      3. A Service Catalog
      4. Insights
    4. Paving the Road
      1. What Capabilities to Include
      2. Make it Optional
      3. Keep it Small
      4. How to Go Off-Road
      5. Bringing the Treasure Back
      6. Internal Developer Portals
    5. Building a Platform People Actually Use
      1. Making Sure What You Build Meets a Need
      2. Market It
      3. Look for Signs You Are Getting it Wrong
    6. Principles for Building a Paved Road
      1. Optional
      2. Provides Value
      3. Self-Service
      4. Owned & Supported
      5. Easy to Use
      6. Guides People to Do the Right Thing
      7. Composable and Extendable
    7. Measuring Impact
    8. When to Invest in Engineering Enablement
    9. In Summary
  11. 8. Ensuring “You Build It, You Run It”
    1. Why Microservices Implies DevOps
      1. Release on Demand
      2. Work on Operational Features
    2. Building Things Differently
      1. Good Runbooks
      2. Running on Someone Else’s Servers
      3. Getting Comfortable in Production
    3. Supporting Your System in Production
      1. Assign Dedicated In-Hours Ops Support
      2. Improve Alerts and Documentation
      3. Identify the Haunted Forests
      4. Practice
    4. Out-of-Hours Support
      1. Allow People to Opt Out
      2. Formal Rotas vs Best Endeavors
      3. Make Sure Calls Are Rare
      4. Only For Critical Systems
      5. Provide Support and Guidance
    5. Incident Management
      1. Blameless Culture
      2. Raising An Incident
      3. Roles to Assign
      4. During the Incident
      5. After the Incident
      6. Learning From Incidents
    6. In Summary
  12. III. Building and Operating
  13. 9. Active Service Ownership
    1. Responding to the Log4Shell Vulnerability
      1. A Counter Example: Equifax and a Struts Vulnerability
    2. Ownership During Active Development
      1. Strong Ownership
      2. Weak Ownership
      3. Collective Ownership
    3. Once a Service is Feature Complete
      1. No Ownership
      2. Nominal Ownership
      3. Active Ownership
    4. What Active Ownership Means
      1. Code Stewardship
      2. Upgrades and Patching
      3. Migrations
      4. Production Support
      5. Documentation
    5. Knowing Your Estate
      1. Your Own Software
      2. Dependencies
      3. Third-Party Software
    6. What You Need From A Service Registry
      1. Graph-Based Model
      2. API-Driven
      3. Extensible
      4. Flexible Schema
      5. Provides Different Views Across the Estate
    7. Transferring Ownership
      1. What Does a Good Transfer Look Like?
      2. Meeting Quality Expectations
      3. Operational Handover
      4. Replacing
    8. What to Do If You’re Struggling
      1. Make the Business Case
      2. Start with Critical Systems
      3. Make Your Best Guess at Owners
      4. Deliver Value from the Data
      5. Aim for Continuous Improvement
      6. Look For Teams That Are Overwhelmed
      7. Services Shouldn’t Live Forever
    9. In Summary
  14. 10. Getting Value from Testing
    1. Why Do We Test?
      1. Building the Thing Right
      2. Building the Right Thing
      3. Picking Up Regressions
      4. Meeting Quality-of-Service Requirements
    2. Shifting Testing Left
    3. What Makes a Good Test?
      1. Fast and Early Feedback
      2. Easy to Change
      3. Finds Real Problems
    4. Types of Testing
      1. The Testing Pyramid
      2. Unit Tests
      3. Service Tests
      4. End-to-End Tests
      5. Contract Tests
      6. Consistency Tests
      7. Exploratory Tests
      8. Cross-Functional Testing
    5. Testing in Production
      1. Is It Safe?
      2. Staging Is Not Production-Like
      3. Your Customers Can Surprise You
      4. You Can’t Test for Every Variation
      5. You Don’t Have to Roll a Change Out to Everyone
      6. Monitoring as Testing
      7. Coherence Testing
    6. Testing Your Infrastructure
      1. Chaos Engineering
      2. Testing Failovers and Restores
    7. Quality is About More Than Testing
    8. What to Do If You’re Struggling
      1. Not Enough Automated Testing
      2. Tests That Aren’t Providing Value
    9. In Summary
  15. 11. Governance and Standardization: Finding the Balance
    1. Why Governance Matters
    2. Know Your Estate
      1. What Sort of Information is Relevant?
    3. Guardrails and Policies
      1. Automating Guardrails
      2. What To Include
      3. The FT’s Guardrails
    4. Aligning on Guardrails
      1. Tech Governance Group
      2. Benefits of the TGG
    5. Choosing Technologies
      1. The Technology Lifecycle
      2. Save Innovation for Key Business Outcomes
      3. Use Boring Technology
      4. Limit the Alternatives
      5. Be Clear on Where Duplication is Acceptable
      6. Expect Things to Change
    6. Insight Leads to Action
    7. Governance in Other Organizations
      1. Governance at Monzo
      2. Governance at Skyscanner
    8. What to Do If You’re Struggling
    9. In Summary
  16. 12. Building Resilience In
    1. What is Resilience?
      1. Resilience for Distributed Systems
      2. Resilience for Microservices
    2. Understanding Your Service Level Requirements
      1. Service Level Objectives
      2. Error Budgets
    3. Building Resilient Services
      1. Redundancy
      2. Fast Startup and Graceful Shutdown
      3. Set Appropriate Timeouts
      4. Back Off and Retry
      5. Make Your Requests Idempotent
      6. Protect Yourself
      7. Testing Service Resilience
      8. Make Building Resilient Services Easy
    4. Building Resilient Systems
      1. Caching
      2. Handling Cascading Failures
      3. Fallback Behavior
      4. Avoiding Unnecessary Work
      5. Go Asynchronous
      6. Failover
      7. Backup and Restore
      8. Disaster Recovery
    5. Building a Resilient Platform
      1. Resilience to External Issues
      2. Internal Tooling
    6. Validating Your Resilience Choices
      1. Chaos Engineering
      2. Testing Backup and Restore
      3. Practice Makes Perfect
      4. Load Testing
      5. Learn from Incidents
      6. One Thing At A Time
    7. What to Do If You’re Struggling
    8. In Summary
  17. 13. Running Your System In Production
    1. Operational Challenges of Microservices
      1. Different Technologies Mean Different Support Knowledge Is Needed
      2. Ephemeral Infrastructure
      3. Rapid Change
      4. Alert Overload
      5. Complex Systems Run In Degraded Mode
    2. Building Observability In
      1. Logging
      2. Monitoring and Metrics
      3. Log Aggregation
      4. OpenTelemetry
      5. Focus on Events
      6. Distributed Tracing
      7. Archiving Observability Data
    3. Building Your Own Tools
    4. Spotting Issues
      1. Getting Alerting Right
      2. Healthchecks
      3. Monitoring Business Outcomes
      4. Understanding What Normal Looks Like
    5. Mitigation
    6. Troubleshooting
      1. Maintaining Useful Documentation
      2. Knowing What’s Changed
      3. Problems With External Systems
      4. Tooling Characteristics
    7. Learning From Incidents
    8. What to Do If You’re Struggling
    9. In Summary
  18. 14. Keeping Things Up to Date
    1. Why Is This a Challenge?
    2. Minimizing the Impact of Change
      1. Think About the Long Term
      2. A Reason to Be On the Paved Road
      3. Choose Managed Services and SaaS Options
      4. Provide APIs
      5. Immutable and Ephemeral Infrastructure
      6. Decommission and Deprecate
    3. Types of Change
      1. Emergency Changes
      2. Minor Planned Changes
      3. Major Planned Changes
    4. Responding to Change
      1. Understand the Landscape
      2. Define Guiding Policies
    5. Making a Decision
      1. Who Gets to Decide?
      2. Scheduling Work
    6. Managing Change
      1. Clarity
      2. Communication
      3. Empathy
      4. Execution
    7. What to Do If You’re Struggling
    8. In Summary
  19. Afterword
    1. Why Microservices?
      1. The Importance of Flow
      2. Support for Autonomy
      3. The Rise of Platform Engineering
      4. Wrapping Up
  20. A. Microservices Assessment
    1. Do You Need Microservices?
      1. Scaling Challenges
      2. Technical Reasons
    2. Spotting Potential Pitfalls
      1. Organizational Structure and Culture
      2. Software Delivery Approach
  21. B. Recommended Reading
    1. Part 1: Context
    2. Part 2: Organizational Structure and Culture
    3. Part 3: Building and Operating
  22. Index
  23. About the Author

Product information

  • Title: Enabling Microservice Success
  • Author(s): Sarah Wells
  • Release date: April 2024
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098130794