Software Engineering at Google

Book description

Today, software engineers need to know not only how to program effectively but also how to develop proper engineering practices to make their codebase sustainable and healthy. This book emphasizes this difference between programming and software engineering.

How can software engineers manage a living codebase that evolves and responds to changing requirements and demands over the length of its life? Based on their experience at Google, software engineers Titus Winters and Hyrum Wright, along with technical writer Tom Manshreck, present a candid and insightful look at how some of the world’s leading practitioners construct and maintain software. This book covers Google’s unique engineering culture, processes, and tools and how these aspects contribute to the effectiveness of an engineering organization.

You’ll explore three fundamental principles that software organizations should keep in mind when designing, architecting, writing, and maintaining code:

  • How time affects the sustainability of software and how to make your code resilient over time
  • How scale affects the viability of software practices within an engineering organization
  • What trade-offs a typical engineer needs to make when evaluating design and development decisions

Table of contents

  1. Foreword
  2. Preface
    1. Programming Over Time
    2. Google’s Perspective
    3. What This Book Isn’t
    4. Parting Remarks
    5. Conventions Used in This Book
    6. O’Reilly Online Learning
    7. How to Contact Us
    8. Acknowledgments
  3. I. Thesis
  4. 1. What Is Software Engineering?
    1. Time and Change
      1. Hyrum’s Law
      2. Example: Hash Ordering
      3. Why Not Just Aim for “Nothing Changes”?
    2. Scale and Efficiency
      1. Policies That Don’t Scale
      2. Policies That Scale Well
      3. Example: Compiler Upgrade
      4. Shifting Left
    3. Trade-offs and Costs
      1. Example: Markers
      2. Inputs to Decision Making
      3. Example: Distributed Builds
      4. Example: Deciding Between Time and Scale
      5. Revisiting Decisions, Making Mistakes
    4. Software Engineering Versus Programming
    5. Conclusion
    6. TL;DRs
  5. II. Culture
  6. 2. How to Work Well on Teams
    1. Help Me Hide My Code
    2. The Genius Myth
    3. Hiding Considered Harmful
      1. Early Detection
      2. The Bus Factor
      3. Pace of Progress
      4. In Short, Don’t Hide
    4. It’s All About the Team
      1. The Three Pillars of Social Interaction
      2. Why Do These Pillars Matter?
      3. Humility, Respect, and Trust in Practice
      4. Blameless Post-Mortem Culture
      5. Being Googley
    5. Conclusion
    6. TL;DRs
  7. 3. Knowledge Sharing
    1. Challenges to Learning
    2. Philosophy
    3. Setting the Stage: Psychological Safety
      1. Mentorship
      2. Psychological Safety in Large Groups
    4. Growing Your Knowledge
      1. Ask Questions
      2. Understand Context
    5. Scaling Your Questions: Ask the Community
      1. Group Chats
      2. Mailing Lists
      3. YAQS: Question-and-Answer Platform
    6. Scaling Your Knowledge: You Always Have Something to Teach
      1. Office Hours
      2. Tech Talks and Classes
      3. Documentation
      4. Code
    7. Scaling Your Organization’s Knowledge
      1. Cultivating a Knowledge-Sharing Culture
      2. Establishing Canonical Sources of Information
      3. Staying in the Loop
    8. Readability: Standardized Mentorship Through Code Review
      1. What Is the Readability Process?
      2. Why Have This Process?
    9. Conclusion
    10. TL;DRs
  8. 4. Engineering for Equity
    1. Bias Is the Default
    2. Understanding the Need for Diversity
    3. Building Multicultural Capacity
    4. Making Diversity Actionable
    5. Reject Singular Approaches
    6. Challenge Established Processes
    7. Values Versus Outcomes
    8. Stay Curious, Push Forward
    9. Conclusion
    10. TL;DRs
  9. 5. How to Lead a Team
    1. Managers and Tech Leads (and Both)
      1. The Engineering Manager
      2. The Tech Lead
      3. The Tech Lead Manager
    2. Moving from an Individual Contributor Role to a Leadership Role
      1. The Only Thing to Fear Is…Well, Everything
      2. Servant Leadership
    3. The Engineering Manager
      1. Manager Is a Four-Letter Word
      2. Today’s Engineering Manager
    4. Antipatterns
      1. Antipattern: Hire Pushovers
      2. Antipattern: Ignore Low Performers
      3. Antipattern: Ignore Human Issues
      4. Antipattern: Be Everyone’s Friend
      5. Antipattern: Compromise the Hiring Bar
      6. Antipattern: Treat Your Team Like Children
    5. Positive Patterns
      1. Lose the Ego
      2. Be a Zen Master
      3. Be a Catalyst
      4. Remove Roadblocks
      5. Be a Teacher and a Mentor
      6. Set Clear Goals
      7. Be Honest
      8. Track Happiness
    6. The Unexpected Question
    7. Other Tips and Tricks
    8. People Are Like Plants
      1. Intrinsic Versus Extrinsic Motivation
    9. Conclusion
    10. TL;DRs
  10. 6. Leading at Scale
    1. Always Be Deciding
      1. The Parable of the Airplane
      2. Identify the Blinders
      3. Identify the Key Trade-Offs
      4. Decide, Then Iterate
    2. Always Be Leaving
      1. Your Mission: Build a “Self-Driving” Team
      2. Dividing the Problem Space
    3. Always Be Scaling
      1. The Cycle of Success
      2. Important Versus Urgent
      3. Learn to Drop Balls
      4. Protecting Your Energy
    4. Conclusion
    5. TL;DRs
  11. 7. Measuring Engineering Productivity
    1. Why Should We Measure Engineering Productivity?
    2. Triage: Is It Even Worth Measuring?
    3. Selecting Meaningful Metrics with Goals and Signals
    4. Goals
    5. Signals
    6. Metrics
    7. Using Data to Validate Metrics
    8. Taking Action and Tracking Results
    9. Conclusion
    10. TL;DRs
  12. III. Processes
  13. 8. Style Guides and Rules
    1. Why Have Rules?
    2. Creating the Rules
      1. Guiding Principles
      2. The Style Guide
    3. Changing the Rules
      1. The Process
      2. The Style Arbiters
      3. Exceptions
    4. Guidance
    5. Applying the Rules
      1. Error Checkers
      2. Code Formatters
    6. Conclusion
    7. TL;DRs
  14. 9. Code Review
    1. Code Review Flow
    2. How Code Review Works at Google
    3. Code Review Benefits
      1. Code Correctness
      2. Comprehension of Code
      3. Code Consistency
      4. Psychological and Cultural Benefits
      5. Knowledge Sharing
    4. Code Review Best Practices
      1. Be Polite and Professional
      2. Write Small Changes
      3. Write Good Change Descriptions
      4. Keep Reviewers to a Minimum
      5. Automate Where Possible
    5. Types of Code Reviews
      1. Greenfield Code Reviews
      2. Behavioral Changes, Improvements, and Optimizations
      3. Bug Fixes and Rollbacks
      4. Refactorings and Large-Scale Changes
    6. Conclusion
    7. TL;DRs
  15. 10. Documentation
    1. What Qualifies as Documentation?
    2. Why Is Documentation Needed?
    3. Documentation Is Like Code
    4. Know Your Audience
      1. Types of Audiences
    5. Documentation Types
      1. Reference Documentation
      2. Design Docs
      3. Tutorials
      4. Conceptual Documentation
      5. Landing Pages
    6. Documentation Reviews
    7. Documentation Philosophy
      1. WHO, WHAT, WHEN, WHERE, and WHY
      2. The Beginning, Middle, and End
      3. The Parameters of Good Documentation
      4. Deprecating Documents
    8. When Do You Need Technical Writers?
    9. Conclusion
    10. TL;DRs
  16. 11. Testing Overview
    1. Why Do We Write Tests?
      1. The Story of Google Web Server
      2. Testing at the Speed of Modern Development
      3. Write, Run, React
      4. Benefits of Testing Code
    2. Designing a Test Suite
      1. Test Size
      2. Test Scope
      3. The Beyoncé Rule
      4. A Note on Code Coverage
    3. Testing at Google Scale
      1. The Pitfalls of a Large Test Suite
    4. History of Testing at Google
      1. Orientation Classes
      2. Test Certified
      3. Testing on the Toilet
      4. Testing Culture Today
    5. The Limits of Automated Testing
    6. Conclusion
    7. TL;DRs
  17. 12. Unit Testing
    1. The Importance of Maintainability
    2. Preventing Brittle Tests
      1. Strive for Unchanging Tests
      2. Test via Public APIs
      3. Test State, Not Interactions
    3. Writing Clear Tests
      1. Make Your Tests Complete and Concise
      2. Test Behaviors, Not Methods
      3. Don’t Put Logic in Tests
      4. Write Clear Failure Messages
    4. Tests and Code Sharing: DAMP, Not DRY
      1. Shared Values
      2. Shared Setup
      3. Shared Helpers and Validation
      4. Defining Test Infrastructure
    5. Conclusion
    6. TL;DRs
  18. 13. Test Doubles
    1. The Impact of Test Doubles on Software Development
    2. Test Doubles at Google
    3. Basic Concepts
      1. An Example Test Double
      2. Seams
      3. Mocking Frameworks
    4. Techniques for Using Test Doubles
      1. Faking
      2. Stubbing
      3. Interaction Testing
    5. Real Implementations
      1. Prefer Realism Over Isolation
      2. How to Decide When to Use a Real Implementation
    6. Faking
      1. Why Are Fakes Important?
      2. When Should Fakes Be Written?
      3. The Fidelity of Fakes
      4. Fakes Should Be Tested
      5. What to Do If a Fake Is Not Available
    7. Stubbing
      1. The Dangers of Overusing Stubbing
      2. When Is Stubbing Appropriate?
    8. Interaction Testing
      1. Prefer State Testing Over Interaction Testing
      2. When Is Interaction Testing Appropriate?
      3. Best Practices for Interaction Testing
    9. Conclusion
    10. TL;DRs
  19. 14. Larger Testing
    1. What Are Larger Tests?
      1. Fidelity
      2. Common Gaps in Unit Tests
      3. Why Not Have Larger Tests?
    2. Larger Tests at Google
      1. Larger Tests and Time
      2. Larger Tests at Google Scale
    3. Structure of a Large Test
      1. The System Under Test
      2. Test Data
      3. Verification
    4. Types of Larger Tests
      1. Functional Testing of One or More Interacting Binaries
      2. Browser and Device Testing
      3. Performance, Load, and Stress Testing
      4. Deployment Configuration Testing
      5. Exploratory Testing
      6. A/B Diff Regression Testing
      7. UAT
      8. Probers and Canary Analysis
      9. Disaster Recovery and Chaos Engineering
      10. User Evaluation
    5. Large Tests and the Developer Workflow
      1. Authoring Large Tests
      2. Running Large Tests
      3. Owning Large Tests
    6. Conclusion
    7. TL;DRs
  20. 15. Deprecation
    1. Why Deprecate?
    2. Why Is Deprecation So Hard?
      1. Deprecation During Design
    3. Types of Deprecation
      1. Advisory Deprecation
      2. Compulsory Deprecation
      3. Deprecation Warnings
    4. Managing the Deprecation Process
      1. Process Owners
      2. Milestones
      3. Deprecation Tooling
    5. Conclusion
    6. TL;DRs
  21. IV. Tools
  22. 16. Version Control and Branch Management
    1. What Is Version Control?
      1. Why Is Version Control Important?
      2. Centralized VCS Versus Distributed VCS
      3. Source of Truth
      4. Version Control Versus Dependency Management
    2. Branch Management
      1. Work in Progress Is Akin to a Branch
      2. Dev Branches
      3. Release Branches
    3. Version Control at Google
      1. One Version
      2. Scenario: Multiple Available Versions
      3. The “One-Version” Rule
      4. (Nearly) No Long-Lived Branches
      5. What About Release Branches?
    4. Monorepos
    5. Future of Version Control
    6. Conclusion
    7. TL;DRs
  23. 17. Code Search
    1. The Code Search UI
    2. How Do Googlers Use Code Search?
      1. Where?
      2. What?
      3. How?
      4. Why?
      5. Who and When?
    3. Why a Separate Web Tool?
      1. Scale
      2. Zero Setup Global Code View
      3. Specialization
      4. Integration with Other Developer Tools
      5. API Exposure
    4. Impact of Scale on Design
      1. Search Query Latency
      2. Index Latency
    5. Google’s Implementation
      1. Search Index
      2. Ranking
    6. Selected Trade-Offs
      1. Completeness: Repository at Head
      2. Completeness: All Versus Most-Relevant Results
      3. Completeness: Head Versus Branches Versus All History Versus Workspaces
      4. Expressiveness: Token Versus Substring Versus Regex
    7. Conclusion
    8. TL;DRs
  24. 18. Build Systems and Build Philosophy
    1. Purpose of a Build System
    2. What Happens Without a Build System?
      1. But All I Need Is a Compiler!
      2. Shell Scripts to the Rescue?
    3. Modern Build Systems
      1. It’s All About Dependencies
      2. Task-Based Build Systems
      3. Artifact-Based Build Systems
      4. Distributed Builds
      5. Time, Scale, Trade-Offs
    4. Dealing with Modules and Dependencies
      1. Using Fine-Grained Modules and the 1:1:1 Rule
      2. Minimizing Module Visibility
      3. Managing Dependencies
    5. Conclusion
    6. TL;DRs
  25. 19. Critique: Google’s Code Review Tool
    1. Code Review Tooling Principles
    2. Code Review Flow
      1. Notifications
    3. Stage 1: Create a Change
      1. Diffing
      2. Analysis Results
      3. Tight Tool Integration
    4. Stage 2: Request Review
    5. Stages 3 and 4: Understanding and Commenting on a Change
      1. Commenting
      2. Understanding the State of a Change
    6. Stage 5: Change Approvals (Scoring a Change)
    7. Stage 6: Committing a Change
      1. After Commit: Tracking History
    8. Conclusion
    9. TL;DRs
  26. 20. Static Analysis
    1. Characteristics of Effective Static Analysis
      1. Scalability
      2. Usability
    2. Key Lessons in Making Static Analysis Work
      1. Focus on Developer Happiness
      2. Make Static Analysis a Part of the Core Developer Workflow
      3. Empower Users to Contribute
    3. Tricorder: Google’s Static Analysis Platform
      1. Integrated Tools
      2. Integrated Feedback Channels
      3. Suggested Fixes
      4. Per-Project Customization
      5. Presubmits
      6. Compiler Integration
      7. Analysis While Editing and Browsing Code
    4. Conclusion
    5. TL;DRs
  27. 21. Dependency Management
    1. Why Is Dependency Management So Difficult?
      1. Conflicting Requirements and Diamond Dependencies
    2. Importing Dependencies
      1. Compatibility Promises
      2. Considerations When Importing
      3. How Google Handles Importing Dependencies
    3. Dependency Management, In Theory
      1. Nothing Changes (aka The Static Dependency Model)
      2. Semantic Versioning
      3. Bundled Distribution Models
      4. Live at Head
    4. The Limitations of SemVer
      1. SemVer Might Overconstrain
      2. SemVer Might Overpromise
      3. Motivations
      4. Minimum Version Selection
      5. So, Does SemVer Work?
    5. Dependency Management with Infinite Resources
      1. Exporting Dependencies
    6. Conclusion
    7. TL;DRs
  28. 22. Large-Scale Changes
    1. What Is a Large-Scale Change?
    2. Who Deals with LSCs?
    3. Barriers to Atomic Changes
      1. Technical Limitations
      2. Merge Conflicts
      3. No Haunted Graveyards
      4. Heterogeneity
      5. Testing
      6. Code Review
    4. LSC Infrastructure
      1. Policies and Culture
      2. Codebase Insight
      3. Change Management
      4. Testing
      5. Language Support
    5. The LSC Process
      1. Authorization
      2. Change Creation
      3. Sharding and Submitting
      4. Cleanup
    6. Conclusion
    7. TL;DRs
  29. 23. Continuous Integration
    1. CI Concepts
      1. Fast Feedback Loops
      2. Automation
      3. Continuous Testing
      4. CI Challenges
      5. Hermetic Testing
    2. CI at Google
      1. CI Case Study: Google Takeout
      2. But I Can’t Afford CI
    3. Conclusion
    4. TL;DRs
  30. 24. Continuous Delivery
    1. Idioms of Continuous Delivery at Google
    2. Velocity Is a Team Sport: How to Break Up a Deployment into Manageable Pieces
    3. Evaluating Changes in Isolation: Flag-Guarding Features
    4. Striving for Agility: Setting Up a Release Train
      1. No Binary Is Perfect
      2. Meet Your Release Deadline
    5. Quality and User-Focus: Ship Only What Gets Used
    6. Shifting Left: Making Data-Driven Decisions Earlier
    7. Changing Team Culture: Building Discipline into Deployment
    8. Conclusion
    9. TL;DRs
  31. 25. Compute as a Service
    1. Taming the Compute Environment
      1. Automation of Toil
      2. Containerization and Multitenancy
      3. Summary
    2. Writing Software for Managed Compute
      1. Architecting for Failure
      2. Batch Versus Serving
      3. Managing State
      4. Connecting to a Service
      5. One-Off Code
    3. CaaS Over Time and Scale
      1. Containers as an Abstraction
      2. One Service to Rule Them All
      3. Submitted Configuration
    4. Choosing a Compute Service
      1. Centralization Versus Customization
      2. Level of Abstraction: Serverless
      3. Public Versus Private
    5. Conclusion
    6. TL;DRs
  32. V. Conclusion
  33. Afterword
  34. Index

Product information

  • Title: Software Engineering at Google
  • Author(s): Titus Winters, Tom Manshreck, Hyrum Wright
  • Release date: March 2020
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492082798