Software Mistakes and Tradeoffs

Book description

Optimize the decisions that define your code by exploring the common mistakes and intentional tradeoffs made by expert developers.

In Software Mistakes and Tradeoffs you will learn how to:

  • Reason about your systems to make better, more intuitive design decisions
  • Understand consequences and how to balance tradeoffs
  • Pick the right library for your problem
  • Thoroughly analyze all of your service’s dependencies
  • Understand delivery semantics and how they influence distributed architecture
  • Design and execute performance tests to detect code hot paths and validate a system’s SLA
  • Detect and optimize hot paths in your code to focus optimization efforts on root causes
  • Decide on a suitable data model for date/time handling to avoid common (but subtle) mistakes
  • Reason about compatibility and versioning to prevent unexpected problems for API clients
  • Understand tight/loose coupling and how it influences coordination of work between teams
  • Clarify requirements until they are precise, easily implemented, and easily tested
  • Optimize your APIs for friendly user experience

Code performance versus simplicity. Delivery speed versus duplication. Flexibility versus maintainability. Every decision you make in software engineering involves balancing tradeoffs. In Software Mistakes and Tradeoffs you’ll learn from costly mistakes that Tomasz Lelek and Jon Skeet have encountered over their impressive careers. You’ll explore real-world scenarios where poor understanding of tradeoffs leads to major problems down the road, so you can pre-empt your own mistakes with a more thoughtful approach to decision making.

Learn how code duplication impacts the coupling and evolution speed of your systems, and how simple-sounding requirements can have hidden nuances with respect to date and time information. Discover how to efficiently narrow your optimization scope according to the 80/20 Pareto principle, and ensure consistency in your distributed systems. You’ll soon have built up the kind of knowledge base that only comes from years of experience.

About the Technology
Every step in a software project involves making tradeoffs. When you’re balancing speed, security, cost, delivery time, features, and more, reasonable design choices may prove problematic in production. The expert insights and relatable war stories in this book will help you make good choices as you design and build applications.

About the Book
Software Mistakes and Tradeoffs explores real-world scenarios where the wrong tradeoff decisions were made and illuminates what could have been done differently. In it, authors Tomasz Lelek and Jon Skeet share wisdom based on decades of software engineering experience, including some delightfully instructive mistakes. You’ll appreciate the specific tips and practical techniques that accompany each example, along with evergreen patterns that will change the way you approach your next projects.

What's Inside
  • How to reason about your software systematically
  • How to pick tools, libraries, and frameworks
  • How tight and loose coupling affect team coordination
  • Requirements that are precise, easy to implement, and easy to test


About the Reader
For mid- and senior-level developers and architects who make decisions about software design and implementation.

About the Authors
Tomasz Lelek works daily with a wide range of production services, architectures, and JVM languages. A Google engineer and author of C# in Depth, Jon Skeet is famous for his many practical contributions to Stack Overflow.

Quotes
Great book that I wish I had earlier in my career. Many hard-learned lessons contained in these pages.
- Dave Corun, Avanade

Clear and to-the-point summation of years of real-life experience in software engineering. A must-read for all newcomers to the software engineering world.
- Rafael Avila Martinez, Mastercard

Shines a light on the intrinsic conflicts of the programming process and how they impact the code you write.
- Roberto Casadei, Università di Bologna

Summarizes the main pain points for every software developer and presents solutions in a clear and didactic way.
- Nelson González, General Electric

Table of contents

  1. inside front cover
  2. Software Mistakes and Tradeoffs
  3. Copyright
  4. dedication
  5. contents
  6. front matter
    1. preface
    2. acknowledgments
    3. about this book
      1. Who should read this book
      2. How this book is organized
      3. About the code
      4. liveBook discussion forum
    4. about the authors
    5. about the cover illustration
  7. 1 Introduction
    1. 1.1 Consequences of every decision and pattern
      1. 1.1.1 Unit testing decisions
      2. 1.1.2 Proportions of unit and integration tests
    2. 1.2 Code design patterns and why they do not always work
      1. 1.2.1 Measuring our code
    3. 1.3 Architecture design patterns and why they do not always work
      1. 1.3.1 Scalability and elasticity
      2. 1.3.2 Development speed
      3. 1.3.3 Complexity of microservices
    4. Summary
  8. 2 Code duplication is not always bad: Code duplication vs. flexibility
    1. 2.1 Common code between codebases and duplication
      1. 2.1.1 Adding a new business requirement that requires code duplication
      2. 2.1.2 Implementing the new business requirement
      3. 2.1.3 Evaluating the result
    2. 2.2 Libraries and sharing code between codebases
      1. 2.2.1 Evaluating the tradeoffs and disadvantages of shared libraries
      2. 2.2.2 Creating a shared library
    3. 2.3 Code extraction to a separate microservice
      1. 2.3.1 Looking at the tradeoffs and disadvantages of a separate service
      2. 2.3.2 Conclusions about separate service
    4. 2.4 Improving loose coupling by code duplication
    5. 2.5 An API design with inheritance to reduce duplication
      1. 2.5.1 Extracting a base request handler
      2. 2.5.2 Looking at inheritance and tight coupling
      3. 2.5.3 Looking at the tradeoffs between inheritance and composition
      4. 2.5.4 Looking at inherent and incidental duplication
    6. Summary
  9. 3 Exceptions vs. other patterns of handling errors in your code
    1. 3.1 Hierarchy of exceptions
      1. 3.1.1 Catch-all vs. a more granular approach to handling errors
    2. 3.2 Best patterns to handle exceptions in the code that you own
      1. 3.2.1 Handling checked exceptions in a public API
      2. 3.2.2 Handling unchecked exceptions in a public API
    3. 3.3 Anti-patterns in exception handling
      1. 3.3.1 Closing resources in case of an error
      2. 3.3.2 Anti-pattern of using exceptions to control application flow
    4. 3.4 Exceptions from third-party libraries
    5. 3.5 Exceptions in multithread environments
      1. 3.5.1 Exceptions in an async workflow with a promise API
    6. 3.6 Functional approach to handling errors with Try
      1. 3.6.1 Using Try in production code
      2. 3.6.2 Mixing Try with code that throws an exception
    7. 3.7 Performance comparison of exception-handling code
    8. Summary
  10. 4 Balancing flexibility and complexity
    1. 4.1 A robust but not extensible API
      1. 4.1.1 Designing a new component
      2. 4.1.2 Starting with the most straightforward code
    2. 4.2 Allowing clients to provide their own metrics framework
    3. 4.3 Providing extensibility of your APIs via hooks
      1. 4.3.1 Guarding against unpredictable usage of the hooks API
      2. 4.3.2 Performance impact of the hook API
    4. 4.4 Providing extensibility of your APIs via listeners
      1. 4.4.1 Using listeners vs. hooks
      2. 4.4.2 Immutability of our design
    5. 4.5 Flexibility analysis of an API vs. the cost of maintenance
    6. Summary
  11. 5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance
    1. 5.1 When premature optimization is evil
      1. 5.1.1 Creating accounts processing pipeline
      2. 5.1.2 Optimizing processing based on false assumptions
      3. 5.1.3 Benchmarking performance optimization
    2. 5.2 Hot paths in your code
      1. 5.2.1 Understanding the Pareto principle in the context of software systems
      2. 5.2.2 Configuring the number of concurrent users (threads) for a given SLA
    3. 5.3 A word service with a potential hot path
      1. 5.3.1 Getting the word of the day
      2. 5.3.2 Validating if the word exists
      3. 5.3.3 Exposing the WordsService using HTTP service
    4. 5.4 Hot path detection in your code
      1. 5.4.1 Creating API performance tests using Gatling
      2. 5.4.2 Measuring code paths using MetricRegistry
    5. 5.5 Improvements for hot path performance
      1. 5.5.1 Creating JMH microbenchmark for the existing solution
      2. 5.5.2 Optimizing word exists using a cache
      3. 5.5.3 Modifying performance tests to have more input words
    6. Summary
  12. 6 Simplicity vs. cost of maintenance for your API
    1. 6.1 A base library used by other tools
      1. 6.1.1 Creating a cloud service client
      2. 6.1.2 Exploring authentication strategies
      3. 6.1.3 Understanding the configuration mechanism
    2. 6.2 Directly exposing settings of a dependent library
      1. 6.2.1 Configuring the batch tool
    3. 6.3 A tool that is abstracting settings of a dependent library
      1. 6.3.1 Configuring the streaming tool
    4. 6.4 Adding new setting for the cloud client library
      1. 6.4.1 Adding a new setting to the batch tool
      2. 6.4.2 Adding a new setting to the streaming tool
      3. 6.4.3 Comparing both solutions for UX friendliness and maintainability
    5. 6.5 Deprecating/removing a setting in the cloud client library
      1. 6.5.1 Removing a setting from the batch tool
      2. 6.5.2 Removing a setting from the streaming tool
      3. 6.5.3 Comparing both solutions for UX friendliness and maintainability
    6. Summary
  13. 7 Working effectively with date and time data
    1. 7.1 Concepts in date and time information
      1. 7.1.1 Machine time: Instants, epochs, and durations
      2. 7.1.2 Civil time: Calendar systems, dates, times, and periods
      3. 7.1.3 Time zones, UTC, and offsets from UTC
      4. 7.1.4 Date and time concepts that hurt my head
    2. 7.2 Preparing to work with date and time information
      1. 7.2.1 Limiting your scope
      2. 7.2.2 Clarifying date and time requirements
      3. 7.2.3 Using the right libraries or packages
    3. 7.3 Implementing date and time code
      1. 7.3.1 Applying concepts consistently
      2. 7.3.2 Improving testability by avoiding defaults
      3. 7.3.3 Representing date and time values in text
      4. 7.3.4 Explaining code with comments
    4. 7.4 Corner cases to specify and test
      1. 7.4.1 Calendar arithmetic
      2. 7.4.2 Time zone transitions at midnight
      3. 7.4.3 Handling ambiguous or skipped times
      4. 7.4.4 Working with evolving time zone data
    5. Summary
  14. 8 Leveraging data locality and memory of your machines
    1. 8.1 What is data locality?
      1. 8.1.1 Moving computations to data
      2. 8.1.2 Scaling processing using data locality
    2. 8.2 Data partitioning and splitting data
      1. 8.2.1 Offline big data partitioning
      2. 8.2.2 Partitioning vs. sharding
      3. 8.2.3 Partitioning algorithms
    3. 8.3 Join big data sets from multiple partitions
      1. 8.3.1 Joining data within the same physical machine
      2. 8.3.2 Joining that requires data movement
      3. 8.3.3 Optimizing join leveraging broadcasting
    4. 8.4 Data processing: Memory vs. disk
      1. 8.4.1 Using disk-based processing
      2. 8.4.2 Why do we need MapReduce?
      3. 8.4.3 Calculating access times
      4. 8.4.4 RAM-based processing
    5. 8.5 Implement joins using Apache Spark
      1. 8.5.1 Implementing a join without broadcast
      2. 8.5.2 Implementing a join with broadcast
    6. Summary
  15. 9 Third-party libraries: Libraries you use become your code
    1. 9.1 Importing a library and taking full responsibility for its settings: Beware of the defaults
    2. 9.2 Concurrency models and scalability
      1. 9.2.1 Using async and sync APIs
      2. 9.2.2 Distributed scalability
    3. 9.3 Testability
      1. 9.3.1 Testing library
      2. 9.3.2 Testing with fakes (test double) and mocks
      3. 9.3.3 Integration testing toolkit
    4. 9.4 Dependencies of third-party libraries
      1. 9.4.1 Avoiding version conflicts
      2. 9.4.2 Too many dependencies
    5. 9.5 Choosing and maintaining third-party dependencies
      1. 9.5.1 First impressions
      2. 9.5.2 Different approaches to reusing code
      3. 9.5.3 Vendor lock-in
      4. 9.5.4 Licensing
      5. 9.5.5 Libraries vs. frameworks
      6. 9.5.6 Security and updates
      7. 9.5.7 Decision checklist
    6. Summary
  16. 10 Consistency and atomicity in distributed systems
    1. 10.1 At-least-once delivery of data sources
      1. 10.1.1 Traffic between one-node services
      2. 10.1.2 Retrying an application’s call
      3. 10.1.3 Producing data and idempotency
      4. 10.1.4 Understanding Command Query Responsibility Segregation (CQRS)
    2. 10.2 A naive implementation of a deduplication library
    3. 10.3 Common mistakes when implementing deduplication in distributed systems
      1. 10.3.1 One node context
      2. 10.3.2 Multiple nodes context
    4. 10.4 Making your logic atomic to prevent race conditions
    5. Summary
  17. 11 Delivery semantics in distributed systems
    1. 11.1 Architecture of event-driven applications
    2. 11.2 Producer and consumer applications based on Apache Kafka
      1. 11.2.1 Looking at the Kafka consumer side
      2. 11.2.2 Understanding the Kafka brokers setup
    3. 11.3 The producer logic
      1. 11.3.1 Choosing consistency vs. availability for the producer
    4. 11.4 Consumer code and different delivery semantics
      1. 11.4.1 Committing a consumer manually
      2. 11.4.2 Restarting from the earliest or latest offsets
      3. 11.4.3 (Effectively) exactly-once semantic
    5. 11.5 Leveraging delivery guarantees to provide fault tolerance
    6. Summary
  18. 12 Managing versioning and compatibility
    1. 12.1 Versioning in the abstract
      1. 12.1.1 Properties of versions
      2. 12.1.2 Backward and forward compatibility
      3. 12.1.3 Semantic versioning
      4. 12.1.4 Marketing versions
    2. 12.2 Versioning for libraries
      1. 12.2.1 Source, binary, and semantic compatibility
      2. 12.2.2 Dependency graphs and diamond dependencies
      3. 12.2.3 Techniques for handling breaking changes
      4. 12.2.4 Managing internal-only libraries
    3. 12.3 Versioning for network APIs
      1. 12.3.1 The context of network API calls
      2. 12.3.2 Customer-friendly clarity
      3. 12.3.3 Common versioning strategies
      4. 12.3.4 Further versioning considerations
    4. 12.4 Versioning for data storage
      1. 12.4.1 A brief introduction to Protocol Buffers
      2. 12.4.2 What is a breaking change?
      3. 12.4.3 Migrating data within a storage system
      4. 12.4.4 Expecting the unexpected
      5. 12.4.5 Separating API and storage representations
      6. 12.4.6 Evaluating storage formats
    5. Summary
  19. 13 Keeping up to date with trends vs. cost of maintenance of your code
    1. 13.1 When to use dependency injection frameworks
      1. 13.1.1 Do-it-yourself (DIY) dependency injection
      2. 13.1.2 Using a dependency injection framework
    2. 13.2 When to use reactive programming
      1. 13.2.1 Creating single-threaded, blocking processing
      2. 13.2.2 Using CompletableFuture
      3. 13.2.3 Implementing a reactive solution
    3. 13.3 When to use functional programming
      1. 13.3.1 Creating functional code in a nonfunctional language
      2. 13.3.2 Tail recursion optimization
      3. 13.3.3 Leveraging immutability
    4. 13.4 Using lazy vs. eager evaluation
    5. Summary
  20. index
  21. inside back cover

Product information

  • Title: Software Mistakes and Tradeoffs
  • Author(s): Tomasz Lelek, Jonathan Skeet
  • Release date: May 2022
  • Publisher(s): Manning Publications
  • ISBN: 9781617299209