Chapter 1. Toward a Microservices Architecture

The goal of this book is to help you build a working microservices architecture. In these pages you’ll find opinionated and prescriptive advice for building software. That advice comes from real practitioner experiences that we’ve gathered, both from successful implementations and the ones that could have gone better. We’ve refined these lessons into a model that we hope will get you up and running faster with your own system.

Recently, the microservices style of building software has exploded in popularity. In the early 2010s, the term microservices emerged as a way to describe a new style of software architecture. Applications built in this newly named style are built with small, independent components that work together. Since then, adoption rates for the microservices style have skyrocketed. Startups, enterprise companies, and everyone in between have been learning and implementing microservices-style architectures. The growing ecosystem of tools, services, and solutions in this space is testament to its widespread popularity. At the time of this writing, Allied Market Research has predicted that the global market for microservices architectures will grow to $8.07 billion USD in 2026, from the current $2.07 billion USD. These kinds of numbers indicate a lot of interest, a lot of adoption, and lots and lots of microservices work.

For many, building software in the microservices way has turned out to be a challenge. The truth is that implementing a microservices system isn’t easy. Making lots of independent parts work together is harder to do than it might sound. Management, maintenance, support, and testing costs add up in the system. At scale, those costs can become prohibitive. If you aren’t careful, the pain of managing the system can make microservices seem like a bad idea.

But the benefits of building microservices make the risks worthwhile. Microservices done well enable you to make software changes faster and safer at scale. Faster and safer change means more agility for your business. That agility translates to better outcomes for your business and your organization.

The trick to unlocking all that value is to have the right architecture in place to support the services. It needs to reduce system costs, without diminishing the value of independent services. To build that architecture, you’ll need to make important decisions early. Those decisions will span methods, processes, teams, technologies, and tools. They’ll also need to work together to form an emergent, optimized whole.

A good way to build a system like this is through evolution. You could start with a few small decisions and learn and grow as you go. In fact, most early adopters ended up with microservices through iterative experimentation. They didn’t set out with a goal of building a microservices-based application. Instead, they ended up with them through a continuous process of optimization and improvement.

Starting from scratch and iterating takes time. But the good news is that you can use the experiences of these practitioners to help you build your system faster. Begin your build with a foundation of patterns, methods, and tools that have been used together successfully. Then optimize the system to meet the unique goals and constraints of your organization.

In this book, we’ve documented the decisions that form a strong microservices foundation. Before we can dive into the details of the model, let’s address an important question. What exactly do we mean by “microservices”?

What Are Microservices?

There isn’t one official, canonical definition for microservices. A good starting place is James Lewis and Martin Fowler’s seminal article on microservices from 2014. In that piece, they describe microservices as:

an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms. […] built around business capabilities and independently deployable by fully automated deployment machinery.

The real heart of Lewis and Fowler’s article is the set of nine characteristics that microservices possess. Their list starts with the core microservice characteristic of componentization via services, which means breaking an application into smaller services. From there they go on to cover a wide breadth of capabilities. They document the need for organizational and management design with the characteristics of organization around business capabilities and decentralized governance. They hint at DevOps and Agile delivery practices when they introduce infrastructure automation and products not projects. They also identify a few key architecture principles, such as smart endpoints and dumb pipes, design for failure, and evolutionary design.

Each of these characteristics is worth understanding, and we encourage you to read their article if you haven’t already. Together, these characteristics form a holistic solution with a very large set of concerns. It includes technology, infrastructure, engineering, operationalization, governance, team structure, and culture.

For contrast, here is another definition for microservices from the book Microservice Architecture by Irakli Nadareishvili, Ronnie Mitra, Matt McLarty, and Mike Amundsen (O’Reilly):

A microservice is an independently deployable component of bounded scope that supports interoperability through message-based communication. Microservice architecture is a style of engineering highly automated, evolvable software systems made up of capability-aligned microservices.

This definition is similar to Lewis and Fowler’s, but it pays special attention to bounded scopes, interoperability, and message-based communication. It also makes a distinction between microservices and the architecture that enables them.

These are just two examples from a sea of microservices definitions. As with these examples, most definitions are broadly similar, with each differing slightly in its focus. Yet they’re usually different enough that it becomes hard to gauge if you’ve built a textbook microservices system.

In the world of technology, names are important because they give us a simple way of communicating complex concepts. In this case, the “microservices” label allows us to describe a style of software architecture that has three general design traits:

  1. The application architecture is primarily composed of machine-invocable “services” that are made available on a network.

  2. The sizes (or boundaries) of services are an important design factor. These boundaries include runtime, design-time, and people factors.

  3. The software system, organization, and way of working are holistically optimized to achieve a goal.

This is a pretty general set of design traits. For example, it doesn’t document organizational styles, specific tools, or architectural principles that should be used. There also aren’t any formal patterns or practices defined. Instead, this gives us just enough characteristics to be able to identify a microservices system when we see one.

The truth is, you can get away with calling almost any API-based system a microservices architecture if you try hard enough. But the real focus should be on the goal of your system. We think that the question of why you’d build microservices is much more enlightening than the question of what they are. While there are lots of potential benefits to microservices, we believe the best reason to build software this way is to reduce your coordination costs.

Reducing Coordination Costs

Companies around the world have had success implementing microservices architectures. Almost universally, the practitioners we’ve talked to have reported an increase in speed of software delivery. We believe that improvement comes from the fundamental benefit of the microservices style: a reduction in coordination costs.

It should be pointed out that there are many ways to increase speed in software engineering. Building software the microservices way is just one option. For example, you could build a system quickly by cutting corners and incurring “technical debt” that you’ll deal with later. Or, you could focus less on stability and safety and just get your product out the door. In some situations and for some businesses these are reasonable approaches.

But systems developed for the financial, healthcare, and government sectors, among others, are not allowed to compromise on safety for the sake of speed. And yet, competitive and market forces demand higher speed from these industries just like any other. This is where a microservices system can shine. It provides an architectural approach that allows you to increase speed without compromising safety. And it lets you do that at scale.

The Coordination Cost Problem

Building complex software is hard work. In films and on TV, a brilliant programmer can heroically engineer a world-changing product over the course of a sleepless weekend. In real life it takes lots of people and a whole lot of time to produce a quality result. Multiple teams working on a complex project are typically implementing different parts of said system, following independent roadmaps, at independent paces. Periodically, these parts need to be integrated to resolve dependencies, at which point the mostly autonomous teams need to coordinate their work (see Figure 1-1).

Imagine that Jane is the team lead in charge of the Accounting workstream. Her team just finished a sprint and has a dependency on a component being developed by the team in charge of the Shipment module, led by Tyrone. Since roadmaps are independent, it could be that Tyrone’s team is not actually done with their implementation of the needed component in the Shipment workstream. At this point Jane has two choices: she can either wait for the component to be delivered (prioritizing safety but sacrificing speed by putting her team on hold) and do a proper integration test, or she can rely on an agreed interface contract between her team and Tyrone’s, assuming that his team will deliver the component exactly as planned. In the latter case, Jane would proceed without interruption, increasing her team’s speed, but potentially compromising the overall safety of the system since integration testing didn’t occur at the earliest possible stage and a “happy path” assumption was made.

Figure 1-1. Sample timeline of a complex project with coordination touchpoints
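
The second option in Jane’s scenario, relying on an agreed interface contract, can be illustrated with a small sketch. This is only one way such a contract might look, and every name in it (`ShipmentService`, `ShipmentQuote`, `StubShipmentService`, `invoice_total_cents`, and the flat rate of 500 cents per kilogram) is a hypothetical chosen for illustration rather than part of the example system built in this book. The Accounting code depends only on the contract, so it can proceed against a stand-in until the real Shipment component arrives.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ShipmentQuote:
    """Hypothetical response shape agreed on by the two teams."""
    order_id: str
    cost_cents: int


class ShipmentService(Protocol):
    """The agreed interface contract (hypothetical)."""

    def quote(self, order_id: str, weight_kg: float) -> ShipmentQuote:
        ...


class StubShipmentService:
    """Stand-in the Accounting team uses until the real component ships."""

    def quote(self, order_id: str, weight_kg: float) -> ShipmentQuote:
        # "Happy path" assumption: a flat rate of 500 cents per kilogram.
        return ShipmentQuote(order_id=order_id, cost_cents=round(weight_kg * 500))


def invoice_total_cents(
    items_cents: list[int],
    order_id: str,
    weight_kg: float,
    shipping: ShipmentService,
) -> int:
    """Accounting logic that depends on the contract, not on an implementation."""
    return sum(items_cents) + shipping.quote(order_id, weight_kg).cost_cents


if __name__ == "__main__":
    total = invoice_total_cents([1000, 2500], "A1", 2.0, StubShipmentService())
    print(total)
```

The stub keeps Jane’s team moving, but it also encodes the “happy path” assumption: nothing here verifies that the real Shipment component behaves the same way, which is exactly the safety risk described above.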

Any team lead in a complex, multiteam environment regularly faces this choice between ignoring coordination costs and keeping momentum versus acknowledging the need for coordination and slowing down. Typically we choose one or the other using our intuition on risk versus benefits, but overall, in a sufficiently complex system, when these choices occur frequently enough there is a very pronounced tension between speed and safety.

The tension is real; however, it is not related to our primal instincts and there is a way to fix it. Since coordination costs cause the tension, what if we had a system specifically designed in a way to minimize those coordination costs? What if instead of choosing one way or the other, teams did not even face the choice most of the time? You can have such a design, emphasizing the minimization of coordination, if you have autonomous teams working on small batches of isolated work. And that is exactly what microservices architecture is all about, in its essence.

Understanding that the fundamental force of building successful microservices architectures is aiming for the minimization of coordination is extremely useful. It gives us a universal litmus test. Building complex distributed systems such as a microservices architecture isn’t easy, and when in doubt we should always ask ourselves, “Is this decision I am facing going to reduce coordination costs for my teams or not?” The right answer will be much more obvious when we view decisions from the perspective of coordination costs.

Ultimately, microservices have become popular because they help businesses succeed. Modern organizations are under incredible pressure to adapt, change, and improve more often and more quickly. Investing in a technology architecture that is purposefully designed for change speed and change safety at the scale of a large organization makes a lot of sense. The microservices style enables companies operating in complex domains to have the agility of a simpler, smaller company while continuing to harness the power and reach of their actual size. It’s incredibly appealing, and the growth in adoption proves that. However, the benefits don’t come for free. It takes a lot of up-front work, focus, and decision making to build a microservices architecture that can unlock that value.

The Hard Parts

One of the biggest hurdles that first-time microservices adopters face is dealing with the enormous scope and breadth of a microservices system. You might start by focusing on creating smaller, bounded services. But very soon you’ll find yourself having to come up with the right infrastructure, data models, frameworks, team models, and processes to support them. It’s a lot of ground to cover and dealing with all of that scope can lead to some unique challenges. Here are the three big design problems that microservices architects and engineers usually face:

Long feedback loops

One big challenge is that impactful decisions in a microservices system aren’t easy to measure. The decisions you make today may cause problems, but those problems may not show up until much later. For example, when you start out you might decide to use a shared communication library to make it easier for your services to talk to each other. Over time it may become clear that keeping that library up to date across all of your microservices and teams is a huge problem. The crux of the problem here is that it’s difficult to understand the impact of the decision you’re making until problems arise, which makes it difficult to evaluate options and choose among them.

Too many moving parts

At its heart, a microservices system is a complex adaptive system. This means that each part of the system impacts the other parts in some way. When all those parts come together an emergent system behavior is produced. If you’ve ever introduced a new tool or a new process into an organization, you’ve probably seen this firsthand. Some teams take to new stimuli and change immediately, others need help and support to adapt, but no matter what, you almost always end up with consequences as to the way people work and the decisions that are made. For example, technology teams who introduce Docker containerization tooling inevitably end up adapting their development and release life cycle as a consequence of their adopting the container deployment model. Sometimes these consequences are planned, but often we need to deal with the unintended consequences of the changes that are introduced. This complexity is what makes microservices system design difficult. It’s difficult to predict the specific impacts of the changes that are introduced, leading to a risk that we’ll do more harm than good with a new architecture model.

Analysis paralysis

When we compound the problem of long feedback loops for our decisions with the complex system we need to design, it’s easy to see why microservices architecture is a challenge. The decisions you need to make are both highly impactful and difficult to measure. This can lead to endless speculation, discussion, and evaluation of architectural decisions because of the fear of making the wrong kind of system. Instead of building a system that can achieve business outcomes, we end up in a state of indecision, trying to model the endless permutations of our choices. This condition is commonly known as analysis paralysis. It doesn’t help that the web is full of horror stories, “bumper sticker” advice, and contradictory best practices for building a microservices architecture.

Ultimately, the real challenge of building a microservices architecture is that of dealing with a big, complicated system that spans a huge scope. The good news is that this is not a unique problem to solve. In this book, we’ll be bringing together and using a set of practices and patterns that have evolved for this type of domain. We’ll also be introducing and implementing tools that embody these ways of working and make the work that happens in a microservices system easier, safer, cheaper, and faster.

Learning by Doing

So far, we’ve established that the microservices style can help you deliver software faster without compromising on safety. But we’ve also identified that the path to a good microservices architecture is difficult and fraught with challenging and complex decisions. Many of the successful microservices implementers we’ve talked to have built their systems through continual iteration and improvement. Frequently, they’ve had to build architectures that failed before they unlocked an understanding of how to build a system that works.

If you had unlimited time, you could build a great microservices architecture solely through experimentation. You could adopt endless organizational models, try every methodology, and build microservices of various sizes. As long as you could measure your results, you’d continue to improve the system. With enough trials, you’d end up with a system that works for you as well as a lot of experience building microservice systems.

Chances are, though, that you don’t have the luxury of unlimited time. So, how do you build the expertise you need to build better microservices?

To help address this challenge, we’ve developed a prescriptive microservices model. We’ve made decisions about team design, process, architecture, infrastructure, and even tools and technologies. We’ll cover a large scope of topic areas while building a solution that brings those areas together. Our decisions are built on opinions based in experiences building microservices systems for large organizations. If you follow our instructions, by the end of the book you’ll have built a simple, operational microservices system in a cloud-based architecture.

Note

To help bring our microservices examples to life, we’ll be using the backdrop of a fictional airline reservations system. It will be a vastly simplified version of what a real reservations system would look like. Our very basic airline reservations system will include two functions: a read-only flight information service and a seat reservation service.

Our goal is to guide you in building your first microservices implementation as quickly as possible. In our experience, the act of building a real system is the best way to gain a true understanding of the work involved and the key decisions. We don’t expect you to agree with all of our decisions. In fact, questioning the decisions we’ve made for you is a big part of the learning journey! We hope that the model we build together is only the first of many microservices systems that you’ll build.

The Dreyfus Model of Skill Acquisition

Starting a learning journey by following instructions is a tried-and-true path to gaining expertise. In Stuart and Hubert Dreyfus’s Five-Stage Model of Adult Skill Acquisition, the first stage involves following prescriptive guidance before proficiency and expertise are established.

The “Up and Running” Microservices Model

The scope of a microservices architecture is quite large. Unfortunately, we can’t cover the entire scope in this single book. However, we’ve made an effort to cover the topic areas that are the most relevant to a microservices system and have the biggest impact on success. Let’s take a quick look at what we’ll be covering in our “up and running” microservices model.

Team design

We’ll kick off our build in Chapter 2 by tackling the people side of a microservices system. We’ll uncover the challenges of effective team design and the fundamental factors that influence microservice coordination. We’ll also introduce the teams we’ll be using within our example system along with a tool called Team Topologies to help design them.

Microservice design

After designing the teams, we’ll introduce the SEED(S) process in Chapter 3. This is a design process that will help us create microservices that fulfill the needs of users and consumers with actionable interfaces and behaviors. Then, in Chapter 4, we’ll take on the problem of designing the right boundaries for our example microservices. We’ll also introduce some important Domain-Driven Design concepts and use a process called Event Storming to “rightsize” our services.

Data design

Data is one of the most difficult aspects of a microservices design. In Chapter 5, we’ll take a look at the data factors you’ll need to consider in a microservices system. We’ll introduce the concept of data independence and lay the groundwork for the data architecture in our example project.

Cloud platform

Our microservices implementation will be built on top of a cloud-based infrastructure. In Chapter 6, we’ll introduce and implement the principles of immutable infrastructure and infrastructure as code (IaC) as the foundation for our microservices infrastructure. We’ll also introduce AWS as our cloud platform and build a GitHub Actions–based CI/CD pipeline. Then, in Chapter 7, we’ll use that pipeline to design and develop an AWS-based microservices infrastructure that will include networking, a Kubernetes cluster, and a GitOps deployment tool.

Microservices development

With our infrastructure platform in place, we’ll dive into the work of engineering the microservices. We’ll start by covering the principles and tools you’ll need to succeed in Chapter 8. Then in Chapter 9, we’ll implement two independent, heterogeneous microservices for our example application.

Release and change

We’ll bring the whole solution together in Chapter 10, where we’ll deploy one of the example microservices we’ve engineered onto the cloud-based platform we’ve developed. To do this, we’ll use a set of technologies including Docker Hub, Kubernetes, Helm, and Argo CD. Finally, after release, we’ll take a retrospective look at the system in Chapter 11.

Note

The model we’ve developed is built on a set of five guiding principles, including the twelve-factor app pattern. If you’re interested, you can read about our model’s guiding principles at this book’s GitHub repository.

Hopefully this short overview gives you an idea of the scope of our model and example application. By the end of the book we’ll have implemented a full-fledged system. To get there, we’ll need to make a lot of decisions. So, the first tool we’ll need is a way of keeping track of the really important ones.

Decisions, Decisions…

When it comes to building software, decisions are a big deal. Professional software engineers and architects get paid a lot for the decisions that they make and the problems they solve. The quality of the software and the business outcomes they drive depend on the quality of those decisions.

But decisions aren’t always easy to make. They also aren’t always correct. We make the best decisions we can given the information, experience, and talent that we have. When any of those variables change, our decisions should change too. Some decisions are correct at the time, but become outdated when technology, people, or situations change. Some decisions were never good ones in the first place. In either case, we need a way of capturing the decisions that matter so we can re-evaluate and improve on them over time.

To address that need, we’re going to use a tool called an architecture decision record (or ADR). We’re not sure who invented the term ADR or when it was first used, but the idea of documenting design decisions has been around for a long time. The real problem is that most people don’t take the time to do it. In our experience, ADRs are an extremely useful tool and a good way of getting clarity on the decisions that need to be made.

A good decision record needs to capture four important elements:

Context

What is the challenge? What is the problem that we are trying to solve? What are the constraints? A decision record should give us a summary of these contextual elements. That way we can understand the rationale for a decision and why it may need to be updated.

Alternatives

A decision isn’t a decision unless there is a choice to be made. A good decision record should help us to understand what the choices are. This helps us to better understand the context and the “selection space” at the time the decision was made.

Choice

At the heart of a decision is the choice. Every decision record needs to document the choice that was made.

Impact

Decisions have consequences and a decision record should document the important ones. What are the trade-offs? How will our decision choice impact the way we work or other decisions that need to be made?

You can create decision records however you like. You can write them up as text files, use a project management tool, or even track them in a spreadsheet. The format and tooling are less important than the content. As long as you capture the four elements we’ve described, you’ll have a good decision record.
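
To make the four elements concrete, here is a minimal sketch of a decision record as a data structure. It is written in Python purely for illustration; the class name `DecisionRecord` and the helper `missing_elements` are our own hypothetical names and are not part of any ADR format or tool. The helper simply reports which of the four elements has been left empty.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    """Hypothetical container for the four elements of a decision record."""
    context: str             # the challenge, problem, and constraints
    alternatives: list[str]  # the options that were considered
    choice: str              # the option that was selected
    impact: list[str]        # trade-offs and consequences

    def missing_elements(self) -> list[str]:
        """Return the names of any of the four elements left empty."""
        elements = {
            "context": self.context.strip(),
            "alternatives": self.alternatives,
            "choice": self.choice.strip(),
            "impact": self.impact,
        }
        return [name for name, value in elements.items() if not value]


if __name__ == "__main__":
    draft = DecisionRecord(
        context="We need a way to keep track of important decisions.",
        alternatives=[],
        choice="Use lightweight, text-based decision records.",
        impact=["We'll need to write a record for each key decision."],
    )
    print(draft.missing_elements())
```

A check like this is optional. Its only purpose is to show that a record with no alternatives listed is incomplete, whatever tool or format you use to store it.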

For our example project, we’ll use an existing format called a lightweight architectural decision record (LADR). The LADR format was created by Michael Nygard, and is a nice concise way of documenting a decision record. Let’s get to know LADR by building one together.

Tip

If you want to use something other than LADR, Joel Parker Henderson maintains a great list of ADR formats and templates.

Writing a Lightweight Architectural Decision Record

The first key decision we’ll record is the decision to keep a record of decisions. Put more simply, we’ll create an ADR that says we intend to keep track of our decisions. As we’ve mentioned, we’ll be using the LADR format. The nice thing about LADR is that it’s designed to be lightweight. It lets us keep track of decisions in simple text files that we can write quickly. Since we’re dealing with text files, we can even manage our decision records in the same way we manage source code.

LADRs are written using a text format called Markdown, which provides an elegant and simple way of writing documentation. What’s great about Markdown is that it’s easy for humans to read in its raw form and most popular tools know how to render it. For example, Confluence, GitLab, GitHub, and SharePoint can all process Markdown and present it as a formatted, human-readable document.

To create our first Markdown-based LADR, open your favorite text editor and start working on a new document. The first thing we’ll do is lay out the structure.

Add the following text to your LADR file:

# OPM1: Use ADRs for decision tracking

## Status
Accepted

## Context

## Decision

## Consequences

These are the key elements of our decision record. The # characters preceding the lines are Markdown tokens that will let the parser know that these lines are meant to be headings. Notice that we’ve given this decision a title that corresponds to the decision we’re making. We’ve also given the title a slightly cryptic prefix: “OPM1.” This is just a short-form code that will help us label and understand which part of the system the decision relates to. In this case, “OPM1” indicates that this is the first decision we’re recording related to the operating model.

The Status header of our record lets us know what life-cycle stage this decision is in. For example, if you’re drafting a new decision that you need to get agreement on, you might start with a status of Proposed. Or, if you’re considering changing an existing decision, you might change its status to Under Review. In our case, we’ve already made the decision for you, so we’ve set the status to Accepted.

The Context section describes the problem, constraints, and background for the decision being made. In our case, we want to capture the need to log important decisions and why that’s important. Add the following text (or your own variation of it) to the Context section of your record:

## Context
A microservices architecture is complex and we'll need to make many
decisions. We'll need a way to keep track of the important decisions
we make, so that we can revisit and re-evaluate them in the future.
We'd prefer to use a lightweight, text-based solution so that we
don't have to install any new software tools.

With the context in place, we can move on to recording the actual decision we’ve made. We can list some of the alternatives considered as well as our choice to use LADR. Add the following to the Decision section to document this fact:

## Decision
We've decided to use Michael Nygard's lightweight architectural
decision record (LADR) format. LADR is text based and is
lightweight enough to meet our needs. We'll keep each LADR record
in its own text file and manage the files like code.

We also considered the following alternative solutions:
* Project management tooling (not selected, because we didn't want
  to install tools)
* Informal or "word of mouth" record keeping (not reliable)

All that’s left is to document the consequences. In our case, one of the key consequences is that we’ll need to spend time actually documenting our decisions and managing the records. Let’s capture that as follows:

## Consequences
* We'll need to write decision records for key decisions
* We'll need a source code management solution to manage decision record files

That’s all it takes to write an LADR. This is an incredibly useful way of capturing your thinking and has the added benefit of forcing you to make rational, thoughtful decisions in the first place. As we build our example flights application, we’ll be keeping a log of the key decisions we make. To save time, we won’t write out the entire decision record. Instead we’ll highlight that a key decision has been made as in the following note.

You’ll be able to find a detailed version of each decision record at this book’s GitHub repository.
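
Because LADRs are plain text files with a fixed set of headings, they are easy to generate with a small script if you would rather not type the structure by hand each time. The following is a minimal sketch, assuming Python and a file-naming scheme of our own invention; the function names `render_ladr`, `ladr_filename`, and `write_ladr` and the `decisions` directory are hypothetical and are not part of the LADR format.

```python
from pathlib import Path


def render_ladr(code: str, title: str, status: str, context: str,
                decision: str, consequences: list[str]) -> str:
    """Render a decision record using the headings shown in this chapter."""
    bullets = "\n".join(f"* {item}" for item in consequences)
    return (
        f"# {code}: {title}\n\n"
        f"## Status\n{status}\n\n"
        f"## Context\n{context}\n\n"
        f"## Decision\n{decision}\n\n"
        f"## Consequences\n{bullets}\n"
    )


def ladr_filename(code: str, title: str) -> str:
    """Build a file name from the code and title (hypothetical naming scheme)."""
    slug = "-".join(title.lower().split())
    return f"{code}-{slug}.md"


def write_ladr(directory: Path, code: str, title: str, body: str) -> Path:
    """Write the rendered record to its own text file and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / ladr_filename(code, title)
    path.write_text(body, encoding="utf-8")
    return path


if __name__ == "__main__":
    body = render_ladr(
        code="OPM1",
        title="Use ADRs for decision tracking",
        status="Accepted",
        context="We need a way to keep track of the important decisions we make.",
        decision="We'll use the lightweight architectural decision record format.",
        consequences=[
            "We'll need to write decision records for key decisions",
            "We'll need a source code management solution to manage the files",
        ],
    )
    # "decisions" is a hypothetical directory kept alongside the source code.
    print(write_ladr(Path("decisions"), "OPM1", "Use ADRs for decision tracking", body))
```

Keeping one record per file means that each decision can be reviewed, versioned, and revised in the same way as the source code it sits beside.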

Summary

In this chapter we introduced some foundational concepts for this book. We provided a loose definition of a microservices system, including a set of three key traits. We identified the reduction of coordination costs as the key microservices benefit. We also explored how complexity and analysis paralysis present challenges to microservices adopters.

To help address these challenges, we introduced the “up and running” microservices model—an opinionated, prescriptive implementation that will accelerate the learning process for implementers. We covered the aspects of the model and the topics we’ll discuss. Finally, we introduced the concept of the architectural decision record (ADR) that we plan to use throughout the rest of the book.

With the overview out of the way, all that’s left is to build the system. We’ll kick things off in Chapter 2 by tackling how microservices work is done with a special focus on team coordination.
