
Migration Cookbook

Now that we’ve defined cloud-native application architectures and given a brief high-level overview of the changes enterprises must consider when adopting them, it’s time to delve into technical specifics. Each of these topics merits at least a chapter of its own, which is beyond the scope of this report. Instead, this chapter provides a set of short, cookbook-style recipes to help with specific tasks and patterns needed to adopt a cloud-native application architecture, along with links to helpful further reading.

Decomposition Recipes

After discussing the decomposition of data, services, and teams with customers, I’m often asked, "Great! How do we get there from here?" Good question. How do we tear apart existing monoliths and move them to the cloud?

As it turns out, I’ve seen companies succeed with a fairly repeatable pattern of incremental migration which I now recommend to all of my customers. Publicly referenceable examples of this pattern can be found at SoundCloud and Karma.

In this section we’ll walk step-by-step through a series of recipes that provide a process for decomposing monolithic services and moving them to the cloud.

New Features as Microservices

Surprisingly, the first step is not to start chipping away at the monolith itself. We’ll begin with the assumption that you still have a backlog of features to be built within the monolith. In fact, if you don’t have any net new functionality to build, it’s arguable that you shouldn’t even be considering this decomposition. (Given that our primary motivation is speed, how do you leave something unchanged really fast?)

...the team decided that the best approach to deal with the architecture changes would not be to split the Mothership immediately, but rather to not add anything new to it. All of our new features were built as microservices...

Phil Calcado, SoundCloud

So it’s time to stop adding new code to the monolith. All new features will be built as microservices. Get good at this first, as building new services from scratch is far easier than surgically extracting them from a big ball of mud.

Inevitably, however, these new microservices will need to talk back to the monolith in order to get anything done. How do we attack that problem?

The Anti-Corruption Layer

Because so much of our logic was still in the Rails monolith, pretty much all of our microservices had to talk to it somehow.

Phil Calcado, SoundCloud

Domain-Driven Design (DDD), by Eric Evans (Addison-Wesley), discusses the idea of an anti-corruption layer. Its purpose is to allow the integration of two systems without allowing the domain model of one system to corrupt the domain model of the other. As you build new functionality into microservices, you don’t want these new services to become tightly coupled with the monolith by giving them deep knowledge of the monolith’s internals. The anti-corruption layer is a way of creating API contracts that make the monolith look like other microservices.

Evans divides the implementation of anti-corruption layers into three submodules, the first two representing classic design patterns (from Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software [Addison-Wesley]):


Facade

The purpose of the facade module here is to simplify the process of integrating with the monolith’s interface. It’s very likely that the monolith was not designed with this type of integration in mind, so the facade’s purpose is to solve this problem. Importantly, it does not change the monolith’s model, being careful not to couple translation and integration concerns.


Adapter

The adapter is where we define “services” that provide things our new features need. It knows how to take a request from our system, using a protocol that it understands, and make that request to the monolith’s facade(s).


Translator

The translator’s responsibility is to convert requests and responses between the domain model of the monolith and the domain model of the new microservice.

These three loosely coupled components solve three problems:

  1. System integration

  2. Protocol translation

  3. Model translation

What remains is the location of the communication link. In DDD, Evans discusses two alternatives. The first, facade to system, is primarily useful when you can’t access or alter the legacy system. Our focus here is on monoliths we do control, so we’ll lean toward Evans’ second suggestion, adapter to facade. Using this alternative, we build the facade into the monolith, allowing communications to occur between the adapter and the facade, as presumably it’s easier to create this link between two things written explicitly for this purpose.

Finally, it’s important to note that anti-corruption layers can facilitate two-way communication. Just as our new microservices may need to communicate with the monolith to accomplish work, the inverse may be true as well, particularly as we move on to our next phase.
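To make the three submodules concrete, here is a minimal sketch in plain Java. All class and method names are hypothetical, invented for illustration; the point is only the separation of concerns among facade, translator, and adapter.

```java
// Facade: lives with the monolith and exposes a simplified entry point,
// without changing the monolith's internal model (here, an XML payload).
class MonolithFacade {
    String findAccountXml(String accountNumber) {
        // in reality this would call into the monolith's existing code paths
        return "<account><number>" + accountNumber + "</number></account>";
    }
}

// The microservice's own domain model.
class Account {
    private final String number;
    Account(String number) { this.number = number; }
    String number() { return number; }
}

// Translator: converts between the monolith's model (XML)
// and the microservice's model (Account).
class AccountTranslator {
    Account toMicroserviceModel(String monolithXml) {
        String number = monolithXml.replaceAll(".*<number>(.*)</number>.*", "$1");
        return new Account(number);
    }
}

// Adapter: the "service" the new feature actually calls; it knows how to
// reach the facade and delegates model conversion to the translator.
class AccountAdapter {
    private final MonolithFacade facade = new MonolithFacade();
    private final AccountTranslator translator = new AccountTranslator();

    Account findAccount(String accountNumber) {
        return translator.toMicroserviceModel(facade.findAccountXml(accountNumber));
    }
}
```

The new microservice depends only on AccountAdapter and its own Account model; nothing in it knows about the monolith’s XML.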

Strangling the Monolith

After the architecture changes were made, our teams were free to build their new features and enhancements in a much more flexible environment. An important question remained, though: how do we extract the features from the monolithic Rails application called Mothership?

Phil Calcado, SoundCloud

I borrow the idea of “strangling the monolith” from Martin Fowler’s article entitled "StranglerApplication". In this article, Fowler explains the idea of gradually creating “a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.” We’re effectively going to do the same thing here. Through a combination of extracted microservices and additional anti-corruption layers, we’ll build a new cloud-native system around the edges of the existing monolith.

Two criteria help us choose which components to extract:

  1. SoundCloud nails the first criterion: identify bounded contexts within the monolith. If you’ll recall our earlier discussions of bounded contexts, they require a domain model that is internally consistent. It’s extremely likely that our monolith’s domain model is not internally consistent. Now it’s time to start identifying submodels that can be. These are our candidates for extraction.

  2. Our second criterion deals with priority: which of our candidates do we extract first? We can answer this by reviewing our first reason for moving to cloud-native architecture: speed of innovation. What candidate microservices will benefit most from speed of innovation? We obviously want to choose those that are changing the most given our current business needs. Look at the monolith’s backlog. Identify the areas of the monolith’s code that will need to change in order to deliver the changed requirements, and then extract the appropriate bounded contexts before making the desired changes.

Potential End States

How do we know when we are finished? There are basically two end states:

  1. The monolith has been completely strangled to death. All bounded contexts have been extracted into microservices. The final step is to identify opportunities to eliminate anti-corruption layers that are no longer necessary.

  2. The monolith has been strangled to a point where the cost of additional service extraction exceeds the return on the necessary development efforts. Some portions of the monolith may be fairly stable—we haven’t changed them in years and they’re doing their jobs. There may not be much value in moving these portions around, and the cost of maintaining the necessary anti-corruption layers to integrate with them may be low enough that we can take it on long-term.

Distributed Systems Recipes

As we start to build distributed systems composed of microservices, we’ll also encounter nonfunctional requirements that we don’t normally encounter when developing a monolith. Sometimes the laws of physics get in the way of solving problems such as consistency, latency, and network partitions. However, the problems of brittleness and manageability can normally be addressed through the proper application of fairly generic, boilerplate patterns. In this section we’ll examine recipes that help us with these concerns.

These recipes are drawn from a combination of the Spring Cloud project and the Netflix OSS family of projects.

Versioned and Distributed Configuration

We discussed the importance of proper configuration management for applications in Twelve-Factor Applications, which specifies the injection of configuration via operating system-level environment variables. This method is very suitable for simple systems, but as we scale up to larger systems, sometimes we want additional configuration capabilities:

  • Change logging levels of a running application in order to debug a production issue

  • Change the number of threads receiving messages from a message broker

  • Report all configuration changes made to a production system to support regulatory audits

  • Toggle features on/off in a running application

  • Protect secrets (such as passwords) embedded in configuration

In order to support these capabilities, we need a configuration management approach with the following features:

  • Versioning

  • Auditability

  • Encryption

  • Refresh without restart
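The “refresh without restart” requirement is worth making concrete. Here is a minimal plain-JDK sketch (the class name is invented for illustration; the Config Server does this at much larger scale, with Git-backed versioning): configuration is read into an atomically swappable snapshot, so a refresh replaces the values while readers keep running.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Holds the current configuration snapshot; refresh() atomically swaps in
// a new snapshot from the source, so readers never need a restart.
class RefreshableConfig {
    private final Supplier<Map<String, String>> source;
    private final AtomicReference<Map<String, String>> current;

    RefreshableConfig(Supplier<Map<String, String>> source) {
        this.source = source;
        this.current = new AtomicReference<>(source.get());
    }

    String get(String key) {
        return current.get().get(key);
    }

    void refresh() {   // e.g., triggered by a configuration-change broadcast
        current.set(source.get());
    }
}
```

Readers always see a complete, consistent snapshot; a refresh never exposes a half-updated configuration.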

The Spring Cloud project contains a Config Server that provides these features. It presents application and profile-specific configuration (a profile is a set of configuration that can be toggled on or off as a unit, such as a “development” or “staging” profile) as a REST API backed by a Git repository (Figure 3-1).

Config Server
Figure 3-1. The Spring Cloud Config Server

As an example, here’s the default application profile configuration for a sample Config Server (Example 3-1).

Example 3-1. Default application profile configuration for a sample Config Server
    {
        "label": "",
        "name": "default",
        "propertySources": [
            {
                "name": "https://github.com/mstine/config-repo.git/application.yml", 1
                "source": {
                    "greeting": "ohai" 2
                }
            }
        ]
    }
This configuration is backed by the file application.yml in the specified backing Git repository.


The greeting is currently set to ohai.

The configuration in Example 3-1 was not manually coded, but generated automatically. We can see that the value for greeting is being distributed to the Spring application by examining its /env endpoint (Example 3-2).

Example 3-2. Environment for a Config Server client
"configService:https://github.com/mstine/config-repo.git/application.yml": {
    "greeting": "ohai" 1
}

This application is receiving its greeting value of ohai from the Config Server.

All that remains is for us to be able to update the value of greeting without restarting the client application. This capability is provided by another Spring Cloud project module called Spring Cloud Bus. This project links nodes of a distributed system with a lightweight message broker, which can then be used to broadcast state changes such as our desired configuration change (Figure 3-2).

Simply by performing an HTTP POST to the /bus/refresh endpoint of any application participating in the bus (which should obviously be guarded with appropriate security), we can instruct all applications on the bus to refresh their configuration with the latest available values from the Config Server.

The Spring Cloud Bus
Figure 3-2. The Spring Cloud Bus

Service Registration/Discovery

As we create distributed systems, our code’s dependencies cease to be a method call away. Instead, we must make network calls in order to consume them. How do we perform the necessary wiring to allow all of the microservices within a composed system to communicate with one another?

A common architecture pattern in the cloud (Figure 3-3) is to have frontend (application) and backend (business) services. Backend services are often not accessible directly from the Internet but are rather accessed via the frontend services. The service registry provides a listing of all services and makes them available to frontend services through a client library (Routing and Load Balancing) which performs load balancing and routing to backend services.

Service registration and discovery
Figure 3-3. Service registration and discovery

We’ve solved this problem before using various incarnations of the Service Locator and Dependency Injection patterns, and service-oriented architectures have long employed various forms of service registries. We’ll employ a similar solution here by leveraging Eureka, which is a Netflix OSS project that can be used for locating services for the purpose of load balancing and failover of middle-tier services. Consumption of Eureka is further simplified for Spring applications via the Spring Cloud Netflix project, which provides a primarily annotation-based configuration model for consuming Netflix OSS services.

An application leveraging Spring Boot can participate in service registration and discovery simply by adding the @EnableDiscoveryClient annotation (Example 3-3).

Example 3-3. A Spring Boot application with service registration/discovery enabled
@EnableDiscoveryClient 1
public class Application {

  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }
}


The @EnableDiscoveryClient enables service registration/discovery for this application.

The application is then able to communicate with its dependencies by leveraging the DiscoveryClient. In Example 3-4, the application looks up an instance of a service registered with the name PRODUCER, obtains its URL, and then leverages Spring’s RestTemplate to communicate with it.

Example 3-4. Using the DiscoveryClient to locate a producer service
@Autowired
DiscoveryClient discoveryClient; 1

public String consume() {
  InstanceInfo instance = discoveryClient.getNextServerFromEureka("PRODUCER", false); 2

  RestTemplate restTemplate = new RestTemplate();
  ProducerResponse response = restTemplate.getForObject(instance.getHomePageUrl(), ProducerResponse.class);

  return "{\"value\": \"" + response.getValue() + "\"}";
}

The enabled DiscoveryClient is injected by Spring.


The getNextServerFromEureka method provides the location of a service instance using a round-robin algorithm.
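The round-robin behavior can be illustrated with a small plain-Java sketch (a hypothetical ServerList, not the actual Eureka client API): an atomic counter cycles through the registered instances in order.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin selection over a list of registered instance URLs,
// as a Eureka-style client might do. Thread-safe via an atomic counter.
class ServerList {
    private final List<String> instances;
    private final AtomicInteger next = new AtomicInteger();

    ServerList(List<String> instances) {
        this.instances = instances;
    }

    String nextServer() {
        // floorMod keeps the index valid even after counter overflow
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Successive calls walk the instance list in order and wrap around, spreading requests evenly across instances.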

Routing and Load Balancing

Basic round-robin load balancing is effective for many scenarios, but distributed systems in cloud environments often demand a more advanced set of routing and load balancing behaviors. These are commonly provided by various external, centralized load balancing solutions. However, it’s often true that such solutions do not possess enough information or context to make the best choices for a given application as it attempts to communicate with its dependencies. Also, should such external solutions fail, these failures can cascade across the entire architecture.

Cloud-native solutions often shift the responsibility for routing and load balancing decisions to the client. One such client-side solution is the Ribbon Netflix OSS project (Figure 3-4).

Figure 3-4. Ribbon client-side load balancer

Ribbon provides a rich set of features including:

  • Multiple built-in load balancing rules:

    • Round-robin

    • Average response-time weighted

    • Random

    • Availability filtered (avoid tripped circuits or high concurrent connection counts)

  • Custom load balancing rule plugin system

  • Pluggable integration with service discovery solutions (including Eureka)

  • Cloud-native intelligence such as zone affinity and unhealthy zone avoidance

  • Built-in failure resiliency

As with Eureka, the Spring Cloud Netflix project greatly simplifies a Spring application developer’s consumption of Ribbon. Rather than injecting an instance of DiscoveryClient (for direct consumption of Eureka), developers can inject an instance of LoadBalancerClient, and then use that to resolve an instance of the application’s dependencies (Example 3-5).

Example 3-5. Using the LoadBalancerClient to locate a producer service
@Autowired
LoadBalancerClient loadBalancer; 1

public String consume() {
  ServiceInstance instance = loadBalancer.choose("producer"); 2
  URI producerUri = URI.create(String.format("http://%s:%s", instance.getHost(), instance.getPort()));

  RestTemplate restTemplate = new RestTemplate();
  ProducerResponse response = restTemplate.getForObject(producerUri, ProducerResponse.class);

  return "{\"value\": \"" + response.getValue() + "\"}";
}

The enabled LoadBalancerClient is injected by Spring.


The choose method provides the location of a service instance using the currently enabled load balancing algorithm.

Spring Cloud Netflix further simplifies the consumption of Ribbon by creating a Ribbon-enabled RestTemplate bean which can be injected into beans. This instance of RestTemplate is configured to automatically resolve instances of logical service names to instance URIs using Ribbon (Example 3-6).

Example 3-6. Using the Ribbon-enabled RestTemplate
@Autowired
RestTemplate restTemplate; 1

public String consume() {
  ProducerResponse response = restTemplate.getForObject("http://producer", ProducerResponse.class); 2
  return "{\"value\": \"" + response.getValue() + "\"}";
}

RestTemplate is injected rather than a LoadBalancerClient.


The injected RestTemplate automatically resolves http://producer to an actual service instance URI.


Fault Tolerance

Distributed systems have more potential failure modes than monoliths. As each incoming request must now potentially touch tens (or even hundreds) of different microservices, some failure in one or more of those dependencies is virtually guaranteed.

Without taking steps to ensure fault tolerance, 30 dependencies each with 99.99% uptime would result in 2+ hours downtime/month (99.99%^30 = 99.7% uptime = 2+ hours in a month).

Ben Christensen,
Netflix Engineer
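We can check this arithmetic directly, assuming 30 independent dependencies each at 99.99% availability:

```java
class Availability {
    public static void main(String[] args) {
        double perDependency = 0.9999;                     // 99.99% uptime per dependency
        int dependencies = 30;

        // A request succeeds only if every dependency is up:
        double composite = Math.pow(perDependency, dependencies);

        // Expected downtime over a 30-day month (720 hours):
        double downtimeHours = (1 - composite) * 24 * 30;

        System.out.printf("composite uptime: %.2f%%, downtime: %.1f hours/month%n",
                composite * 100, downtimeHours);
        // prints: composite uptime: 99.70%, downtime: 2.2 hours/month
    }
}
```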

How do we prevent such failures from resulting in the type of cascading failures that would give us such negative availability numbers? Michael Nygard documented several patterns that can help in his book Release It! (Pragmatic Programmers), including:

Circuit breakers

Circuit breakers insulate a service from its dependencies by preventing remote calls when a dependency is determined to be unhealthy, just as electrical circuit breakers protect homes from burning down due to excessive use of power. Circuit breakers are implemented as state machines (Figure 3-5). When in their closed state, calls are simply passed through to the dependency. If any of these calls fails, the failure is counted. When the failure count reaches a specified threshold within a specified time period, the circuit trips into the open state. In the open state, calls always fail immediately. After a predetermined period of time, the circuit transitions into a “half-open” state. In this state, calls are again attempted to the remote dependency. Successful calls transition the circuit breaker back into the closed state, while failed calls return the circuit breaker to the open state.

Circuit Breaker
Figure 3-5. A circuit breaker state machine
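The state machine in Figure 3-5 can be sketched in plain Java. This is a minimal illustration with invented names, not Hystrix’s implementation (which adds thread pools, metrics, and rolling statistical windows):

```java
// Minimal circuit breaker sketch: CLOSED -> OPEN after repeated failures,
// OPEN -> HALF_OPEN after a cool-down period, HALF_OPEN -> CLOSED on success.
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private State state = State.CLOSED;
    private int failureCount = 0;
    private long openedAt = 0;
    private final int failureThreshold;
    private final long resetTimeoutMillis;

    CircuitBreaker(int failureThreshold, long resetTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
    }

    synchronized boolean allowRequest() {
        if (state == State.OPEN
                && System.currentTimeMillis() - openedAt >= resetTimeoutMillis) {
            state = State.HALF_OPEN;   // cool-down elapsed: permit one trial call
        }
        return state != State.OPEN;    // open circuit fails fast
    }

    synchronized void recordSuccess() {
        failureCount = 0;
        state = State.CLOSED;
    }

    synchronized void recordFailure() {
        failureCount++;
        if (state == State.HALF_OPEN || failureCount >= failureThreshold) {
            state = State.OPEN;        // trip the circuit
            openedAt = System.currentTimeMillis();
        }
    }
}
```

A caller checks allowRequest() before each remote call, then reports the outcome via recordSuccess() or recordFailure(); while the circuit is open, calls fail immediately instead of tying up resources on a known-unhealthy dependency.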

Bulkheads

Bulkheads partition a service in order to confine errors and prevent the entire service from failing due to failure in one area. They are named for the partitions that can be sealed to segment a ship into multiple watertight compartments. This can prevent damage (e.g., caused by a torpedo hit) from causing the entire ship to sink. Software systems can utilize bulkheads in many ways. Simply partitioning into microservices is our first line of defense. The partitioning of application processes into Linux containers (Containerization), so that one process cannot take over an entire machine, is another. Yet another example is the division of parallelized work into different thread pools.
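The thread-pool form of bulkheading can be sketched with the JDK’s standard executors (the class, pool names, and sizes below are illustrative): each dependency gets its own bounded pool, so a slow or failing dependency can exhaust only its own threads, never the caller’s entire capacity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class Bulkheads {
    // One small, bounded pool per downstream dependency.
    static final ExecutorService recommendationsPool = Executors.newFixedThreadPool(4);
    static final ExecutorService reviewsPool = Executors.newFixedThreadPool(4);

    // Run the call on the dependency's own pool with a timeout; on timeout
    // or failure, return the fallback instead of blocking the caller.
    static <T> T callWithBulkhead(ExecutorService pool, Callable<T> call,
                                  long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            future.cancel(true);   // a slow call ties up only this pool
            return fallback;
        }
    }
}
```

If the recommendations service hangs, only recommendationsPool’s four threads are at risk; calls to the reviews service proceed unaffected.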

Netflix has produced a very powerful fault-tolerance library in Hystrix that employs these patterns and more. Wrapping code in a HystrixCommand object places that code behind a circuit breaker (Example 3-7).

Example 3-7. Using a HystrixCommand object
public class CommandHelloWorld extends HystrixCommand<String> {

    private final String name;

    public CommandHelloWorld(String name) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.name = name;
    }

    @Override
    protected String run() { 1
        return "Hello " + name + "!";
    }
}

The code in the run method is wrapped with a circuit breaker.

Spring Cloud Netflix adds an @EnableCircuitBreaker annotation to enable the Hystrix runtime components in a Spring Boot application. It then leverages a set of contributed annotations to make programming with Spring and Hystrix as easy as the earlier integrations we’ve described (Example 3-8).

Example 3-8. Using @HystrixCommand
@Autowired
RestTemplate restTemplate;

@HystrixCommand(fallbackMethod = "getProducerFallback") 1
public ProducerResponse getProducerResponse() {
  return restTemplate.getForObject("http://producer", ProducerResponse.class);
}

public ProducerResponse getProducerFallback() { 2
  return new ProducerResponse(42);
}

The method annotated with @HystrixCommand is wrapped with a circuit breaker.


The method getProducerFallback is referenced within the annotation and provides a graceful fallback behavior while the circuit is in the open or half-open state.

Hystrix differs from many other circuit breaker implementations in that it also employs bulkheads by operating each circuit breaker within its own thread pool. It also collects many useful metrics about the circuit breaker’s state, including:

  • Traffic volume

  • Request rate

  • Error percentage

  • Hosts reporting

  • Latency percentiles

  • Successes, failures, and rejections

These metrics are emitted as an event stream which can be aggregated by another Netflix OSS project called Turbine. Individual or aggregated metric streams can then be visualized using a powerful Hystrix Dashboard (Figure 3-6), providing excellent visibility into the overall health of the distributed system.

Hystrix Dashboard
Figure 3-6. Hystrix Dashboard showing three sets of circuit breaker metrics

API Gateways/Edge Services

In Mobile Applications and Client Diversity we discussed the idea of server-side aggregation and transformation of an ecosystem of microservices. Why is this necessary?


Latency

Mobile devices typically operate on lower-speed networks than our in-home devices. The need to connect to tens (or hundreds) of microservices in order to satisfy the needs of a single application screen would increase latency to unacceptable levels even on our in-home or business networks. The need for concurrent access to these services quickly becomes clear. It is less expensive and less error-prone to capture and implement these concurrent patterns once on the server side than it is to do the same on each device platform.

A further source of latency is response size. Web service development has trended toward the “return everything you might possibly need” approach in recent years, resulting in much larger response payloads than is necessary to satisfy the needs of a single mobile device screen. Mobile device developers would prefer to reduce that latency by retrieving only the necessary information and ignoring the remainder.

Round trips

Even if network speed was not an issue, communicating with a large number of microservices would still cause problems for mobile developers. Network usage is one of the primary consumers of battery life on such devices. Mobile developers try to economize on network usage by making the fewest server-side calls possible to deliver the desired user experience.

Device diversity

The diversity within the mobile device ecosystem is enormous. Businesses must cope with a growing list of differences across their customer bases, including different:

  • Manufacturers

  • Device types

  • Form factors

  • Device sizes

  • Programming languages

  • Operating systems

  • Runtime environments

  • Concurrency models

  • Supported network protocols

This diversity expands beyond even the mobile device ecosystem, as developers are now targeting a growing ecosystem of in-home consumer devices including smart televisions and set-top boxes.

The API Gateway pattern (Figure 3-7) is targeted at shifting the burden of these requirements from the device developer to the server-side. API gateways are simply a special class of microservices that meet the needs of a single client application (such as a specific iPhone app), and provide it with a single entry point to the backend. They access tens (or hundreds) of microservices concurrently with each request, aggregating the responses and transforming them to meet the client application’s needs. They also perform protocol translation (e.g., HTTP to AMQP) when necessary.

API Gateway pattern
Figure 3-7. The API Gateway pattern

API gateways can be implemented using any language, runtime, or framework that supports web programming, concurrency patterns, and the protocols necessary to communicate with the target microservices. Popular choices include Node.js (due to its reactive programming model) and the Go programming language (due to its simple concurrency model).
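Before turning to RxJava, note that the JDK’s own CompletableFuture can express the same fan-out-and-aggregate pattern. The sketch below stubs the three service calls with hypothetical in-memory suppliers; a real gateway would make nonblocking HTTP calls instead.

```java
import java.util.concurrent.CompletableFuture;

class GatewayAggregation {
    // Stubbed service calls; each returns asynchronously, as a gateway's
    // HTTP clients would.
    static CompletableFuture<String> movie(String id) {
        return CompletableFuture.supplyAsync(() -> "Toy Story (1995)");
    }
    static CompletableFuture<String> reviews(String id) {
        return CompletableFuture.supplyAsync(() -> "Great movie!");
    }
    static CompletableFuture<String> recommendations(String id) {
        return CompletableFuture.supplyAsync(() -> "GoldenEye (1995)");
    }

    // Fan out to all three services concurrently, then combine the results
    // into a single response for the client.
    static String movieDetails(String id) {
        CompletableFuture<String> m = movie(id);
        CompletableFuture<String> rv = reviews(id);
        CompletableFuture<String> rc = recommendations(id);
        return m.thenCombine(rv, (a, b) -> a + " | " + b)
                .thenCombine(rc, (ab, c) -> ab + " | " + c)
                .join();
    }
}
```

All three calls are in flight at once, so the total latency is roughly that of the slowest dependency rather than the sum of all three.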

In this discussion we’ll stick with Java and give an example from RxJava, a JVM implementation of Reactive Extensions born at Netflix. Composing multiple work or data streams concurrently can be a challenge using only the primitives offered by the Java language, and RxJava is among a family of technologies (also including Reactor) targeted at relieving this complexity.

In this example we’re building a Netflix-like site that presents users with a catalog of movies and the ability to create ratings and reviews for those movies. Further, when viewing a specific title, it provides recommendations to the viewer of movies they might like to watch if they like the title currently being viewed. In order to provide these capabilities, three microservices were developed:

  • A catalog service

  • A reviews service

  • A recommendations service

The mobile application for this service expects a response like that found in Example 3-9.

Example 3-9. The movie details response
    {
        "mlId": "1",
        "recommendations": [
            {
                "mlId": "2",
                "title": "GoldenEye (1995)"
            }
        ],
        "reviews": [
            {
                "mlId": "1",
                "rating": 5,
                "review": "Great movie!",
                "title": "Toy Story (1995)",
                "userName": "mstine"
            }
        ],
        "title": "Toy Story (1995)"
    }

The code found in Example 3-10 utilizes RxJava’s Observable.zip method to concurrently access each of the services. After receiving the three responses, the code passes them to the Java 8 Lambda that uses them to create an instance of MovieDetails. This instance of MovieDetails can then be serialized to produce the response found in Example 3-9.

Example 3-10. Concurrently accessing three services and aggregating their responses
Observable<MovieDetails> details = Observable.zip(
  catalogService.getMovie(mlId),                    // service names illustrative
  reviewsService.getReviews(mlId),
  recommendationsService.getRecommendations(mlId),

  (movie, reviews, recommendations) -> {
    MovieDetails movieDetails = new MovieDetails();
    // copy the title, reviews, and recommendations onto movieDetails
    return movieDetails;
  });

This example barely scratches the surface of the available functionality in RxJava, and the reader is invited to explore the library further at RxJava’s wiki.


Summary

In this chapter we walked through two sets of recipes that can help us move toward a cloud-native application architecture:


Decomposition

We break down monolithic applications by:

  1. Building all new features as microservices.

  2. Integrating new microservices with the monolith via anti-corruption layers.

  3. Strangling the monolith by identifying bounded contexts and extracting services.

Distributed systems

We compose distributed systems by:

  1. Versioning, distributing, and refreshing configuration via a configuration server and management bus.

  2. Dynamically discovering remote dependencies.

  3. Decentralizing load balancing decisions.

  4. Preventing cascading failures through circuit breakers and bulkheads.

  5. Integrating on the behalf of specific clients via API Gateways.

Many additional helpful patterns exist, including those for automated testing and the construction of continuous delivery pipelines. For more information, the reader is invited to read "Testing Strategies in a Microservice Architecture" by Toby Clemson and Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley (Addison-Wesley).
