Achieving cloud-native operability with microservices, DevOps, and continuous delivery

A sustainable approach that scales with your organization and software.

By Casey West

April 21, 2016


Software developers like to make lists of three things and then say, smugly, “Pick any two.” For example, the project management triangle is good, fast, cheap (pick any two), and the CAP theorem offers consistency, availability, and partition tolerance (pick any two).

When working with cloud-native software, you don’t have the luxury of picking just two items from the trio of microservices, DevOps, and continuous delivery; you need all three. Microservices is the architecture; DevOps, specifically CALMS (culture, automation, lean, measurement, and sharing), is the culture; and continuous delivery is the process. All three aspects are required to achieve consistent throughput with low risk in your software-delivery process.

If you have optimized your software-development lifecycle and your delivery pipeline by adopting the tenets of cloud-native operability, that’s great. However, if your process for releasing and managing software is still filled with bespoke scripting, manual labor, and late-night outages, then you aren’t realizing the true potential of cloud-native operability. To realize that potential, you need an operationally mature platform.

Your platform is whatever environment manages the software you deliver, so by this definition everyone has a platform. The vital question is: how mature is your platform? I’ve developed a simple rubric for evaluating the operational maturity of production environments, which I call the Minimum Viable Platform. The rubric lists the minimum capabilities your production environment should exhibit to be considered an operationally mature platform (as Martin Fowler says, “You must be this tall to use microservices”):

Dynamic DNS, routing, and load balancing. No human intervention should be required to ship a new service, make it available to its users, or horizontally scale it.
Automated service discovery and brokering. Your environment should automatically provision new backing services, as well as authenticate and authorize new and existing backing services. This includes generating and managing credentials and network access; no humans should be generating, encrypting, or distributing this information.
Infrastructure automation. Having the API to programmatically manage storage, network, and computing resources isn’t enough. The automation to manage them should be baked into the environment your application is delivered to.
Health management, monitoring, and recovery. You should automate the health and well-being of your applications by building on mature infrastructure automation. If a host or application instance dies, the platform should notice and recover automatically.
Immutable artifact repository. Starting an app, or scaling it out, should be fast. To be fast you need to separate the build, deploy, and run phases of the application’s lifecycle. A repository of immutable artifacts, like containers or packages, allows for fast startup times (for elasticity) and ensures that the exact version of the application you built once will be deployed and run consistently everywhere.
Log aggregation. No one should need to SSH into a machine to find logs; that doesn’t scale. Running thousands of application instances across hundreds of nodes requires distributed log aggregation to produce a reliable fire hose that you can plug into analytics and event-management software. (This criterion may feel obvious, but well-managed, distributed, real-time log aggregation is much harder than it seems.)
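
To make the health management and recovery item concrete, here is a minimal sketch of the reconciliation idea: compare the number of healthy instances with the desired count and start replacements for anything that died. The Platform and Instance classes, the app-N naming, and the in-memory "scheduler" are all hypothetical stand-ins for illustration; a real platform would probe live processes and hosts rather than flip a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical application instance; `healthy` stands in for a real probe."""
    name: str
    healthy: bool = True


@dataclass
class Platform:
    """Hypothetical in-memory stand-in for a platform's scheduler."""
    desired: int
    instances: list = field(default_factory=list)
    _counter: int = 0

    def start_instance(self) -> Instance:
        # Each new instance gets a fresh name; dead instances are never reused.
        self._counter += 1
        inst = Instance(name=f"app-{self._counter}")
        self.instances.append(inst)
        return inst

    def reconcile(self) -> int:
        """Drop unhealthy instances and start replacements.

        Returns the number of instances started during this pass.
        """
        self.instances = [i for i in self.instances if i.healthy]
        started = 0
        while len(self.instances) < self.desired:
            self.start_instance()
            started += 1
        return started
```

Calling reconcile() on a fresh Platform(desired=3) starts three instances. Marking one of them unhealthy and calling reconcile() again removes it and starts one replacement, with no human involved. A real platform would run a pass like this continuously.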

You may need a few more capabilities for your specific context than are listed above, but you won’t need fewer. Every team I share this rubric with says that they are on the path to acquiring these capabilities by building, buying, or gluing together components.
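
As one illustration of what such a capability involves, the immutable artifact repository from the rubric can be sketched as a write-once store keyed by a content digest. This is only a sketch under stated assumptions: the ArtifactRepository class and its publish and fetch methods are hypothetical names, and the store is an in-memory dictionary rather than a real container or package registry.

```python
import hashlib


class ArtifactRepository:
    """Hypothetical in-memory, write-once artifact store (illustration only)."""

    def __init__(self) -> None:
        # Maps a SHA-256 hex digest to the exact bytes produced by a build.
        self._blobs = {}

    def publish(self, build_output: bytes) -> str:
        """Store a build output and return the digest that identifies it."""
        digest = hashlib.sha256(build_output).hexdigest()
        # Publishing identical bytes again is a no-op; a digest is never
        # rebound to different content, which is what makes it immutable.
        self._blobs.setdefault(digest, build_output)
        return digest

    def fetch(self, digest: str) -> bytes:
        """Return the artifact for a digest, verifying it on the way out."""
        blob = self._blobs[digest]
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError("artifact does not match its digest")
        return blob
```

Because the identifier is derived from the content, the deploy and run phases can refer to a digest and be confident they get the same bytes that were built once, on every node.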

Even with a mature platform as your foundation, delivering software as a distributed system is still challenging. Microservices emerged as an architectural practice, but using microservices doesn’t make it easier to build software. In fact, putting a network boundary, and with it the trade-offs of the CAP theorem, between every logical component (or service) in your architecture is more complex than not doing so. While microservices don’t make your software (or its scale) easier to manage, they do help your organization grow by building services along natural seams in your architecture, and by keeping your teams under the “two-pizza limit.” That’s how you get maximum value from microservices.
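
A small sketch can show why a service boundary adds complexity. Once a function call becomes a remote call, the caller has to decide what to do when the other side does not answer in time: fail the request (favoring consistency) or serve a possibly stale value (favoring availability). The PriceClient class, the remote_call parameter, and the pricing scenario below are hypothetical, and the remote service is simulated by a plain callable that may raise TimeoutError.

```python
class ServiceUnavailable(Exception):
    """Raised when the remote service cannot be reached in time."""


class PriceClient:
    """Hypothetical client for a remote pricing service (illustration only)."""

    def __init__(self, remote_call, prefer_availability: bool):
        # remote_call is any callable taking a SKU; it may raise TimeoutError.
        self._remote_call = remote_call
        self._prefer_availability = prefer_availability
        self._cache = {}  # last known good price per SKU

    def get_price(self, sku: str) -> float:
        try:
            price = self._remote_call(sku)
        except TimeoutError as exc:
            if self._prefer_availability and sku in self._cache:
                # Availability over consistency: answer with a possibly
                # stale value instead of failing the request.
                return self._cache[sku]
            # Consistency over availability: refuse to guess.
            raise ServiceUnavailable(sku) from exc
        self._cache[sku] = price
        return price
```

Neither branch existed when the pricing logic was an in-process function call. Every service boundary forces this kind of decision, which is the extra complexity the paragraph above describes.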

However, if you attempt a microservices approach without the DevOps culture of CALMS (see above), you will struggle. The increased technical complexity that comes with delivering microservices requires effective coordination and adaptive learning between teams. In other words, the way you build software is just as important as what you build. As Conway’s Law suggests, poor communication and structures within your organization will manifest themselves in the software you make. Culture is critically important to the success of microservices.

Likewise, if you currently manage just a few services and changing them is a painful process, that problem won’t get better as the number of services you manage increases. Automating your delivery pipeline is essential to a microservices approach, and often a natural byproduct of a healthy DevOps culture. If a cumbersome delivery process for monolithic software hurts a little, continuing that practice as you transition to microservices will cause death by a thousand cuts.

Your platform is what you rely on to deliver and manage cloud-native software. Microservices, DevOps, and continuous delivery offer a sustainable approach that scales with your organization and software. These three practices, supported by a solid platform, are how you achieve cloud-native operability.

This post is a collaboration between O’Reilly and Pivotal. See our statement of editorial independence.
