Chapter 4. What Does Cloud Native Look Like?

So far, we’ve looked at what .NET apps you’re running today, and what you’ve been asked to build tomorrow. What is the one constant running through almost every request you now get? Make the software more scalable, more adaptable to change, more tolerant of failure, and more manageable. That’s the essence of what it means to be “cloud-native.” In this chapter, we look at the ideas behind cloud-native architectures, and why they matter to your .NET applications.

Defining Cloud Native

You’ll find many different definitions of cloud native. The charter of the Cloud Native Computing Foundation states that cloud-native systems are “container packaged,” “dynamically managed,” and “micro-services oriented.” That’s too implementation-centric for my taste, but its official definition of cloud-native is more on point:

Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.

Pivotal uses a definition along those lines:

Cloud-native is an approach to building and running applications that exploits the advantages of the cloud computing delivery model. Cloud-native is about how applications are created and deployed, not where.

Joe Beda, one of the creators of Kubernetes, takes it a step further:

At its root, Cloud Native is structuring teams, culture and technology to utilize automation and architectures to manage complexity and unlock velocity.

There’s truth in all of these. But what you should take away from these definitions is that cloud native refers to how, not where. It’s about achieving better business outcomes through empowered teams that deliver more scalable, resilient, operable software.

Why Cloud-Native Matters

Why should your .NET apps be cloud-native? Does it really matter and is it worth the effort? What it boils down to is getting better at software: designing it, building it, running it.

For the purpose of this discussion, let’s equate “good at software” with “delivering cloud-native software.” By the end of this chapter, I hope you’ll agree with me.

Let’s look at five reasons why you need to be good at software.

Customers Expect It

You know what’s not cool anymore? Maintenance windows. Or annual software releases. Sluggish performance? Can’t have that. No, you expect every business you deal with—whether it’s your bank, neighborhood social network, streaming media provider, or your own employer—to deliver digital experiences that are always available, constantly updated, secure by default, and blazingly fast. That’s virtually impossible to achieve with software we wrote 10 years ago!

It Helps You Meet the Demands to Operate at Scale

If becoming cloud natives requires us to create more software and machinery to support it, we absolutely need to evolve our approach to operations. Organizations want to flip their spending ratio and invest more in innovation and less in maintenance. Noble goal! That cannot happen without doubling down on automation and reducing toil. And you can’t introduce massive amounts of automation without a culture, team structure, codebase, and platform that accommodates it.

It Gives You More Business Options

When you’re good at software, all of a sudden you have fresh opportunities. Telecommunications companies can react quickly to unmet needs by reconfiguring mobile data plans. Automobile companies can start ride-sharing or rental services. Manufacturing companies have the choice to sell machine data to third parties. And companies in all sectors can expand into new markets, run quick experiments, and find new revenue streams. All of this becomes possible, and likely, when you get good at software.

Your Competitors Are Improving

It’s rare to find pure monopolies today. Most of us have a choice of who we do business with in most aspects of our lives. This consumer control puts companies on notice: if you don’t give me the service I want, I’ll switch to someone who will! Your alternative might be a traditional competitor, customer-centric startup, or, increasingly, an internet giant with deep pockets and an eye for expansion. If you don’t learn how to deliver valuable software that meets or exceeds expectations, you’ll enter an irreversible decline.

It Makes Your Life Better

If you want, ignore all the preceding arguments. If for no other reason, improve your software acumen so that you enjoy work again. This is the best time ever to build software. Never have we been able to do so much with so little (relative) effort. If you’re miserable at work, something’s wrong. Become a cloud native so that you can offload mind-numbing operational tasks and get regular shots of dopamine from seeing people use the software you’ve built. Ship early and often, and use platforms that “run” the software effectively.

Characteristics of Cloud-Native Apps

Okay, I have you hooked. This all sounds great, you say. Can I just containerize my app and it will become cloud native? No, no you can’t. Let’s talk about what a cloud-native app looks like.

They Meet the 15-Factor Criteria

How you package your software doesn’t make it cloud native. It’s about the software itself! The now-famous 12-factor criteria call out traits for scalable, modern apps. They include items like explicitly declared dependencies, stateless processes, scale out versus scale up, fast startup and graceful shutdown, and treating logs as event streams. As my former colleague Kevin Hoffman says in his book Beyond the Twelve-Factor App (O’Reilly):

The goal of these 12 factors was to teach developers how to build cloud-ready applications that had declarative formats for automation and setup, had a clean contract with the underlying operating system, and were dynamically scalable.

Kevin added three factors to the standard list: API-first design, heavy use of telemetry, and security through authentication and authorization. Even though you don’t need to blindly adhere to all 15 factors, the more of them that you comply with, the more cloud-ready your software will be.
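To make one of these factors concrete, here is a minimal sketch of “treating logs as event streams.” It is written in Python only to keep it short; the same idea applies to a .NET app that writes to the console. The function name log_event and the field names are assumptions made for illustration, not part of any framework. The app never opens or rotates a log file. It writes one structured event per line to standard output and leaves collection and routing to the platform.

```python
import json
import sys
import time


def log_event(level, message, stream=None, **fields):
    """Write one structured event as a single JSON line to the stream.

    The stream defaults to standard output, so the app holds no
    opinion about log files, rotation, or where events end up.
    """
    stream = stream if stream is not None else sys.stdout
    event = {"ts": time.time(), "level": level, "message": message, **fields}
    stream.write(json.dumps(event) + "\n")
    stream.flush()  # emit immediately so the platform sees events as they happen
    return event


if __name__ == "__main__":
    # Hypothetical event; order_id is an illustrative field.
    log_event("info", "order accepted", order_id=42)
```

Because the events are plain lines on a stream, the same code runs unchanged on a laptop, in a container, or on a platform that forwards the stream to an aggregator.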

They’re Decoupled and Designed for Change

Microservices: you can’t stop hearing about them! Let’s ignore the hype. In reality, the point of this architectural paradigm is to decompose hard-to-change monolithic systems. Those decomposed components, or microservices, typically align with a business domain. A microservice isn’t defined by how many lines of code it contains, but by its single-purpose focus.

As you decompose systems into these bounded contexts, you get some benefits:

  • You now have a smaller deployment surface. Make targeted changes, and deploy each change without bundling up the entire system.

  • Microservices help you to scale your teams. Instead of all engineers working on a set of interwoven components, smaller teams can narrow their focus and adopt the change cycle that works for them.

  • By teasing apart your system, you get the opportunity to introduce new technologies with a limited blast radius. Maybe the entire system doesn’t need to use a document database instead of a relational one, but it makes sense for this particular microservice.

  • Microservices add value by supporting smarter deployments. Instead of coarsely scaling the entire system up or down, you have the choice of surgically scaling stressed components. This lets you keep an optimized infrastructure footprint and avoid adding capacity where it isn’t needed.

To be fair, a microservices architecture isn’t always the answer. You might be better off with a modular monolith, as mentioned in Chapter 2. But if you go down the path of microservices as a way to achieve the cloud agility you’re after, there’s a lot to consider. We discuss the specifics in upcoming chapters, but you’ll have a new series of questions to answer. What’s a repeatable way to uncover the boundaries of a service? How do I discover services at runtime? How can I avoid cascading failures? Is my current monitoring strategy set up for an explosion of things to monitor? Where do I start troubleshooting? Stay tuned for the answers.

They’re Continuously Delivered

If your software is continuously delivered, you might be a cloud native. Unlike continuous deployment—in which changes are automatically pushed to production—continuous delivery is about going to production whenever you want. You might stage deployments for business reasons, but the current version is ready at any time.

What does it take to get here? A fair bit. It all begins with tests. My Pivotal colleagues laid this out. To go fast (through continuous delivery), you need clean code. Bad code slows you down. To achieve clean code, you need to constantly refactor. To be brave enough to constantly refactor, you need confidence that you won’t break your running software. To have confidence, you need tests.

A continuous integration/continuous delivery (CI/CD) culture affects more than just your software team. It requires buy-in throughout the company. Cloud natives have that. There’s an institutional imperative to get value into the hands of customers as quickly as possible. At many enterprises, this is a fundamental shift. It changes how you fund IT, how you arrange teams, what skills you hire for, how marketing delivers the message, and so much more. But this improvement in responsiveness is game-changing for every company that employs it.

They’re Built and Run by Empowered Teams

This is where DevOps comes into the picture. And not the watered-down version of DevOps in which you just rename your release engineering team or add some fancy monitoring dashboards. No, I’m talking about a singular focus on customer value. The result? You have an aligned team that includes all the skills needed to design, build, and run the “service” offered to customers. You focus on small batches and regular releases so that you can quickly learn and improve the service. Your teams swarm on production issues, fix them without spraying blame, and thoughtfully consider how to prevent them from happening again.

Cloud natives do this. They don’t have a software factory where the work product is handed between silos. They don’t have production support teams responsible for dozens of individual systems. And they don’t arrange their work around IT projects. There’s no doubt that this model represents a change in how most big companies operate today. But the name of the game is “who learns from customers the fastest,” and the way to win is to organize and empower your teams to focus on the customer experience.

They’re Resilient in the Face of Failure

Everything fails. You can’t prevent hardware, networks, software, or facilities from going down or experiencing disruptions. It will happen. Cloud-native apps laugh in the face of failure! They not only expect failure, they purposely inject it into the system to see what happens.

Cloud natives create software services that stay online in virtually all circumstances. Do they do that by provisioning premium hardware and gold-plated databases? No. Frankly, they do it with commodity hardware and custom-built or open source software. But the key is how they use that technology. It’s about smart redundancy. They use modern databases (and caches) that tolerate network partitions and scale rapidly. These cloud natives use well-instrumented systems and ubiquitous automation to detect problems and respond immediately. And even if all those things fail, they deploy via automation so that they can stand up cloned environments in short order.

The other resilience angle relates to intentionally trying to break things. Even in production. Chaos engineering is about “experiments to uncover systemic weaknesses.” This software engineering discipline is about continuous improvement and recognizing that in complex distributed systems, we need to constantly probe for weaknesses.
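As a toy illustration of that mindset, the sketch below injects failures into a dependency on purpose and checks whether the caller still succeeds. Everything here is an assumption made for illustration: the names FlakyDependency, call_with_retry, and run_experiment are hypothetical, the sketch is in Python for brevity, and the fault is injected on a fixed schedule so that the experiment is repeatable. Real chaos experiments target running infrastructure rather than a function wrapper, but the shape is the same: state what you expect, inject a failure, and compare.

```python
class FlakyDependency:
    """Wraps a callable and fails every `fail_every`-th call (the injected fault)."""

    def __init__(self, func, fail_every):
        self.func = func
        self.fail_every = fail_every
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        if self.calls % self.fail_every == 0:
            raise ConnectionError("injected failure")
        return self.func(*args, **kwargs)


def call_with_retry(func, attempts=3):
    """Call func, retrying on ConnectionError up to `attempts` times in total."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as error:
            last_error = error
    raise last_error


def run_experiment(requests=100, fail_every=3):
    """Return how many requests succeeded while faults were being injected."""
    price_lookup = FlakyDependency(lambda: 9.99, fail_every)
    succeeded = 0
    for _ in range(requests):
        try:
            call_with_retry(price_lookup)
            succeeded += 1
        except ConnectionError:
            pass  # the request failed even after retries
    return succeeded


if __name__ == "__main__":
    print(run_experiment(requests=100, fail_every=3))
```

With every third call failing, each failure is followed by a successful retry, so every request should get through. Setting fail_every to 1 makes the dependency fail on every call, which shows the point at which retries alone stop helping and a fallback is needed.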

Thinking Beyond “Apps” for Cloud-Native Software

Thus far, we’ve looked at cloud-native applications. There’s more to the story than that. I often consider at least four other areas where cloud-native comes into play: infrastructure, security, data, and integration. If you leave these out of your strategy, you’ll find that you’re still experiencing a constraint that limits your velocity and quality.

Can you have cloud-native infrastructure? Sure you can. Cloud-native infrastructure, as defined in the book of the same name, is “hidden by useful abstractions, controlled by APIs, managed by software, and has the purpose of running applications.” This is about software-controlled infrastructure that results in more consistent provisioning, improved resilience, and simpler maintenance. Using a public cloud doesn’t automatically mean that you’re using a cloud-native infrastructure approach. Not if you log in to individual machines, build servers via tickets and portals, and colocate all your apps on a few giant servers.

Your existing security strategy might not survive a cloud-native transformation. Cloud-native security reflects the fact that you need to “move fast to stay safe,” as Pivotal Chief Security Officer Justin Smith likes to say. Malware and advanced persistent threats are evolving faster than ever. Leaked credentials continue to cause major issues. And the monitoring-centric approach isn’t good enough. It’s time to become more proactive. At Pivotal, we talk about the 3 Rs:

  • Quickly repair vulnerable software and infrastructure. In the cloud, that might mean being able to patch multiple times per day.

  • Repave your infrastructure constantly to eliminate hiding places for malware and stay in a consistent, patched state.

  • Finally, rotate credentials regularly. Shrink the amount of time that credentials are useful.

All of these combine to reduce your risk in a cloud-native world.
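Repaving and rotating both come down to the same policy check: find anything that has lived longer than you allow, then replace it. Here is a minimal sketch of that check, in Python for brevity. The function name older_than, the inventory data, and the age limits are all hypothetical. In practice the timestamps would come from your platform or secrets store, and the replacement itself would be automated.

```python
from datetime import datetime, timedelta


def older_than(created_at_by_name, max_age, now):
    """Return the names whose age exceeds max_age, in sorted order."""
    return sorted(
        name
        for name, created_at in created_at_by_name.items()
        if now - created_at > max_age
    )


# Hypothetical inventory, fixed in time so the example is repeatable.
NOW = datetime(2024, 1, 10, 12, 0)
SERVERS = {
    "web-1": datetime(2024, 1, 10, 8, 0),   # built 4 hours ago
    "web-2": datetime(2024, 1, 3, 8, 0),    # built more than 7 days ago
}
CREDENTIALS = {
    "db-password": datetime(2024, 1, 10, 9, 0),  # issued 3 hours ago
    "api-token": datetime(2024, 1, 8, 12, 0),    # issued 2 days ago
}

# Assumed policy: repave servers after 3 days, rotate credentials after 24 hours.
servers_to_repave = older_than(SERVERS, timedelta(days=3), NOW)
credentials_to_rotate = older_than(CREDENTIALS, timedelta(hours=24), NOW)
```

The tighter you can make the age limits, the shorter the window in which a compromised server or leaked credential stays useful to an attacker.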

You won’t achieve your desired outcomes if you transform how you build apps but keep the same data strategy. You need a cloud-native data approach. Your databases and data processing must be biased toward changeability, scalability, resilience, and manageability. That means offering different types of databases (relational, key/value, document, graph, caches, and more) for different microservices. When you start having databases scoped to a given microservice, you need to rethink how you provision, update, and manage all these instances. How you collect, transmit, store, and interact with data will change. Be ready!

After you “solve” the throughput issues related to infrastructure, security, apps, and data, you’ll find one more holdout: integration. Much of my career was spent in the application integration space. I saw most companies invest in centers of excellence with expert resources who programmed complex, powerful integration products. The problem? Those teams (and tools) become bottlenecks. If everything can’t be continuously delivered, I can move only as fast as my constraint. To rethink how you connect systems together, you’ll want to introduce cloud-native integration. Wherever possible, integration should be self-service, distributed (not centralized), built to scale, open to changes, and delivered via automation. That’s not an easy task, but it’s one that will pay real dividends.

Measuring Your Progress Toward Becoming Cloud Native

Are you actually getting better at building software? Are you functioning as a cloud native yet? How you answer that question is critical. Measure your progress through outcomes, not output. Just as “lines of code” doesn’t mean you’re a more productive software developer, neither does “deploys per day” necessarily mean you’re improving in the right ways. What is a useful mental model for measuring your progress?

At Pivotal, we talk about 5 S’s: speed, stability, scalability, security, and savings.

Are you learning and responding faster? That’s speed. You can measure your improvement in lead time, the time from order/request to final delivery. If you are getting ideas and bug fixes to production faster, that’s a tangible thing to measure. Here’s one key metric: how many apps are on pipelines. That’s an indicator that you can quickly deploy code.

Stability? Keep an eye on uptime, and resilience in the face of failures. High performers have a constantly improving mean time to recovery. Customers see less downtime, even if underlying components stumble.

Companies observe scalability improvement in a few places. Individual systems and services handle increasing traffic with consistently low latency. Measure the time it takes to add or remove capacity in seconds, not weeks or months.

What are the right security metrics to monitor? Consider how long it takes to patch apps and infrastructure. Or what percentage of apps and infrastructure are 100% up to date on patches. And don’t forget about how long your servers live, or how often you cycle credentials.

Finally, if you’re good at software, you’re going to save money. Oh, you might find yourself spending more money because you create new computing environments and write more software. But the cost per unit decreases as you automate infrastructure, deliver work incrementally, and build in security up front.
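Two of these measures, lead time and mean time to recovery, reduce to simple arithmetic on timestamps you probably already record. The sketch below, in Python for brevity, shows one way to compute them. The function names and the sample data are hypothetical; real inputs would come from your ticketing, pipeline, and incident tools.

```python
from datetime import datetime
from statistics import mean, median


def lead_times_hours(changes):
    """Hours from request to production delivery for each (requested, delivered) pair."""
    return [
        (delivered - requested).total_seconds() / 3600
        for requested, delivered in changes
    ]


def mean_time_to_recovery_minutes(incidents):
    """Average minutes between detection and restoration for (detected, restored) pairs."""
    return mean(
        [(restored - detected).total_seconds() / 60 for detected, restored in incidents]
    )


# Hypothetical sample data for illustration only.
CHANGES = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 3, 9, 0)),   # 48 hours
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 21, 0)),  # 12 hours
    (datetime(2024, 1, 4, 9, 0), datetime(2024, 1, 5, 9, 0)),   # 24 hours
]
INCIDENTS = [
    (datetime(2024, 1, 6, 10, 0), datetime(2024, 1, 6, 10, 30)),  # 30 minutes
    (datetime(2024, 1, 7, 14, 0), datetime(2024, 1, 7, 14, 10)),  # 10 minutes
]

median_lead_time = median(lead_times_hours(CHANGES))
mttr = mean_time_to_recovery_minutes(INCIDENTS)
```

The median is used for lead time here so that one unusually slow change doesn’t dominate the picture. What matters more than either number is the trend: track both over time and look for steady improvement.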

Summary

There’s even more to consider here. For an exceptional look at how to measure the right things in your software transformation, pick up the book Accelerate by Nicole Forsgren and team. It will definitely help you focus your attention in the right places. In Chapter 5, we take a closer look at how you choose between the .NET Framework and .NET Core for your cloud-native software.
