Chapter 4. Preventing Risk Early

Oh, East is East and West is West, and never the twain shall meet.

Rudyard Kipling

The twain shan’t meet, but collaborate.

Anonymous

An ounce of prevention is worth a pound of cure.

Benjamin Franklin

Never was so much owed by so many to so few.

The Right Honorable Sir Winston Churchill

“More meetings?” came the cry. “Please, no more meetings!”

It was a valid plea. The call had gone out for the development teams to work more closely with the security teams, and our work calendars had just been filled with architecture and design reviews, security audits, code security reviews, gates, and requests for our future release plans. There was starting to be very little time left to actually write any code.

We all knew security was important, critical even. But how security happened, how it was brought to and actioned by development teams, was the question. More meetings couldn’t be the way. Nor could a cacophony of new reviews. There had to be a better way. There had to be.

And there is. In this chapter, you’re going to see how your CNAPP is the collaborative glue between your development teams and your security-focussed teams, helping you get ahead of the game from the very first moments of your software development lifecycle, as shown in Figure 4-1.

Figure 4-1. The focus of this chapter is on how your CNAPP can bring security awareness to your development teams and the resources they are responsible for

The Interface Between Security and Development Work

The interface between development and security used to be vexing for everyone. Too little, too late; too much theater, too little impact; zero collaboration, tons of conflict. The unspoken belief was that developers like change and risk, and security engineers like stability and no risk, and that these polar opposites then try to ignore one another, occasionally butting heads in an incident or project planning session.

The goals of security and development look fundamentally opposed. The development teams are responsible and accountable for delivering high-quality functionality quickly. The security folks seem to stand in the way of that. The perception was that security teams used tools such as gates and reviews in ways that made security the reason why developers couldn’t continuously deliver, and might not be able to deliver at all, as shown in Figure 4-2.

Figure 4-2. The friction-filled relationship between security-focussed teams and development teams

The good news was that we’d seen this before. There used to be two other groups of people that stood in the way of delivery, facing off in a toxic relationship with the development teams. Testing and quality assurance (QA) teams and operations teams used to also take the formidably difficult stance of gatekeepers on production, and nothing was going to get past them if it didn’t satisfy their requirements. If it didn’t get their approval, or their signoff, then you’d end up with the situation shown in Figure 4-3.

Figure 4-3. The friction-filled old interface between the goals of development teams and QA and operations teams

But that had gradually changed. Testing and QA roles and perspectives had been brought earlier and earlier into the software development lifecycle, moving QA from being a post-development pain to being the driver of the development itself through automated Acceptance Test-Driven Development and practices such as Specification by Example. Embedding quality assurance practices into the development team’s process unleashes the possibility of continuous delivery, as confidence can be continually built and maintained in the quality of what is being produced.

Then, DevOps was born, and a similar effect removed the operations silo’s negative effects. DevOps encouraged close collaboration between the development teams and the operations teams, sometimes even merging the two roles and sets of responsibilities into autonomous delivery teams responsible for continuously delivering and operating their systems, and accountable for their success, as shown in Figure 4-4.

Figure 4-4. The roles and responsibilities for assuring quality and operating the systems worked closer and closer together until, in some cases, a multi-functional, autonomous team could be responsible for writing and running their own systems

To do the same for security, the lessons that won out for testing, QA, and operations could be applied again. Perhaps it wouldn’t be as clean as embedding security people in development teams, but maybe a more collaborative relationship could be established. Maybe a new mode of interaction between security and development could be achieved.

DevSecOps was born, as discussed in Chapter 1, and it required development and security teams to work more closely than ever before.

Comparing the Developer and Security Domain Languages

The first step to help your security and development teams work together is to help them understand one another. Both groups need to understand that they don’t just have different goals; they use different languages. There is a ubiquitous language of security that is populated with security policies, security threats, threat models, and arcane references to compliance standards, and there is a ubiquitous language of development that is full of objects, functions, containers, orchestrators, services, and service interfaces, not to mention business domain knowledge, too.

In Domain-Driven Design, these two distinct ubiquitous languages are helpfully placed within two bounded contexts. A bounded context simply draws a line around people who work close enough together to use the same language, syntax, and semantics, as shown in Figure 4-5.

To help two teams or groups of people, security and development in our case, to work together successfully, you first need to know that they use different languages to describe things. Security and development have their own bounded contexts.

Figure 4-5. A sample of the different ubiquitous languages of security-focussed and development-focussed teams

The next step is to find a way to successfully bridge across those bounded contexts. You need a way to adapt concepts in both bounded contexts so that the two groups can communicate, and you need to set up the protocols and mechanisms so that they can effectively work together. You need to construct anti-corruption layers and team interaction modes.

CNAPP as an Anti-Corruption Layer

I once asked a relationship counselor what the most difficult part of their job was. Given that they are dropped between two people who are trying to repair a relationship under stress, what was the biggest lesson they had learned?

“It’s a miracle two people can communicate, period,” was their answer.

Given two examples of the most complex structure that we know of in the universe, i.e., two brains, the fact that through a harsh and limiting protocol of grunts and squeaks these two consciousnesses can convey subtlety and meaning is a minor natural miracle. Realizing that this is a terribly error-prone medium is often the first step in two people saving their relationship, or at least coming to terms with whatever is next for that relationship.

Security and development teams have all the same challenges as a difficult relationship under stress. Both have admirable goals but, when poorly connected, those goals can quickly drift into consternation and conflict. This is where another Domain-Driven Design concept can help us: the anti-corruption layer (ACL).

As its name suggests, an ACL exists at the boundary of a bounded context and attempts to prevent the confusion of one ubiquitous language accidentally permeating another bounded context, as shown in Figure 4-6.

Figure 4-6. How anti-corruption layers can translate and mediate between two bounded contexts, respecting the ubiquitous languages used by the two groups of people and ensuring they are not confused when the groups work together

Think of an ACL as the gatekeeper and translator between one world and another, or as a relationship counselor mediating between two people who have the potential to confuse one another. This is exactly what is needed to mediate between the worlds of your development and security teams to help them communicate effectively, and this is exactly the job that your CNAPP does, as shown in Figure 4-7.

Your CNAPP aims to bridge the worlds of development and security so that your security teams can use the language concepts that they are familiar with to define security policies in relation to important security threats and compliance frameworks. Your CNAPP then bridges that security-centric view to the concepts that your development teams work in, i.e., code for infrastructure, applications, and communications.
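To make the translation idea concrete, here is a minimal sketch of an anti-corruption layer in Python. It is not how any particular CNAPP is implemented; the class names, fields, and the resource-to-source mapping are all hypothetical, chosen only to show a finding expressed in the security language being turned into a warning expressed in the development language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyFinding:
    """Security-context view of a problem (hypothetical shape)."""
    policy_id: str
    policy_name: str
    framework: str
    severity: str
    resource_id: str


@dataclass(frozen=True)
class CodeWarning:
    """Development-context view: a file, a line, and a message."""
    file_path: str
    line: int
    message: str


class AntiCorruptionLayer:
    """Translates findings into warnings so neither vocabulary leaks into the other."""

    def __init__(self, resource_locations: dict[str, tuple[str, int]]):
        # Assumed lookup: resource identifier -> (source file, line) where it is declared.
        self._locations = resource_locations

    def to_code_warning(self, finding: PolicyFinding) -> CodeWarning:
        file_path, line = self._locations[finding.resource_id]
        message = (
            f"[{finding.severity}] {finding.policy_name} "
            f"({finding.policy_id}, {finding.framework})"
        )
        return CodeWarning(file_path=file_path, line=line, message=message)
```

The design choice worth noticing is that the security team never needs to know about files and line numbers, and the developer never needs to know how the policy was authored; the layer in the middle owns the mapping.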

As an ACL, your CNAPP helps both your developers and your security-focussed teams communicate—this is step one. Beyond communication, step two is to figure out how the two groups can work together, and for that, you need to decide on the most effective interaction mode for the inter-team relationship.

Figure 4-7. Your CNAPP provides the mediation and translation between the worlds of development teams and security teams, sharing the security context between both domains

Respecting the Goals of Effective Security and Development Teams

Your CNAPP, as an ACL, enables your security and development teams to communicate, but that’s just half the equation. You want the two teams to work together, to collaborate, and to do that, you need to understand how each group does their work effectively. What does a good day look like3 for a security-focussed team and a development team?

The good news is that the days look quite similar, and additionally, the aspects that ruin those good days also look the same for both groups.

A great development team looks to generate a confident flow of value for their customers. The metrics used by Google’s DevOps Research and Assessment (DORA)4 team provide the pointers as to what that confident flow should measurably look like:

Deploy frequently

As it says on the tin, how frequently can you confidently deploy change to production? The more frequently you can, the better.

Low lead time for changes

How long does it take change to flow through your teams into production? The longer this delay, the more likely changes will be batched, the more likely a change will risk failure, and the more frustrated your development teams will be.

Low change failure rate

How often do you have to roll back a change? Or roll it forward? How often do you need to fix something after you have confidently deployed it? The more often, the more frustrating for your development teams.

Low mean time to recover (MTTR)

How long does it take you to recover from a problem? Whether it’s a small error or a big problem, how long does it take you to notice, adjust, align, and ship the fix? This often relies on a high deployment frequency and a low lead time for changes, and the faster, the better.

You could use the exact same set of metrics for your security teams, adding that everything should be done securely. Change should be shipped frequently and securely, with a low lead time for change, a low rate of change failures due to insecurity, and a low mean time to recover from security problems. This is the opposite of the old perspective of security as a gate. Security is there to support the development team’s goals, not stand in the way of them.
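If you want to see what measuring these four metrics might look like, the following is a minimal sketch. The `Deployment` record and its fields are assumptions made for illustration, not a format that DORA or your CNAPP prescribes, and real measurements would come from your delivery tooling rather than hand-built records.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it need a rollback or a fix?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Summarize the four metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    recoveries = [
        _hours(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(_hours(d.committed_at, d.deployed_at) for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": mean(recoveries) if recoveries else 0.0,
    }
```

One way to get the security reading of the same numbers would be to tag each failure with its cause and count only the security-related ones in the failure rate and recovery time; that tagging is left out here to keep the sketch short.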

To generate this confident and secure flow of value, development teams need to focus. Every meeting, every notification, is a potential distraction. Development is deep work, so distractions have a disproportionate impact on performance. A distracted development team is a frustrated and unproductive development team. The most important commodity to protect is their attention as they seek to generate that flow of valuable change.

But your security-focussed teams need to work together with your development teams. How do you avoid that interaction between teams becoming a distraction? How do you stop security being the reason flow is broken? How do you stop the cries of “Not more meetings, please!”?

Team interaction modes from Team Topologies, that’s how.

Team Interaction Modes

The Team Topologies5 system, developed by Matthew Skelton and Manuel Pais, defines several different types of teams based on what those teams focus on:

Stream-aligned teams

Teams focussed on generating a flow of valuable change, i.e. a stream, often in support of a specific set of business needs. A good example here might be the teams responsible for delivering a payment API for a bank.

Platform teams

Focussed on providing platforms of useful services that make everyone else’s life easier. A good example here might be a team responsible for providing infrastructure as a service, e.g., managing Kubernetes clusters for other teams to utilize.

Complicated-subsystem teams

Focussed on managing a particularly complex area of a technology landscape, building deep knowledge of that area in the process. A good example here could be the team charged with understanding and evolving a complex legacy core banking system.

Enabling teams

Focussed on helping other teams do something, learn something, or simply handle some of the work that the teams would rather not be burdened with. A good example here might be a team responsible for promoting quality practices across other teams, perhaps focussing on security awareness and practices.

Your development teams are likely a combination of all four types of teams, and your security-focussed teams are likely operating like an enabling team. Recognizing the types of teams you have, though, is less important here than understanding the different ways they may work together. Those interactions are captured as three different modes of team interaction:

Collaboration

Two teams closely working together for a defined period of time to meet a shared goal.

X-as-a-service

A team provides something “as a service.” An example here might be where a team provides a managed Kubernetes cluster or a higher-level billing system as a service that other teams simply consume. The style of interaction would be requests for changes to these services, rather than direct collaboration between team members.

Facilitation

An interaction mode where one team helps, consults, advises, or even mentors another team.

Team interaction modes can change over time. What starts as close collaboration can shift to an ongoing facilitation relationship, or even evolve to one team providing some product “as a service” to another team.

Only you can decide what is the best interaction mode between your security-focussed teams and your development teams. You could find that close collaboration is necessary right now, for a specific amount of time, while security is embedded into the daily practices and flow of delivery of your development teams. Over time, you might evolve that to a more hands-off facilitation relationship and then, to help things scale, you might look to shift some of that facilitation and collaboration into services that can directly support secure development practices.

It is exactly this evolution and balance between collaboration and security as a service that your CNAPP aims to strike. Your CNAPP is a platform of security services that don’t look to get in the way of your development team’s flow, but support them in applying security as they do their work.

Your CNAPP opens up the possibility for close collaboration when necessary between your security teams and development teams, and as-a-service scalable support for business as usual. Your CNAPP brings security carefully to your development teams, as early as possible in the development process; it brings it while maintaining flow, securing your developers’ work without causing friction.

It’s now time to see how that’s done. It’s time for a day in the life of a security-aware development team.

CNAPP as a Development Collaborator

In the beginning, there was the deluge. In Chapter 3, we’d uncovered a shadow cloud and, throwing the light on with our CNAPP’s security policies, helpfully framed by our security threat and compliance frameworks, we now could see the size of our problem. We’d observed, oriented, and aligned. It was now time to act.

In cloud native systems, almost everything is code. It’s always been the case that development teams have wrestled with custom development code as part of their daily work, but in a cloud native engineering ecosystem, infrastructure is also code. This makes the surface area for vulnerabilities, and for your CNAPP to help, broad and deep; it needs to surface vulnerabilities and calls to action across all source code assets.

Inspecting Your CNAPP Policies

Taking a peek at the security policies in your CNAPP gives you a view of what security policies and frameworks you have that relate to actions you might be able to take on your codebases, as shown in Figure 4-8.

Figure 4-8. A sample screenshot from the Prisma Cloud CNAPP view of security policies

You can drill down to only those policies that affect the early stages of your software development lifecycle. In Figure 4-8, this has been done by selecting the CNAPP security policies that relate to the build phase. Each of these policies is ready to examine your source code, from infrastructure as code through to custom application code, for vulnerabilities, bringing the security context to any issues it finds, including the level of severity and the relationships involved.

Each issue your CNAPP finds is labeled as a resource policy violation. Violations combine all the information about the policy with the alerts and actions that you might take to resolve the vulnerabilities observed in the resource.

That’s all well and good, but if you had to keep checking back to your CNAPP as you built your code, that would a) be too late, as you will have already committed the code to your version control system, and b) make for a noisy, clunky experience that would get in the way of your development flow. The views in your CNAPP are great at bringing the broader picture to light when you want to see how your codebases relate to vulnerabilities across your whole software development lifecycle, but it would be better if you didn’t need to check there in the first place. Shifting left shouldn’t mean constantly seeking information in yet another UI, even one as useful as your CNAPP’s.

Security awareness needs to be where you work. It needs to be in your IDE and your VCS.

Surfacing Security Where You Work

Developers primarily work on the command line, in their IDEs, and, after a git push, in their version control system (VCS), e.g., Git.6 To truly shift security awareness and action left, your CNAPP needs to embrace these locations and become a great collaborator among your existing toolset, as shown in Figure 4-9.

Figure 4-9. An overview of all the types of developer workflow tool integrations you can expect from your CNAPP

From the command line, you can use CNAPP-aware tools such as Checkov7 to surface violations directly in your local codebase as you manipulate, compile, and interpret your source code files. Figure 4-10 shows Checkov detecting misconfigurations and other violations in your local Terraform code.

Figure 4-10. Checkov provides summaries and detailed security context from your CNAPP in your command line so you can inspect violations in your local source code files

Checkov plugs into your CNAPP and uses the extra security context to help you navigate and understand the violations you might be seeing, or creating, in your local code, especially that bane of every IaC developer: misconfigurations. That’s a helpful start, but by jumping into the IDE, your CNAPP-aware tools can do more—not just tell you there is a violation, but offer you an immediate fix.
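Before moving into the IDE, here is a minimal sketch of what scripting that command-line step could look like. It assumes Checkov is installed and on your path, and it assumes the JSON report is either a single object or a list of objects that each carry a `results.failed_checks` array with `check_id`, `resource`, `file_path`, and `file_line_range` fields; treat that shape as an assumption to verify against the version you run.

```python
import json
import subprocess


def run_checkov(directory: str) -> object:
    """Run Checkov against a directory of Terraform and return the parsed JSON report."""
    completed = subprocess.run(
        ["checkov", "-d", directory, "--framework", "terraform", "-o", "json"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit can simply mean "violations found"
    )
    return json.loads(completed.stdout)


def failed_checks(report: object) -> list[dict]:
    """Flatten failed checks from a report (one dict, or a list of them)."""
    reports = report if isinstance(report, list) else [report]
    failures = []
    for entry in reports:
        for check in entry.get("results", {}).get("failed_checks", []):
            failures.append(
                {
                    "check_id": check.get("check_id"),
                    "resource": check.get("resource"),
                    "file": check.get("file_path"),
                    "lines": check.get("file_line_range"),
                }
            )
    return failures
```

A call such as `failed_checks(run_checkov("."))` would then give you a compact list to print or filter, which is roughly the information the terminal summary shows you.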

Security Awareness and Immediate Fixes in the IDE

It’s rare that a developer will do a whole lot of work in the command line; they prefer to jump into their integrated development environment as soon as working on any code is required. This is the second integration point for your CNAPP: integrating with your developer’s IDE to bring security awareness to the code they are working on right now.

Plugging your CNAPP into your IDE means you can see violations just like any other warnings in your code, accompanied by that useful security awareness and prioritization too, as shown in Figure 4-11.

Each warning links through to the CNAPP’s security policy and security context; you might need to decide whether to skip fixing the violation for now, or look to apply a fix straight away, as shown in Figure 4-12.

Figure 4-11. The very same security violations are surfaced in an IDE such as Visual Studio Code8 as source code warnings

Figure 4-12. Your CNAPP’s integration doesn’t just show you the problems; it also offers potential solutions if they are available. Here, you can see VS Code’s integration with a CNAPP, offering the possibility to add a skip comment for a particular violation, leaving the problem to fix later, or allowing you to apply a fix right away.
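Skip comments are convenient, but "fix later" has a habit of becoming "fix never," so it can help to keep an inventory of them. The sketch below assumes the skip comments follow Checkov’s inline form, `checkov:skip=<check id>:<reason>`, inside a `#` comment; the function names are hypothetical, and the check ID in the usage note is only an example.

```python
import re
from pathlib import Path

# Matches e.g. "#checkov:skip=CKV_AWS_20:Public static content host"
SKIP_PATTERN = re.compile(r"#\s*checkov:skip=\s*([A-Za-z0-9_]+)(?::\s*(.*))?")


def find_skips(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, check id, reason) for every skip comment in the text."""
    skips = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = SKIP_PATTERN.search(line)
        if match:
            reason = (match.group(2) or "").strip()
            skips.append((number, match.group(1), reason))
    return skips


def audit_directory(root: str) -> dict[str, list[tuple[int, str, str]]]:
    """Collect skip comments from every Terraform file under root."""
    found = {}
    for path in sorted(Path(root).rglob("*.tf")):
        skips = find_skips(path.read_text(encoding="utf-8"))
        if skips:
            found[str(path)] = skips
    return found
```

Running `audit_directory(".")` from the root of a repository would list each suppressed check alongside the reason its author gave, which makes a useful agenda for the occasional cleanup.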

This integration into your IDE really starts to make your CNAPP a collaborator in your development work, especially when you explore the immediate fixes that your CNAPP can offer. Your CNAPP smooths the flow by bringing these fixes directly into where you’re working, and these fixes can be as simple as static code snippets for you to apply and customize if necessary, while some of the more advanced CNAPPs learn from your own codebase to surface common fixes particular to your system. At this point, your CNAPP is working like a close collaborator on your codebase as early as possible (i.e., the moment you are working on your code). You can see security problems arise as you type, scoped to the files you are currently working on, helping you avoid issues from the very first lines of code.

But there’s another opportunity in your development workflow for your CNAPP to surface violations and fixes scoped not just to where you are currently working in your IDE, but also scoped across a whole set of changes. In GitHub terminology, a collection of changes to be applied together is called a pull request (PR). A PR is a package of changes created by a developer when a change is ready to be considered for addition to the main trunk of your source code. The packaged change is ready for review, and that review is another opportunity for your CNAPP to collaborate.

When a PR Is Born

When a developer has finished a collection of changes, they can be packaged up and pushed to their version control system, and a review can take place before those changes are added to the source code being built and deployed by the rest of the CI/CD pipeline.9 In the Git and GitHub source code management system, this is called raising a pull request, and it is another opportunity for your CNAPP to bring security awareness to your changes.

Your CNAPP takes all the changes being made as part of the pull request and acts as a collaborator, reviewing the source code changes to make you aware of any violations it sees, as shown in Figure 4-14.

By integrating with your VCS, not only are you able to surface the CNAPP’s security context and violations as you write your code in your IDE, but there is also a helpful safety net around your PRs that will highlight any violations still working their way into the codebase when your changes are being prepared to be built, deployed, and released.

And the same opportunity for intervention at this stage is available to you, too. In Figure 4-14, your CNAPP is alerting you that encryption is not turned on for an AWS S3 bucket, and if you look closely, you can see that it is suggesting a fix (in the green block), just like a human collaborator on the PR might.

Figure 4-14. Your CNAPP raising the same misconfiguration violations at PR review time, directly in GitHub

Just as with a human collaborator, you could choose to take that suggestion as part of your PR’s review, adding the fix prior to your code being merged into the main trunk code and beginning its march to release through your build and delivery pipeline.
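The PR-scoped review described in this section can be approximated locally. The following sketch is not how a CNAPP’s VCS integration works internally; it simply assumes a base branch named `origin/main`, asks Git which files the branch changed, and builds a Checkov command that scans only those files. The list of file suffixes is an illustrative assumption.

```python
import subprocess

# Suffixes treated as infrastructure as code for this sketch (an assumption).
IAC_SUFFIXES = (".tf", ".yaml", ".yml", ".json")


def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed on the current branch relative to base, skipping deletions."""
    output = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=d", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [line for line in output.splitlines() if line]


def checkov_command(files: list[str]) -> list[str]:
    """Build a Checkov invocation that scans only the changed IaC files."""
    targets = [name for name in files if name.endswith(IAC_SUFFIXES)]
    if not targets:
        return []  # nothing in this change set for Checkov to look at
    command = ["checkov"]
    for name in targets:
        command += ["-f", name]
    return command
```

Scoping the scan to the change set keeps the feedback focused on what the PR author can actually act on, which is the same reason the PR integration reports on the changes rather than the whole repository.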

Checks and Balances in the Build

The final place in the developer’s workflow where your CNAPP can function as a collaborator is in your continuous integration and delivery pipeline itself. At this point, all of the codebase is being built, and so it’s the last moment for a developer to intervene and make a correction if a violation is still present.

Your CNAPP integrates with your continuous integration and delivery tools to provide those last-minute checks for violations, so that your CNAPP’s security awareness can be brought into your automated builds, as shown in Figure 4-15.

Figure 4-15. CNAPP checks being applied and potentially blocking an automated build and deployment from your CI/CD pipeline

If a violation is found in your CI/CD checks, then your CNAPP can provide a view of all the violations in that build, even beyond the collections of changes that might have contributed to triggering the build in the first place. Jumping back to your CNAPP from your build checks, you are presented with a global view of all the violations in that build, so you can decide what to do next. Cancel the build or continue? With the context provided by your CNAPP, you can make that decision confidently.
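The "cancel or continue" decision can be made explicit as a small piece of policy. This is a minimal sketch under stated assumptions: the `Violation` shape, the severity names, and the default threshold are all hypothetical, and a real pipeline would take its violations from your CNAPP’s build checks rather than from hand-built objects.

```python
from dataclasses import dataclass

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


@dataclass(frozen=True)
class Violation:
    """A violation reported for a build (hypothetical shape)."""
    policy_id: str
    severity: str
    introduced_by_this_change: bool


def gate_decision(
    violations: list[Violation], block_at: str = "HIGH"
) -> tuple[bool, list[Violation]]:
    """Return (proceed, blocking violations) for a build.

    A violation blocks when its severity is at or above block_at. Blocking
    violations introduced by the current change are listed first, since they
    are the ones the author can fix most readily.
    """
    threshold = SEVERITY_ORDER.index(block_at)
    blocking = [v for v in violations if SEVERITY_ORDER.index(v.severity) >= threshold]
    blocking.sort(key=lambda v: not v.introduced_by_this_change)
    return (len(blocking) == 0, blocking)
```

Separating newly introduced violations from pre-existing ones mirrors the global view described above: the build may surface problems that the triggering change did not cause, and the team still needs to decide what to do about them.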

Scope, Feedback, and (Helpful) Blame

Collectively, all the integration points between your CNAPP and your development work offer security scanning safety nets. Each point represents an opportunity for you to get timely, scoped feedback on resource policy violations and, when possible, recipes for how to immediately fix them.

But sometimes the fix alone would be missing a trick; sometimes you want to turn your fix into a learning opportunity for someone in the development team. Perhaps the fix is for a very common problem, or maybe it is such a glaring omission that it’s worth telling someone about it before anything is fixed. In these cases, your CNAPP can hook into broader feedback loops (if your VCS supports this) to turn a potential fix into training for a developer.

If you are using Git, then your CNAPP could hook into the Git blame functionality to provide just that learning opportunity. Git blame sounds bad, but it is actually just a mechanism for surfacing who worked on a particular piece of your code. From your CNAPP, it is possible to use Git blame to surface the developers who worked on a particular piece of code, or to focus on a developer and the types of violation they tend to create. You can then reach out to that person with a reference to the fix in your CNAPP’s code view for them to consider applying and learning from.
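To see what that lookup involves, here is a minimal sketch that asks Git who last touched the lines a violation covers. It uses `git blame --line-porcelain`, which repeats the author details for every line; the function names are hypothetical, and this is an illustration of the mechanism rather than a description of how your CNAPP does it.

```python
import subprocess
from collections import Counter


def blame_porcelain(path: str, start: int, end: int) -> str:
    """Return git blame output for lines start..end of path, one full record per line."""
    return subprocess.run(
        ["git", "blame", "--line-porcelain", "-L", f"{start},{end}", "--", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout


def authors_by_line_count(porcelain: str) -> Counter:
    """Count how many of the blamed lines each author last changed."""
    counts: Counter = Counter()
    for line in porcelain.splitlines():
        # "author-mail", "author-time", and so on use a hyphen, so they are not
        # matched; file content lines are tab-prefixed, so they are not matched either.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts
```

For a violation reported on lines 10 to 14 of `main.tf`, `authors_by_line_count(blame_porcelain("main.tf", 10, 14)).most_common(1)` would point you at the person best placed to review the suggested fix.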

Automatically Updating Your Security Posture

As you detect and fix violations in your code, in your change packages or PRs, and in your CI/CD builds, your CNAPP is automatically updating its current picture of your overall security posture. This is the power of that two-way interface between development and security teams—the conduit that is your CNAPP, as shown in Figure 4-16.

Figure 4-16. Your CNAPP enables the effective shifting left of security awareness into development workflows using integrations with the IDE, command line, VCS, and CI/CD pipelines

In this chapter, you’ve seen how your CNAPP can become a security collaborator for your developers so that resource violations can be surfaced as early as possible, and code can be made secure from the moment it is created. Those integration points and feedback loops shift shared security awareness left into the hands of your development teams as they write their own custom code. But what about the code they don’t write? What about the code they use? What about all those third-party dependencies that every cloud native application and stack relies upon? Everything from operating systems in containers upwards is still offering a route for attack as they are brought into play as your software is built. It’s time to shift a little right, away from where your developers are working, and look at what your CI/CD pipelines are constructing.

It’s time to look at the intermediate packages, third-party containers, virtual machines, libraries, and frameworks that are packaged into your built artifacts prior to delivery into production. It’s time to shift right just a notch. It’s time to secure your cloud native supply chain.

1 There is a lot more to Domain-Driven Design than just ubiquitous languages and bounded contexts but, for our purposes here, those are the concepts we need to understand to grasp the complications of security and development teams working closely together.

2 That’s a real example of the kind of convolution and confusing translation that can occur when a team ignores the domain language they are building for when naming concepts during the technical implementation of a system.

3 Tim Cochran does a deep dive through what a really effective day in a development environment looks like in his article, “Maximizing Developer Effectiveness”.

4 Brilliantly explored and explained in Accelerate: Building and Scaling High Performing Technology Organizations by Nicole Forsgren, Jez Humble, and Gene Kim (IT Revolution Press, 2018).

5 Team Topologies: Organizing Business and Technology Teams for Fast Flow by Matthew Skelton and Manuel Pais (IT Revolution Press, 2019).

6 Git is a very popular distributed version control system, and GitHub is a popular set of centralized services built on top of Git. We will be using both of these systems for the examples of bringing security awareness to your development work in this chapter. Git is a powerful and complex tool that is at the heart of a developer’s work, and to learn more about it, check out Head First Git: A Learner’s Guide to Understanding Git from the Inside Out by Raju Gandhi (O’Reilly, 2022).

7 Checkov is an open source tool for immediately surfacing misconfigurations and, when attached to your CNAPP, other violations in your local code.

8 Visual Studio Code from Microsoft

9 More on that in Chapter 5, where you explore integrating your build and supply chain with your CNAPP.
