Chapter 1. Introduction

Join us as we explore the many perilous paths through a pod and into Kubernetes. See the system from an adversary’s perspective: get to know the multitudinous defensive approaches and their weaknesses, and revisit historical attacks on cloud native systems through the piratical lens of your nemesis: Dread Pirate Captain Hashjack.

Kubernetes has grown rapidly, and has historically not been considered to be “secure by default.” This is mainly due to security controls such as network and pod security policies not being enabled by default on vanilla clusters.

Note

As authors we are infinitely grateful that our arc saw the cloud native enlightenment, and we extend our heartfelt thanks to the volunteers, core contributors, and Cloud Native Computing Foundation (CNCF) members involved in the vision and delivery of Kubernetes. Documentation and bug fixes don’t write themselves, and the incredible selfless contributions that drive open source communities have never been more freely given or more gratefully received.

Security controls are generally more difficult to get right than the complex orchestration and distributed system functionality that Kubernetes is known for. To the security teams especially, we thank you for your hard work! This book is a reflection on the pioneering voyage of the good ship Kubernetes, out on the choppy and dangerous free seas of the internet.

Setting the Scene

For the purposes of imaginative immersion: you have just become the chief information security officer (CISO) of the start-up freight company Boats, Cranes & Trains Logistics, herein referred to as BCTL, which has just completed its Kubernetes migration.

captain

The company has been hacked before and is “taking security seriously.” You have the authority to do what needs to be done to keep the company afloat, figuratively and literally.

Welcome to the job! It’s your first day, and you have been alerted to a credible threat against your cloud systems. Container-hungry pirate and generally bad egg Captain Hashjack and their clandestine hacker crew are lining up for a raid on BCTL’s Kubernetes clusters.

If they gain access, they’ll mine Bitcoin or cryptolock any valuable data they can find. You have not yet threat modeled your clusters and applications, or hardened them against this kind of adversary, and so we will guide you on your journey to defend them from the salty Captain’s voyage to encode, exfiltrate, or plunder whatever valuables they can find.

The BCTL cluster is a vanilla Kubernetes installation using kubeadm on a public cloud provider. Initially, all settings are at the defaults.

Tip

Historical examples of marine control system instability can be seen in the film Hackers (1995), where Ellingson Mineral Company’s oil tankers fall victim to an internal attack by the company’s CISO, Eugene “The Plague” Belford.

To demonstrate hardening a cluster, we’ll use an example insecure system. It’s managed by the BCTL site reliability engineering (SRE) team, which means the team is responsible for securing the Kubernetes master nodes. This increases the potential attack surface of the cluster: a managed service hosts the control plane (master nodes and etcd) separately, and their hardened configuration prevents some attacks (like a direct etcd compromise), but both approaches depend on the secure configuration of the cluster to protect your workloads.

Let’s talk about your cluster. The nodes run in a private network segment, so public (internet) traffic cannot reach them directly. Public traffic to your cluster is proxied through an internet-facing load balancer: this means that the ports on your nodes are not directly accessible to the world unless targeted by the load balancer.

Running on the cluster there is a SQL datastore, as well as a frontend, API, and batch processor.

The hosted application—a booking service for your company’s clients—is deployed in a single namespace using GitOps, but without a network policy or pod security policy as discussed in Chapter 8.

Note

GitOps is declarative configuration deployment for applications: think of it like traditional configuration management for Kubernetes clusters. You can read more at gitops.tech and learn more on how to harden Git for GitOps in this whitepaper.

Figure 1-1 shows a network diagram of the system.

System architecture diagram
Figure 1-1. The system architecture of your new company, BCTL

The cluster’s RBAC was configured by engineers who have since moved on. The inherited security support services have intrusion detection and hardening, but the team has been disabling them from time to time as they were “making too much noise.” We will discuss this configuration in depth as we press on with the voyage. But first, let’s explore how to predict security threats to your clusters.

Starting to Threat Model

Understanding how a system is attacked is fundamental to defending it. A threat model gives you a more complete understanding of a complex system, and provides a framework for rationalising security and risk. Threat actors categorize the potential adversaries that a system is configured to defend against.

Note

A threat model is like a fingerprint: every one is different. A threat model is based upon the impact of a system’s compromise: a Raspberry Pi hobby cluster and your bank’s clusters hold different data, have different potential attackers, and very different potential problems if broken into.

Threat modeling can reveal insights into your security program and configuration, but it doesn’t solve everything—see Mark Manning’s comments on CVEs in Figure 1-2. You should make sure you are following basic security hygiene (like patching and testing) before considering the more advanced and technical attacks that a threat model may reveal. The same is true for any security advice.

Mark Manning on CVEs
Figure 1-2. Mark Manning’s insight on vulnerability assessment and CVEs

If your systems can be compromised by published CVEs and a copy of Kali Linux, a threat model will not help you!

Threat Actors

Your threat actors are either casual or motivated. Casual adversaries include:

  • Vandals (the graffiti kids of the internet generation)

  • Accidental trespassers looking for treasure (which is usually your data)

  • Drive-by “script kiddies,” who will run any code they find on the internet if it claims to help them hack

Casual attackers shouldn’t be a concern to most systems that are patched and well configured.

Motivated individuals are the ones you should worry about. They include insiders like trusted employees, organized crime syndicates operating out of less-well-policed states, and state-sponsored actors, who may overlap with organized crime or sponsor it directly. “Internet crimes” are not well-covered by international laws and can be hard to police.

Table 1-1 can be used as a guide threat modeling.

Table 1-1. Taxonomy of threat actors
Actor Motivation Capability Sample attacks

Vandal: script kiddie, trespasser

Curiosity, personal fame.

Fame from bringing down service or compromising confidential dataset of a high-profile company.

Uses publicly available tools and applications (Nmap, Metasploit, CVE PoCs). Some experimentation. Attacks are poorly concealed. Low level of targeting.

Small-scale DOS.

Plants trojans.

Launches prepackaged exploits for access, crypto mining.

Motivated individual: political activist, thief, terrorist

Personal, political, or ideological gain.

Personal gain to be had from exfiltrating and selling large amounts of personal data for fraud, perhaps achieved through manipulating code in version control or artifact storage, or exploiting vulnerable applications from knowledge gained in ticketing and wiki systems, OSINT, or other parts of the system.

Personal kudos from DDOS of large public-facing web service.

Defacement of the public-facing services through manipulation of code in version control or public servers can spread political messages amongst a large audience.

May combine publicly available exploits in a targeted fashion. Modify open source supply chains. Concealing attacks of minimal concern.

Phishing.

DDOS.

Exploit known vulnerabilities to obtain sensitive data from systems for profit and intelligence or to deface websites.

Compromise open source projects to embed code to exfiltrate environment variables and Secrets when code is run by users. Exported values are used to gain system access and perform crypto mining.

Insider: employee, external contractor, temporary worker

Discontent, profit.

Personal gain to be had from exfiltrating and selling large amounts of personal data for fraud, or making small alterations to the integrity of data in order to bypass authentication for fraud.

Encrypt data volumes for ransom.

Detailed knowledge of the system, understands how to exploit it, conceals actions.

Uses privileges to exfiltrate data (to sell on).

Misconfiguration/”codebombs” to take service down as retribution.

Organized crime: syndicates, state-affiliated groups

Ransom, mass extraction of PII/credentials/PCI data.

Manipulation of transactions for financial gain.

High level of motivation to access datasets or modify applications to facilitate large-scale fraud.

Crypto-ransomware, e.g., encrypt data volumes and demand cash.

Ability to devote considerable resources, hire “authors” to write tools and exploits required for their means. Some ability to bribe/coerce/intimidate individuals. Level of targeting varies. Conceals until goals are met.

Social engineering/phishing.

Ransomware (becoming more targeted).

Cryptojacking.

RATs (in decline).

Coordinated attacks using multiple exploits, possibly using a single zero-day or assisted by a rogue individual to pivot through infrastructure (e.g., Carbanak).

Cloud service insider: employee, external contractor, temporary worker

Personal gain, curiosity.

Unknown level of motivation, access to data should be restricted by cloud provider’s segregation of duties and technical controls.

Depends on segregation of duties and technical controls within cloud provider.

Access to or manipulation of datastores.

Foreign Intelligence Services (FIS): nation states

Intelligence gathering, disrupt critical national infrastructure, unknown.

May steal intellectual property, access sensitive systems, mine personal data en masse, or track down specific individuals through location data held by the system.

Disrupt or modify hardware/software supply chains. Ability to infiltrate organizations/suppliers, call upon research programs, develop multiple zero-days. Highly targeted. High levels of concealment.

Stuxnet (multiple zero-days, infiltration of 3 organizations including 2 PKI infrastructures with offline root CAs).

SUNBURST (targeted supply chain attack, infiltration of hundreds of organizations).

Note

Threat actors can be a hybrid of different categories. Eugene Belford, for example, was an insider who used advanced organized crime methods.

Captain Hashjack is a motivated criminal adversary with extortion or robbery in mind. We don’t approve of their tactics—they don’t play fair, and they are a cad and a bounder—so we shall do our utmost to thwart their unwelcome interventions.

The pirate crew has been scouting for any advantageous information they can find online, and have already performed reconnaissance against BCTL. Using open source intelligence (OSINT) techniques like searching job postings and LinkedIn skills of current staff, they have identified technologies in use at the organization. They know you use Kubernetes, and they can guess which version you started on.

Your First Threat Model

To threat model a Kubernetes cluster, you start with an architecture view of the system as shown in Figure 1-3. Gather as much information as possible to keep everybody aligned, but there’s a balance: ensure you don’t overwhelm people with too much information.

Example Kubernetes attack vectors
Figure 1-3. Example Kubernetes attack vectors (Aqua)
Tip

You can learn more about threat modeling Kubernetes with ControlPlane’s O’Reilly course: Kubernetes Threat Modeling.

This initial diagram might show the entire system, or you may choose to scope only one small or important area such as a particular pod, nodes, or the control plane.

A threat model’s “scope” is its target: the parts of the system we’re currently most interested in.

Next, you zoom in on your scoped area. Model the data flows and trust boundaries between components in a data flow diagram like Figure 1-3. When deciding on trust boundaries, think about how Captain Hashjack might try to attack components.

An exhaustive list of possibilities is better than a partial list of feasibilities.

Adam Shostack, Threat Modeling

Now that you know who you are defending against, you can enumerate some high-level threats against the system and start to check if your security configuration is suitable to defend against them.

To generate possible threats you must internalize the attacker mindset: emulate their instincts and preempt their tactics. The humble data flow diagram in Figure 1-4 is the defensive map of your silicon fortress, and it must be able to withstand Hashjack and their murky ilk.

Kubernetes data flow diagram
Figure 1-4. Kubernetes data flow diagram (GitHub)
Tip

Threat modeling should be performed with as many stakeholders as possible (development, operations, QA, product, business stakeholders, security) to ensure diversity of thought.

You should try to build the first version of a threat model without outside influence to allow fluid discussion and organic idea generation. Then you can pull in external sources to cross-check the group’s thinking.

Now that you have all the information you can gather on your system, you brainstorm. Think of simplicity, deviousness, and cunning. Any conceivable attack is in scope, and you will judge the likelihood of the attack separately. Some people like to use scores and weighted numbers for this, others prefer to rationalize the attack paths instead.

Capture your thoughts in a spreadsheet, mindmap, a list, or however makes sense to you. There are no rules, only trying, learning, and iterating on your own version of the process. Try to categorize threats, and make sure you can review your captured data easily. Once you’ve done the first pass, consider what you’ve missed and have a quick second pass.

Then you’ve generated your initial threats—good job! Now it’s time to plot them on a graph so they’re easier to understand. This is the job of an attack tree: the pirate’s treasure map.

Attack Trees

An attack tree shows potential infiltration vectors. Figure 1-5 models how to take down the Kubernetes control plane.

Attack trees can be complex and span multiple pages, so you can start small like this branch of reduced scope.

This attack tree focuses on denial of service (DoS), which prevents (“denies”) access to the system (“service”). The attacker’s goal is at the top of the diagram, and the routes available to them start at the root (bottom) of the tree. The key on the left shows the shapes required for logical “OR” and “AND” nodes to be fulfilled, which build up to the top of the tree: the negative outcome. Confusingly, attack trees can be bottom-up or top-down: in this book we exclusively use bottom-up. We walk through attack trees later in this chapter.

Kubernetes attack tree
Figure 1-5. Kubernetes attack tree (GitHub)
Tip

Kelly Shortridge’s in-browser security decision tree tool Deciduous can be used to generate these attack trees as code.

As we progress through the book, we’ll use these techniques to identify high-risk areas of Kubernetes and consider the impact of successful attacks.

Tip

A YAML deserialization Billion laughs attack in CVE-2019-11253 affected Kubernetes to v1.16.1 by attacking the API server. It’s not covered on this attack tree as it’s patched, but adding historical attacks to your attack trees is a useful way to acknowledge their threat if you think there’s a high chance they’ll reoccur in your system.

Example Attack Trees

It’s also useful to draw attack trees to conceptualize how the system may be attacked and make the controls easier to reason about. Fortunately, our initial threat model contains some useful examples.

These diagrams use a simple legend, described in Figure 1-6.

Attack Tree Legend
Figure 1-6. Attack tree legend

The “Goal” is an attacker’s objective, and what we are building the attack tree to understand how to prevent.

The logical “AND” and “OR” gates define which of the child nodes need completing to progress through them.

In Figure 1-7 you see an attack tree starting with a threat actor’s remote code execution in a container.

Attack Tree: Compromised Container
Figure 1-7. Attack tree: compromised container

You now know what you want to protect against and have some simple attack trees, so you can quantify the controls you want to use.

Prior Art

At this point, your team has generated a list of threats. We can now cross-reference them against some commonly used threat modeling techniques and attack data:

This is also a good time to draw on preexisting, generalized threat models that may exist:

No threat model is ever complete. It is a point-in-time best effort from your stakeholders and should be regularly revised and updated, as the architecture, software, and external threats will continually change.

Software is never finished. You can’t just stop working on it. It is part of an ecosystem that is moving.

Moxie Marlinspike

Conclusion

Now you are equipped with the basics: you know your adversary, Captain Hashjack, and their capabilities. You understand what a threat model is, why it’s essential, and how to get to the point where you have a 360° view on your system. In this chapter we further discussed threat actors and attack trees and walked through a concrete example. We have a model in mind now so we’ll explore each of the main Kubernetes areas of interest. Let’s jump into the deep end: we start with the pod.

Get Hacking Kubernetes now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.