Chapter 4. Security and Requirements
The best place to start introducing security into the systems development process is the requirements gathering stage. We’ve been referring to software development so far, but it’s really systems development: when it comes to web applications, or even backends to mobile applications, we aren’t talking about a single software package any longer. We are talking about multiple components that are installed either on virtual machines or in containers. This effectively makes it systems development, even if the purpose of the full system is to deploy and provide access to applications.
When approaching systems development security, it’s really easy to panic and be afraid of everything. Trying to address every problem that may potentially arise is not productive, particularly when a situation is very unlikely to happen. The best approach is to follow good practices in hardening deployments and secure programming, but also to think rationally about the threats that remain. Even the best hardening and secure programming practices will leave some exposure to attack, simply because there will always be ways for an attacker to get in. The moment there is a program running, that program can be misused. For this reason, some technology providers, such as Microsoft, espouse the principle of “assume breach,” where you operate under the assumption that a breach has already occurred, and your job is to find it and stop it from spreading.
One way to improve the overall quality and security posture of any systems development project is to start with a threat modeling exercise. The purpose of threat modeling is to identify areas that may be misused by an attacker. Once these areas have been identified, you can develop requirements either to remove the potential threat or to mitigate it, meaning you attempt to minimize the impact that could result from the threat being actualized.
This chapter will go deeper into how you can extract security requirements, primarily by looking at threats. Once you have identified threats, you can translate the mitigation of those threats into requirements.
Risk and Threat and Vulnerability
Software development organizations may regularly talk about risk, but often when you hear risk in the context of a software project or a systems project, the risk has to do with the timing of a release, meaning whether the team is going to hit the promised release date. In one sense, this misuses the term. The concept of risk is often poorly understood across the information technology community, which leads to the word being used incorrectly.
Let’s start with a clear definition of risk. Risk is the exposure to loss, based on the likelihood of an event occurring. In other words, when you say there is a risk of rain today, you are really only talking about half of a risk calculation. You are talking about a high likelihood, perhaps. What you are missing is the loss. What loss are you going to incur if it rains today? If you are hosting an outdoor event where people have paid a lot of money for tickets and you have spent a lot of money on preparation, rain may require you to refund the tickets while you are still out the costs of preparation. There is a loss there. When you factor in the high likelihood of rain, meaning a high likelihood of that loss occurring, you have a high risk. You’d need to know how much money you were going to be out (assuming there was no chance of rescheduling the event) to determine the actual risk.
There are two types of risk assessment. The first, and preferable, type (though it’s harder to achieve) is quantitative. This means you assign numeric values to the likelihood (probability) and the loss (typically expressed in a monetary unit like dollars), and you get a number out. Some people find this very hard because they either haven’t looked for or can’t determine the likelihood value, and they may find it difficult to assign an actual monetary value to the loss. The quantitative approach is commonly expressed as multiplying the annualized rate of occurrence (ARO), an expression of probability meaning the number of times a year an event can be expected to occur, by the single loss expectancy (SLE), a monetary value indicating the amount that would be lost if the event occurred. The result is an annualized loss expectancy (ALE). This is a numeric value you can assign to any given event so that you can assess the risk of that event against any other event.
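To make the arithmetic concrete, here is a minimal sketch of the calculation described above: the yearly rate of occurrence multiplied by the single loss expectancy. The `RiskEvent` class, the event names, and every number are hypothetical values chosen for illustration, not figures from a real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEvent:
    """One event in a quantitative risk assessment (illustrative only)."""
    name: str
    annual_rate_of_occurrence: float  # expected occurrences per year
    single_loss_expectancy: float     # loss per occurrence, in dollars

    @property
    def annualized_loss_expectancy(self) -> float:
        # Rate of occurrence per year multiplied by the loss per occurrence.
        return self.annual_rate_of_occurrence * self.single_loss_expectancy


# Hypothetical events and estimates.
events = [
    RiskEvent("database outage", 2.0, 15_000.0),
    RiskEvent("customer data breach", 0.05, 1_000_000.0),
    RiskEvent("defaced marketing page", 0.5, 4_000.0),
]

# Highest annualized loss first, so the events can be compared directly.
ranked = sorted(events, key=lambda e: e.annualized_loss_expectancy, reverse=True)
for event in ranked:
    print(f"{event.name}: ${event.annualized_loss_expectancy:,.2f} per year")
```

With these made-up inputs, the rare but expensive breach ranks above the frequent but cheaper outage. That kind of comparison is only as good as the estimates that go into it.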
Because they find it difficult to assign values to probability and loss, people often follow the second type of risk assessment, a qualitative approach. For example, they may use T-shirt sizing—small, medium, large, and extra large—to scope both the probability and the impact. You then take the T-shirt size for both of those factors and end up with a new T-shirt size—still small, medium, large, or extra large—that you assign to an event. You can then use these T-shirt sizes to compare the risk of one event to another event.
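A qualitative assessment can be sketched in a few lines as well. The text does not say how the two T-shirt sizes are combined, so the rule below (average the two sizes and round up) is purely an assumption for illustration; an organization would normally agree on its own lookup matrix. The `Size` enum and the `combine` function are hypothetical names.

```python
from enum import IntEnum


class Size(IntEnum):
    SMALL = 1
    MEDIUM = 2
    LARGE = 3
    EXTRA_LARGE = 4


def combine(probability: Size, impact: Size) -> Size:
    """Combine two sizes into one risk size.

    Assumed rule (not from the text): average the two sizes, rounding up.
    """
    return Size((probability + impact + 1) // 2)


# A very unlikely event with a very large impact.
print(combine(Size.SMALL, Size.EXTRA_LARGE).name)
# A moderately likely event with a moderate impact.
print(combine(Size.MEDIUM, Size.MEDIUM).name)
```

The choice of rule matters. Taking the larger of the two sizes, for example, would rate every high-impact event as high risk no matter how unlikely it is.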
The reason for comparing the risk of one event to that of another is that resources in any situation are limited; you can’t address or remediate every risk. This means you should prioritize high-risk events. It also means you need to take a rational approach to evaluating risk. Humans have a tendency to catastrophize. They think an event with a high impact (loss potential) is a high risk, even if the probability is extremely low. Think about flying in an airplane, for instance. Some people will say it’s a high-risk event because they can imagine the traumatic experience of the plane coming apart in the air, an event that results in death, and perhaps a very painful and scary death. The reality is that the probability of that event happening is extremely low, but people can’t get the scariness out of their heads so it “feels” like a high-risk activity.
This brings us to understanding threats. Risk assessments can be tedious activities if you are trying to evaluate the risk of every potential event, even those that are highly unlikely. As mentioned earlier, businesses have finite resources, and it is more rational to focus on events that are more likely to happen. If you think about a bell curve, such as the one in Figure 4-1, with high probability/low impact events on one tail and high impact/low probability events on the other tail, you want to focus on the events that are going to be in the middle of the bell curve because that’s where you are likely to incur losses.
Finding those events in the middle of the bell curve comes down to identifying threats. Because we are limited by our own imaginations—focusing only on threats we may be immediately aware of since we can’t easily imagine events we don’t have any exposure to—threat modeling may not be an easy activity. This is where it can be helpful to have a framework in which to identify threats. We’ll look at some common threat-modeling frameworks later on, but we should start by clearly defining what a threat is. A threat is a potential negative event that may result from a vulnerability being exploited.
A vulnerability, by extension, is a weakness in a system or piece of software. People can exploit vulnerabilities by triggering them. However, exploiting a vulnerability is not necessarily a malicious action. In fact, it’s probably helpful not to think strictly of actions that are malicious in nature. While malicious actions are bad, you can also have serious problems that result from actions that are simply mistakes. For example, with the right set of factors in place, a mistake can easily lead to a serious and prolonged outage. This, again, is why it can be helpful to have a framework to use for identifying threats.
Threat Modeling
As mentioned previously, there are several threat-modeling frameworks that are used in systems or software development. While these frameworks are not necessarily perfect in determining threats, they can help you to focus on what is most troubling. Of course, once you have identified threats, you still need to know what to do about mitigating or removing them. It takes some practice and knowledge to be able to identify both threats and mitigations to threats. As you will see in the following discussion, some frameworks are going to be better than others depending on how your organization thinks about threats. You can also mix and match the different threat modeling approaches instead of using one exclusively.
It’s important to note here that the goal of a threat model is not to eliminate threats. In any system or software application that interacts with users, especially remote users, it’s not possible to eliminate threats. The goal of developing a threat model is to better understand the interactions within complex systems to find areas where you can limit either the impact or likelihood of a threat manifesting. Ideally, you reduce the overall risk resulting from these threats to a level the business is comfortable with. We’re going to take a look at three commonly used threat-modeling frameworks: STRIDE, DREAD, and PASTA.
Note
One important element of risk that isn’t included in the previous definition is the need to be informed. Businesses regularly make decisions based on risk assessments. It’s not possible to make an informed decision if the risk is not clearly understood or even identified. Pretending a risk doesn’t exist, incorrectly identifying likelihood or impact, or simply not assessing the risk at all means the business has not made an informed decision.
STRIDE
STRIDE, a model introduced by Microsoft in 1999, is a commonly used approach to assessing threats, especially within software development processes. It’s based on a set of threat categories identified by the developers of this methodology. The following categories, which form the STRIDE acronym, help to better identify problem areas within a complex system:
- Spoofing
Spoofing is an attempt by one entity to falsify data in order to pretend to be another entity. This may be one user pretending to be another user, or it may be one system, in the form of an IP address for instance, pretending to be another system. Spoofing can impact confidentiality or integrity in any system. One way to protect against spoofing is strong authentication and data verification.
- Tampering
Integrity is one of the three essential security properties (confidentiality, integrity, and availability). Tampering is when data is altered, meaning it has lost its integrity, since it is not in the same state when it is retrieved as it was the last time it was stored. Tampering attacks can be remediated with strong verification using techniques like message authentication codes.
- Repudiation
Repudiation is any entity being able to say it didn’t perform an action. An example is someone writing a check then later saying that they didn’t write the check, even though their signature appears on the check. As signatures can be falsified, without witnesses it may be impossible to say with certainty who wrote out and signed the check. Any action that can’t be clearly assigned to an entity may violate the concept of non-repudiation.
- Information disclosure
A privacy breach or inadvertent leak of data is an information disclosure violation. The use of encryption can be one way to protect against information disclosure, but it’s not a perfect solution since keys can be stolen and used to decrypt information, resulting in a disclosure. Encryption without appropriate key management is not sufficient to protect against information disclosure.
- Denial of service
A denial of service occurs anytime an application or service is unavailable to a user when the user expects it to be available. The same is true when a user expects to be able to get to data and that data is unavailable. These types of attacks can’t always be protected against since some of them are simply outside the control of the system developer. However, ensuring applications are resistant to crashing is a good start.
- Elevation of privilege
Attackers who manage to get control of a running process will have the level of permission or privilege assigned to the user that owns that process. Commonly, this is a low level of access, which means the attacker is often going to attempt to obtain elevated or escalated privileges so they can do more on the system they have compromised. Any ability to move from a low level of privilege to a higher level of privilege is privilege escalation, also called elevation of privilege. By always using the principle of least privilege, that is, never giving any user or process more permissions than it needs to perform essential tasks, you can help protect against privilege escalations.
Having an understanding of these categories helps to shape your threat-modeling actions, but sometimes you need some additional support. Microsoft offers a Threat Modeling Tool, which allows you to diagram your application, including defining all interactions between your components and how those interactions may be implemented. You can see an example diagram in Figure 4-2, which is a sample that comes with the Threat Modeling Tool. Once you have diagrammed and defined your solution, the tool will automatically generate a report of threats for you.
The tool follows the STRIDE model in developing the list of threats, so you will find threats identified with the categories discussed earlier. Figure 4-2 includes threats like Potential Data Repudiation by OS Process, Potential Process Crash, Weak Access Control for a Resource, and Spoofing the OS Process. One advantage of the Threat Modeling Tool is that it not only identifies potential threats for you, but also provides the means to manage those threats by either redesigning the system and running the tool again or by documenting mitigations that may be implemented to reduce the likelihood or impact of the threat being actualized.
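As one illustration of a mitigation for the tampering category, a message authentication code (MAC) lets you check that stored or transmitted data has not been altered. The following is a minimal sketch using the `hmac` and `hashlib` modules from Python’s standard library. The record contents and the `sign` and `verify` helper names are hypothetical, and generating the key in-process is an assumption made to keep the example self-contained; a real system would load the key from a managed secret store.

```python
import hashlib
import hmac
import secrets

# Assumption for the sketch: a fresh random key. In practice the key would
# come from a secrets manager and never be stored alongside the data.
key = secrets.token_bytes(32)


def sign(message: bytes, key: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message, key), tag)


record = b"account=1234;balance=100.00"
tag = sign(record, key)

print(verify(record, tag, key))                          # True: unmodified
print(verify(b"account=1234;balance=999.00", tag, key))  # False: altered
```

A MAC detects tampering but does not hide the data, so it addresses integrity rather than information disclosure. It also depends on the key staying secret, which brings back the same key management concern raised for encryption.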
DREAD
DREAD was initially proposed as a threat-modeling methodology, but in fact, it probably works better for risk assessment and can be used in conjunction with STRIDE. Once you have identified your threats, you can run each one through the DREAD model to help clearly identify risks that may result from them. One way of implementing this for quantitative assessment is to give a rating of 1 to 10 for each category. You’ll end up with a numeric value to assign to each threat, which can help you better derive risk. One problem with this approach, as is the case with risk assessment in general, is that it is subjective without hard data, such as previous experience. Just as with STRIDE, DREAD is an acronym for the categories laid out in the following:
- Damage
If an event happened, how bad would the damage be?
- Reproducibility
How easy is it for this event to occur, meaning what is the level of effort or level of difficulty involved in making this event occur? Reproducibility may be much higher if a proof of concept or exploit is widely available, as it requires nothing but the ability of the attacker to find the exploit.
- Exploitability
Exploitability may seem the same as reproducibility, but there are subtle differences between whether an attack can be reproduced and how exploitable it is. Let’s say it’s easy to reproduce the attack, but in each run through you get a different result. Not all of the attack attempts end up giving the attacker access to the system. Sometimes, the application under attack just ends up shrugging off the attack. Other times, the attacker gets control of the process space. Exploitability may be low in this case while reproducibility is high.
- Affected Users
How many users are going to be impacted? You may also factor in the type of user who is impacted. Let’s say customers can get access to the application but the application can’t be managed by the operations team, for instance. You may want to factor in the level of the user and rank users by how important it is for them to get access.
- Discoverability
How easy is it to discover that the exploit is possible? As before, this may be a function of whether details about the vulnerability and exploit are available publicly.
As you can see, there can be a lot of subjectivity in each of these categories. Microsoft used this model for a period of time but it is no longer in use there. Some of the categories here are similar to those used by the Common Vulnerability Scoring System (CVSS), which uses a set of factors to generate a severity score for a known vulnerability. This can help you make decisions about whether to remediate the vulnerability quickly or whether the remediation can wait. DREAD can be used in the same way: it’s another data point that can be used to determine what to do about a threat that has been identified. You can use the same DREAD model for vulnerability assessment once a vulnerability has been identified.
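The 1 to 10 rating approach mentioned earlier can be sketched as follows. The text does not specify how the five ratings are combined into one number, so averaging them is an assumption here; summing them is another option that would give the same ordering. The `DreadRating` class is a hypothetical name, the two threat names are borrowed from the Threat Modeling Tool examples above, and the ratings themselves are invented for illustration.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class DreadRating:
    """One rating from 1 to 10 per DREAD category (illustrative only)."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def __post_init__(self) -> None:
        for value in astuple(self):
            if not 1 <= value <= 10:
                raise ValueError("each rating must be between 1 and 10")

    def score(self) -> float:
        # Assumed combination rule: the mean of the five ratings.
        return sum(astuple(self)) / 5


# Invented ratings for two threats named earlier in the chapter.
threats = {
    "Spoofing the OS Process": DreadRating(8, 6, 5, 7, 4),
    "Potential Process Crash": DreadRating(5, 9, 8, 6, 7),
}

# Highest score first.
for name, rating in sorted(threats.items(), key=lambda t: t[1].score(), reverse=True):
    print(f"{name}: {rating.score():.1f}")
```

Because each rating is a judgment call, the resulting score is best treated as a way to compare threats rated by the same people under the same assumptions, not as an absolute measurement.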
PASTA
PASTA is the Process for Attack Simulation and Threat Analysis, and rather than being a threat model in the way STRIDE is, it is a process that can be used to identify threats and mitigations for them. PASTA is a seven-step process:
- 1. Define objectives
As always, it’s better to clearly define the problem before looking for a solution. Rather than trying to tackle everything at once, this step clearly defines what is in scope for this assessment. You may choose to look only at critical assets or critical data sources, for instance. You may also define the tools and testing methods in this step.
- 2. Define technical scope
Complex systems have a lot of dependencies, so it may be essential to clearly define the technical scope to limit the inquiry. You don’t want to be digging into libraries, for instance, that are out of your control. You may want to limit yourself to only one part of the system rather than the entire system.
- 3. Decomposition and analysis of application
This is where you start drilling into the way the application or system is composed. This may be similar to what was done earlier for the Microsoft Threat Modeling Tool. You define the individual components or elements and also identify trust boundaries. A trust boundary is where data may move from outside the application to inside the application, for instance. This means you are moving from an untrustworthy zone (where the user lives) to a trustworthy zone (where the application controls the data and developers may assume the data has been sanitized, which is not a good assumption as a general rule, but an example of why you might say one zone is trustworthy).
- 4. Threat analysis
Based on intelligence sources, assess the known threats. For example, you may use components that have known vulnerabilities and there may be exploits for those vulnerabilities, or you may be exposed to common known vulnerabilities because of development practices used.
- 5. Attack/exploit enumeration and modeling
In this stage, you use an attack tree to model what an attack might look like and how it may operate. An attack tree is a way of diagramming a process with decision points or options along the way. You may make different decisions about how you handle an attack based on the path through the model. An example of an attack tree is shown in Figure 4-3.
- 6. Analyze modeling and simulation
Once you have created your attack trees, you can start to understand the potential for damage from the attack. This is done by running simulations of the attack and determining its likelihood.
- 7. Risk and impact assessment
Once you have identified the likelihood, you should also assess the impact based on a successful execution of the attack, and then determine the risk. Based on all the data that has been collected, you should be determining what controls you can put in place to mitigate the risk from the potential attack.
You’ll see that PASTA is a very detailed approach to threat assessment and there is nothing here that would necessarily prevent you from folding in other approaches. You could use STRIDE as you are looking for attacks and exploits by looking for places where information disclosure is possible, for instance. You might also use DREAD as you are doing the risk and impact assessment to give you a broader view of the aspects of the attack that may impact the system.
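The attack tree used in steps 5 and 6 can also be represented in code. The following is a minimal sketch, not part of PASTA itself: it models a tree of AND nodes (every child step must succeed) and OR nodes (any one child path is enough) and estimates the likelihood of the root goal. The tree is hypothetical and is not the one shown in Figure 4-3, the leaf probabilities are invented, and the calculation assumes the leaf events are independent, which is a simplification.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    """A node in an attack tree (illustrative only)."""
    name: str
    kind: str = "leaf"          # "leaf", "and", or "or"
    probability: float = 0.0    # used only by leaf nodes
    children: list["Node"] = field(default_factory=list)

    def likelihood(self) -> float:
        if self.kind == "leaf":
            return self.probability
        values = [child.likelihood() for child in self.children]
        if self.kind == "and":
            # Every child step must succeed.
            return prod(values)
        if self.kind == "or":
            # At least one child path succeeds.
            return 1 - prod(1 - v for v in values)
        raise ValueError(f"unknown node kind: {self.kind}")


# Hypothetical tree: two ways to reach the same goal.
root = Node(
    name="read customer records",
    kind="or",
    children=[
        Node(
            name="take over an admin account",
            kind="and",
            children=[
                Node(name="phish admin credentials", probability=0.3),
                Node(name="bypass multifactor authentication", probability=0.1),
            ],
        ),
        Node(name="exploit SQL injection", probability=0.05),
    ],
)

print(f"{root.name}: {root.likelihood():.4f}")
```

Walking the tree this way also shows where a control would help most. In this made-up example, removing the injection path lowers the estimate more than strengthening either step of the account takeover path.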
Summary
You always need to have a starting point when you are developing something. You need to define the problem before you start working on the solution. If you don’t, how do you know if your solution fits the problem? Without a clear definition, you have a solution in search of a problem, which is not a great way to try to sell or market anything. The same is true when it comes to addressing security for any system or application. You need to know what it is you are protecting against, since you can’t protect against everything. A good place to start is by identifying threats, which will help you better identify potential mitigations to address those threats. Once you know what threats you face, you can start to generate requirements based on the threats identified.
The problem then becomes how to identify threats. There are some methodologies that can be used, including the STRIDE methodology, which identifies six categories of threats: spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege. As you are assessing your system or software, you should be looking for places where your application may introduce the potential for an incident in one of these categories.
Another framework to help better understand the impact from a threat that has been identified is DREAD. Using DREAD, you look at damage, reproducibility, exploitability, affected users, and discoverability. Assessing the questions associated with these factors will help you get a better understanding of the overall risk associated with a threat because they will give you a deeper insight into the probability and loss that might result from a threat being actualized.
While STRIDE provides a set of categories, PASTA offers a process. PASTA is the Process for Attack Simulation and Threat Analysis. It is a highly structured approach to identifying threats and their impact and likelihood. To implement PASTA, you need to be able to deconstruct the application, identifying trust zones and how data passes through the different components. Once you have run through the PASTA process, you can still identify categories using STRIDE and help understand the risk using DREAD.