Chapter 1. Modeling Systems

All models are wrong, but some are useful.

G. E. P. Box, “Science and Statistics,” Journal of the American Statistical Association, 71 (356), 791–799, doi:10.1080/01621459.1976.10480949.

System modeling (creating abstractions or representations of a system) is an important first step in the threat modeling process. The information you gather from the system model provides the input for analysis during the threat modeling activity.

In this chapter we’ll cover different types of system models, the reasons why you might choose to use one model type over another, and guidance for creating the most effective system models. Proficiency in system model construction will inform your threat models and lead to more precise and effective analysis and threat identification.

Note

Throughout this chapter, we use the words model or modeling to mean an abstraction or representation of a system, its components, and interactions.

Why We Create System Models

Imagine, if you will, a group of Benedictine monks looking at the monastic church of St. Gall and then at a manuscript, back and forth. At some point, one turns to the others and says, “Well, listen, it was not a plan per se. It was more like a ‘two-dimensional meditation on the ideal early medieval monastic community.’”1 Such is the purpose associated with the Plan of St. Gall, currently recognized as the oldest preserved 2D visualization and plan of a building complex. The church looks very different from the plan.

Humans create models to plan ahead or to decide what resources might be needed, what frameworks need to be put in place, what hills need moving, what valleys need filling, and how independent pieces will interact once put together. Humans create models because it is easier to visualize changes on a schematic, smaller scale than to embark on construction right away. It is easier and cheaper to make changes to that schematic, and change the ways these pieces interact, than to move walls, frames, screws, engines, floors, wings, firewalls, servers, functions, or lines of code, after the fact.

We also recognize that while the model and the final outcome may differ, having a model will always help you understand the nuances and details relevant to the process of making and building. For security purposes, we model software and hardware systems, in particular, because it enables us to subject the systems to theoretical stress, understand how that stress will impact the systems before they are implemented, and see the systems holistically so we can focus on vulnerability details as needed.

In the rest of this chapter, we’ll show you the various visual forms your threat model can take and explain how to gather requisite information to support system analysis. The specific actions you take after you have constructed your model will depend on the methodology you choose to follow; we’ll get to the methodologies in the next couple of chapters.

System Modeling Types

As you know, systems can be complex, with many moving parts and interactions occurring among components. Humans are not born with extensive security knowledge (although we know a few who may well have been), and most system designers and developers are not intimately familiar with how functionality can be abused or misused. So those who want to make sure their system analysis is both practical and effective need to reduce the complexity and amount of data to consider for analysis and maintain the right amount of information.

This is where system modeling, or an abstraction of a system describing its salient parts and attributes, comes to the rescue. Having a good abstraction of the system you want to analyze will give you enough of the right information to make informed security and design decisions.

Models have been used to express ideas or deliver knowledge to others for centuries. The tomb builders of ancient China would create models of buildings,2 and architects since the Ancient Egyptians have routinely created scale models to demonstrate the feasibility and intentions of a design.3

Creating a system model—an abstraction or representation of a system to be analyzed for threats—can make use of one or more model types:4

Data flow diagrams

Data flow diagrams (DFDs) describe the flow of data among components in a system and the properties of each component and flow. DFDs are the most commonly used form of system model in threat modeling and are supported natively by many drawing packages; shapes in DFDs are also easy for humans to draw by hand.

Sequence diagrams

These are interaction diagrams in Unified Modeling Language (UML) that describe the interactions of system components in an ordered fashion. Sequence diagrams can help identify threats against a system because they allow a designer to understand the state of the system over time. This allows you to see how the system’s properties, and any assumptions or expectations about them, change over the course of its operation.

Process flow diagrams

Process flow diagrams (PFDs) highlight the operational flow through actions among components in a system.

Attack trees

Attack trees depict the steps along a path that an attacker might try as part of reaching their goal to perform actions with nefarious intent.

Fishbone diagrams

Also known as cause-and-effect or Ishikawa diagrams, these show the relationships between an outcome and the root cause(s) that enabled such an effect to occur.

Separately or together, you can use these system-modeling techniques to effectively see the changes in security posture that make an attacker’s job easier. This is important to help designers recognize and eliminate potential issues by changing their designs or assumptions of the system. Use different model types for the purposes for which they are best suited. For example, use DFDs to describe relationships between objects, and use sequence diagrams to describe ordering of operations. We’ll explore each of these in some detail, so you can understand the benefits of each.
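As a small illustration of one of these model types, an attack tree can be held as a nested data structure. The sketch below is a minimal one under stated assumptions: the `Node` class, the `attack_paths` function, and the example goals are all hypothetical, and the AND/OR node convention is a common practice for attack trees rather than something defined in this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One goal or step in a hypothetical attack tree."""
    goal: str
    kind: str = "OR"   # "OR": any child suffices; "AND": every child is required
    children: list["Node"] = field(default_factory=list)


def attack_paths(node: Node) -> list[list[str]]:
    """Enumerate the sets of leaf steps that would achieve the node's goal."""
    if not node.children:
        return [[node.goal]]
    child_paths = [attack_paths(child) for child in node.children]
    if node.kind == "OR":
        # Any one child's path is enough on its own.
        return [path for paths in child_paths for path in paths]
    # AND: combine one path from each child.
    combined: list[list[str]] = [[]]
    for paths in child_paths:
        combined = [acc + path for acc in combined for path in paths]
    return combined


tree = Node("read customer data", "OR", [
    Node("steal an unencrypted backup"),
    Node("log in as administrator", "AND", [
        Node("obtain the administrator password"),
        Node("bypass multifactor authentication"),
    ]),
])

for path in attack_paths(tree):
    print(" + ".join(path))
```

Each printed line is one candidate path an attacker might follow; the second path requires both steps to succeed.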

Data Flow Diagrams

Experts have identified DFDs as a useful way to visually describe a system when modeling it for security analysis. DFDs were developed with a symbology that expresses the complexity of systems.

Using models to understand the components of a system and how they relate to each other came about in the 1950s with the functional flow block diagram. Later, in the 1970s, the structured analysis and design technique introduced the concept of a DFD.5 DFDs have become a standard way to describe a system when performing threat analysis.

While there is no formal standard that defines the shapes used when modeling the data flow of a system, many drawing packages use a convention to associate shapes and their meaning and use.

When constructing DFDs, we find it useful to highlight particular architectural elements alongside the data flows. This additional information can be helpful when trying to make accurate decisions while analyzing the model for security concerns, or while using the model to educate people new to the project. We include three nonstandard extension shapes for your consideration; they function as shortcuts that can make your models easier to create and understand.

An element (shown in Figure 1-1) is a standard shape that represents a process or operating unit within the system under consideration. You should always label your element, so it can be referred to easily in the future. Elements are the source and/or target for data flows (described later) to and from other units in the model. To identify human actors, use the actor symbol (refer to Figure 1-4 for a sample).

Figure 1-1. Element symbols for drawing data flow diagrams

You should also annotate each object with a description of its basic properties and metadata. You can put the annotation on the diagram itself, or in a separate document and then use the label to associate the annotation to the object.

The following is a list of potential information that you might want to capture in annotations for objects in the model:

Note

This list of potential metadata to obtain regarding an element, as annotations to the model, is not comprehensive. The information you need to know about the elements in your system depends on the methodology you eventually decide to use (see Chapters 3 through 5) as well as the threats you are trying to identify. This list presents a few of the options you may encounter.

  • Name of the unit. If it is an executable, what is it called when built or installed on a disk?

  • Who owns it within your organization (the development team, usually)?

  • If this is a process, at what privilege level is it running (e.g., always root, or setuid’d, or some nonprivileged user)?

  • If this is a binary object, is it expected to be digitally signed, and if so, by what method and/or which certificate or key?

  • What programming language(s) are used for the element?

  • For managed or interpreted code, what runtime or bytecode processor is in use?

Note

People often overlook the influence of their choice of programming language. For example, C and C++ are more prone to memory-based errors than an interpreted language, and a script will more easily lend itself to reverse engineering than a (possibly obfuscated) binary. These are important characteristics that you should understand during the design of the system, especially if you identify them while you are threat modeling, to avoid common and serious security concerns. If you don’t know this information early enough in the system development process to include it in the threat model (because, as you may know by now, threat modeling should be done early and often), this is a perfect example of why threat models should be kept up to date as the system evolves.6

You may also want to consider some additional metadata, which provides context and opportunities for deeper assessment, as well as for discussion between development teams and system stakeholders:

  • Is the unit production ready or a development unit or does the element only exist part-time? For example, does the unit exist only in production systems but not in development mode? This may mean that the process represented by the element may not execute or be initialized in certain environments. Or it may not be present, for example, because it is compiled out when certain compile flags are set. A good example of this is a test module or a component that only applies in a staging environment to facilitate testing. Calling it out in the threat model would be important. If the module operates through particular interfaces or APIs that are open in staging to facilitate testing, but remain open in production even though the test module has been removed, then this is a weakness that needs to be addressed.

  • Does information on its expected execution flow exist, and can it be described by a state machine or sequence diagram? Sequence diagrams can aid in identifying weaknesses, as we will discuss later in this chapter.

  • Optionally, does it use or have enabled specific flags from compilation, linking, or installation,7 or is it covered by an SELinux policy distinct from the system default? As mentioned earlier, you may not know this when you construct the first threat model, but it provides you with another opportunity to add value by keeping the threat model up to date over the course of the project.
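If you keep annotations in a separate document, a simple structured record per element can make gaps visible. The following is a minimal sketch, assuming hypothetical field names (`unit_name`, `owner`, `privilege_level`, and so on) chosen to mirror the questions above; none of them come from a standard, and the set of fields you need depends on your methodology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ElementAnnotation:
    """Hypothetical metadata record for one DFD element."""
    label: str                              # label used on the diagram
    unit_name: str                          # name when built or installed
    owner: str                              # owning team
    privilege_level: Optional[str] = None   # e.g. "root", "setuid", "nonprivileged"
    signed_with: Optional[str] = None       # signing method or key, if signed
    languages: list[str] = field(default_factory=list)
    runtime: Optional[str] = None           # runtime or bytecode processor, if any
    environments: list[str] = field(default_factory=list)  # e.g. "staging"

    def unknowns(self) -> list[str]:
        """Return the names of fields that have not been filled in yet."""
        return [name for name, value in vars(self).items()
                if value in (None, [], "")]


web = ElementAnnotation(label="E1", unit_name="webd", owner="Platform team",
                        languages=["C++"])
print(web.unknowns())
# ['privilege_level', 'signed_with', 'runtime', 'environments']
```

The list of unknowns is a prompt for follow-up questions when the threat model is revisited, not a finding in itself.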

Use the element symbol to show a self-contained unit of processing, such as an executable or a process (depending on the level of abstraction), where subdividing the element into representative components is unlikely to help people understand how the unit operates and to which threats it may be susceptible. This may take some practice—sometimes you may need to describe the subelements of the processing unit to better understand the interactions it contains. To describe subelements, you use a container symbol.

A container, or containing element, shown in Figure 1-2, is another standard shape that represents a unit within the system under consideration that contains additional elements and flows. This shape is usually used in the context layer (see “DFDs Have Levels”) of a model, highlighting the major units within the system. When you create container elements, you are signaling a need to understand the contained elements, and that the container represents the combined interactions and assumptions of all the included elements. It is also a good way to reduce the busyness of a model when drawn. Containers can be the source and/or target for data that flows to and from other model entities when present in any given level of abstraction.

Figure 1-2. Container symbols for drawing data flow diagrams

As with the element described earlier, you should assign a label to your container objects, and include metadata for the object in its annotations. Metadata should include (at least) any of the metadata items from the element as described earlier, plus a brief summary of what is contained within (i.e., the major subsystems or subprocesses that might be found inside).

Unlike an element, which represents a unit within the system under consideration, an external entity shape, shown in Figure 1-3, represents a process or system that is involved in the operation or function of the system but is not in scope for the analysis. An external entity is a standard shape. At the very least external entities offer a source for data flows coming into the system from a remote process or mechanism. Examples of external entities often include web browsers used to access web servers or similar services, but may include any type of component or processing unit.

Figure 1-3. External entity symbol for drawing data flow diagrams

Actors (see Figure 1-4), which represent primarily human users of the system, are standard shapes that have connections to interfaces offered by the system (directly, or through an intermediate external entity such as a web browser) and are often used at the context layer of the drawing.

Figure 1-4. Actor symbol for drawing data flow diagrams

The data store symbol, shown in Figure 1-5, is a standard shape representing a functional unit that indicates where “bulk” data is held, such as a database (but not always the database server). You could also use the data store symbol to indicate a file or buffer holding small amounts of security-relevant data, such as a file containing the private key to your web server’s TLS certificate,8 or for an object data store such as an Amazon Simple Storage Service (S3) bucket holding your application’s logfile output. Data store symbols can also represent a message bus, or a shared memory region.

Figure 1-5. Data store symbols for drawing data flow diagrams

Data stores should be labeled and have metadata such as the following:

Type of storage

Is this a file, an S3 bucket, a service mesh or a shared memory region?

Type and classification of data held

Is the data that is being sent to or read from this module structured or unstructured? Is it in any particular format; for example, XML or JSON?

Sensitivity or value of data

Is the managed data personal information, security relevant, or otherwise sensitive in nature?

Protections on the data store itself

For example, does the root storage mechanism offer drive-level encryption?

Replication

Is the data replicated on a different data store?

Backup

Is the data copied to another place for safety, but with potentially reduced security and access controls?
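The data store metadata above can be recorded in the same way and used to generate review prompts. This is a minimal sketch under stated assumptions: the `DataStoreAnnotation` fields and the two checks in `review_notes` are hypothetical illustrations, not a complete or authoritative rule set.

```python
from dataclasses import dataclass, field


@dataclass
class DataStoreAnnotation:
    """Hypothetical metadata record for one data store."""
    label: str
    storage_type: str          # e.g. "file", "S3 bucket", "shared memory"
    data_format: str           # e.g. "JSON", "XML", "unstructured"
    sensitive: bool            # personal, security-relevant, or otherwise sensitive
    encrypted_at_rest: bool = False
    replicated_to: list[str] = field(default_factory=list)
    backed_up_to: list[str] = field(default_factory=list)


def review_notes(store: DataStoreAnnotation) -> list[str]:
    """Return questions to raise during analysis (illustrative checks only)."""
    notes = []
    if store.sensitive and not store.encrypted_at_rest:
        notes.append(f"{store.label}: sensitive data without storage-level encryption")
    if store.sensitive:
        for copy in store.replicated_to + store.backed_up_to:
            notes.append(f"{store.label}: confirm access controls on copy at {copy}")
    return notes


logs = DataStoreAnnotation("DS1", "S3 bucket", "JSON", sensitive=True,
                           backed_up_to=["archive bucket"])
for note in review_notes(logs):
    print(note)
```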

Tip

If you are modeling a system that contains a database server (such as MySQL or MongoDB), you have two choices when it comes to rendering it in a model: (a) use the data store to represent both the DBMS process and the data storage location, or (b) use an element for the DBMS and a connected data store for the actual data storage unit.

Each option carries benefits and trade-offs. Most people would choose option (a), but option (b) is especially useful for effective threat analysis of cloud and embedded system models in which data may live on shared data channels or temporal storage nodes.

If an element is self-contained and has no connection to external entities, the element describes a secure, but probably pretty useless, piece of functionality within the system (hopefully, this is not your only unit within the system!). For an entity to have value, it should at least provide data or create a transformative function. Most entities also communicate with external units in some fashion. Within a system model, use the data flow symbols to describe where and how interactions are made among entities. The data flow is actually a set of symbols that represent the multiple ways system components can interact.

Figure 1-6 shows a basic line, which indicates a connection between two elements in the system. It does not, and cannot, convey any additional information, making this symbol an excellent choice when that information is not available to you at the time of the modeling exercise.

Figure 1-6. A line symbol for basic undirected data flow

Figure 1-7 shows a basic line with an arrow on one end, which is used to represent a unidirectional flow of information or action.

Figure 1-7. An arrow symbol for basic directed data flow

In Figure 1-8, the lefthand side of the image shows a basic line with arrows at both ends that represents a bidirectional communication flow. The righthand side of the image shows an alternate symbol for bidirectional communication flow. Either is acceptable, although the version on the right is more traditional and easier to recognize in a busy diagram (at the risk of making the diagram too busy as a result).

Figure 1-8. Two-headed arrows for bidirectional data flow

Figures 1-6, 1-7, and 1-8 are standard shapes in data flow diagram construction.
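The three flow symbols can be represented as a small graph, which also makes it easy to spot an element that has no connections at all. The sketch below is a minimal illustration; the `Direction`, `Flow`, and `unconnected` names and the example elements are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    UNDIRECTED = "line"          # Figure 1-6: a connection, nothing more
    ONE_WAY = "arrow"            # Figure 1-7: source to target
    TWO_WAY = "double arrow"     # Figure 1-8: both directions


@dataclass(frozen=True)
class Flow:
    label: str
    source: str
    target: str
    direction: Direction = Direction.ONE_WAY


def unconnected(elements: set[str], flows: list[Flow]) -> set[str]:
    """Return elements that are not an endpoint of any flow."""
    connected = {f.source for f in flows} | {f.target for f in flows}
    return elements - connected


elements = {"browser", "web server", "database", "report job"}
flows = [
    Flow("F1", "browser", "web server", Direction.TWO_WAY),
    Flow("F2", "web server", "database"),
]
print(unconnected(elements, flows))  # {'report job'}
```

An unconnected element usually means a flow is missing from the model rather than that the unit is truly isolated.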

Note

Keep in mind that we are presenting conventions, not rules. These shapes and what they represent or how they are used in a diagram come from collective practice, not an official standard document.9 In our practice of threat modeling, we sometimes find it useful to extend the conventional shapes and metadata to better suit our requirements. You will see some of these extensions in this chapter and throughout the book. But once you are familiar with the objectives and expected outcomes of the activity, you should feel comfortable making modifications as you see fit. Customization can make the activity, the experience, and the information gained through this activity valuable to you and the team members involved.

Figure 1-9 shows a nonstandard extension shape (see prior note) that we propose above and beyond the normal set of DFD shapes. This shape is a single-headed arrow that indicates where the communication originated. We have circled it to highlight the mark. The mark is available in engineering stencils for transmission flows in the major graphics packages.

Figure 1-9. Optional initiator mark

Data flows should have a label for reference, and you should provide the following critical metadata:

Type or nature of communication channel

Is this a network-based communication flow or a local interprocess communication (IPC) connection?

Protocol(s) in use

For example, is the data transiting over HTTP or HTTPS? If it uses HTTPS, does it rely on client-side certificates to authenticate an endpoint, or mutual TLS? Is the data itself protected in some way independently of the channel (i.e., with encryption or signing)?

Data being communicated

What type of data is being sent over the channel? What is its sensitivity and/or classification?

Order of operations (if applicable or useful for your purposes)

If flows are limited in quantity within the model, or the interactions are not very complex, it may be possible to indicate the order of operations or flow order as part of the annotations on each data flow, rather than creating a separate sequence diagram.
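These data flow annotations, including an optional order of operations, can be sketched as a record plus two small helpers. Everything here is an assumption for illustration: the field names, the classification labels, and the `CLEARTEXT` set of protocols treated as offering no channel protection.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for illustration: protocols treated as offering no channel protection.
CLEARTEXT = {"HTTP", "FTP", "TELNET"}


@dataclass
class DataFlowAnnotation:
    """Hypothetical metadata record for one data flow."""
    label: str
    source: str
    target: str
    channel: str                  # "network" or "ipc"
    protocol: str                 # e.g. "HTTPS"
    data: str                     # what is being sent
    classification: str           # e.g. "public", "internal", "confidential"
    order: Optional[int] = None   # position in the order of operations, if known


def in_order(flows):
    """Return the flows that carry an order annotation, earliest first."""
    return sorted((f for f in flows if f.order is not None), key=lambda f: f.order)


def unprotected_sensitive(flows):
    """Return labels of network flows sending confidential data in cleartext."""
    return [f.label for f in flows
            if f.channel == "network"
            and f.classification == "confidential"
            and f.protocol.upper() in CLEARTEXT]


flows = [
    DataFlowAnnotation("F2", "web server", "search service", "network", "HTTP",
                       "customer record", "confidential", order=2),
    DataFlowAnnotation("F1", "browser", "web server", "network", "HTTPS",
                       "login form", "confidential", order=1),
    DataFlowAnnotation("F3", "web server", "log file", "ipc", "pipe",
                       "access log", "internal"),
]
print([f.label for f in in_order(flows)])   # ['F1', 'F2']
print(unprotected_sensitive(flows))         # ['F2']
```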

Note

Be careful expressing authentication or other security controls on the data flow itself. Endpoints (servers or clients) are responsible for, and/or “offer,” access controls independent of any potential data flows between them. Consider using the interface extended modeling element, described later in this section, as a “port” to simplify your drawing and facilitate a more effective analysis for threats.

Keep the following considerations in mind when using data flows in your models.

First, use arrows to indicate the direction of data flows in your diagram and in your analysis. If you have a line that starts at element A and goes to element B where it terminates in an arrow (as shown in Figure 1-7), it indicates the flow of meaningful communications goes from A to B. This is the exchange of data that is of value to the application, or to an attacker, but not necessarily individual packets and frames and acknowledgments (ACKs). Likewise, a line starting at B and ending in an arrow at A would mean communication flows from B to A.

Second, you can choose from two basic approaches to show bidirectional communication flows in your model: use a single line with an arrow at each end, as shown in Figure 1-8 (left), or use two lines, one for each direction, as shown in Figure 1-8 (right). The two-line method is more traditional, but the two approaches are functionally equivalent. An additional benefit of the two-line method is that each communication flow may have different properties, so your annotations may be cleaner in the model using two lines instead of one. You can choose to use either method, but be consistent throughout your model.

Lastly, the purpose of a data flow in a model is to describe the primary direction of travel of communications that is relevant for the purposes of analysis. If a communication path represents any standard protocol based on Transmission Control Protocol (TCP) or User Datagram Protocol (UDP), packets and frames pass back and forth along the channel from source to destination. But this level of detail is usually not important for threat identification.

Instead, it is important to describe that application-level data or control messages are being passed on the established channel; this is what the data flow is meant to convey. However, it is often important to understand for analysis which element initiates the communication flow. Figure 1-9 shows a mark you can use to indicate the initiator of the data flow.

The following scenario highlights the usefulness of this mark in understanding the model and analyzing the system.

Element A and element B are connected by a unidirectional data flow symbol, with data flowing from A to B, as shown in Figure 1-10.

Figure 1-10. Sample elements A and B

Element A is annotated as service A, while element B is the logger client. You might come to the conclusion that B, as the recipient of data, initiated the communication flow. Or, you may alternatively conclude that A initiated the data flow, basing your analysis on the label for each endpoint. In either case, you may be correct, because the model is ambiguous.

Now, what if the model contains the additional initiator mark, attached to the endpoint at element A? This clearly indicates that element A, not element B, initiates the communication flow and that it pushes data to B. This may happen in cases you are modeling; for example, if you were modeling a microservice pushing log information to a logger client. It is a common architectural pattern, shown in Figure 1-11.

Figure 1-11. Sample elements A and B, with initiator mark at element A

However, if the initiator mark is placed on B rather than A, you would reach a different conclusion on the potential threats with this model segment. This design would reflect an alternate pattern in which the logger client, being perhaps behind a firewall, needs to communicate outbound to the microservice instead of the other way around (see Figure 1-12).

Figure 1-12. Sample elements A and B, with initiator mark at element B
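The scenario above separates two facts that a plain arrow conflates: the direction in which data travels and the endpoint that opens the connection. A minimal sketch of that distinction follows; the `Flow` class and its `pattern` and `listener` methods are hypothetical names, not part of any DFD convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str      # where the data comes from
    target: str      # where the data goes
    initiator: str   # which endpoint opens the connection (the initiator mark)

    def __post_init__(self):
        if self.initiator not in (self.source, self.target):
            raise ValueError("initiator must be one of the flow's endpoints")

    def pattern(self) -> str:
        """'push' if the data source initiates, 'pull' if the recipient does."""
        return "push" if self.initiator == self.source else "pull"

    def listener(self) -> str:
        """The endpoint that must accept inbound connections."""
        return self.target if self.initiator == self.source else self.source


fig_1_11 = Flow("service A", "logger client", initiator="service A")
fig_1_12 = Flow("service A", "logger client", initiator="logger client")

print(fig_1_11.pattern(), "->", fig_1_11.listener(), "listens")
print(fig_1_12.pattern(), "->", fig_1_12.listener(), "listens")
```

In both cases data flows from service A to the logger client, but the endpoint exposing a listening interface, and therefore the endpoint whose access controls matter most, is different.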

The symbol shown in Figure 1-13 is traditionally used to delineate a trust boundary: any elements behind the line (the curvature of the line determines what is behind the line versus in front) trust one another. Basically, the dotted line identifies a boundary where all of the entities are trusted at the same level. For example, you might trust all processes that run behind your firewall or VPN. This does not mean flows are automatically unauthenticated; instead a trust boundary means that objects and entities operating within the boundary operate at the same trust level (e.g., Ring 0).

This symbol should be used when modeling a system in which you wish to assume symmetric trust among system components. In a system that has asymmetric component trust (that is, component A may trust component B, but component B doesn’t trust component A), the trust boundary mark would be inappropriate, and you should use an annotation on the data flow with information describing the trust relationship.

Figure 1-13. Trust boundary symbols for drawing data flow diagrams

The same symbol is also sometimes used, as shown in Figure 1-14, to indicate the security protection scheme on a particular data flow, such as marking the data flow as having confidentiality and integrity through the use of HTTPS. An alternative to this symbol and annotation, which could lead to a lot of clutter in models with a significant number of components and/or data flows, is to provide an annotation to the data flow itself.

The necessary metadata for a trust boundary, if used in the traditional sense (to denote a boundary beyond which all entities are of the same trust level), is a description of the symmetrical trust relationship of the entities. If this symbol is used to indicate a control on a channel or flow, the metadata should include the protocol(s) in use (e.g., HTTP or HTTPS, mutual TLS or not), the port number if not the default, and any additional security control information you wish to express.

Figure 1-14. Symbol for an annotated trust boundary
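For the traditional use of the trust boundary, one practical question is which data flows cross from one boundary into another. The sketch below assumes symmetric trust and that every element sits inside exactly one named boundary; the `crossing_flows` function and the boundary names are hypothetical.

```python
def crossing_flows(boundary_of: dict[str, str],
                   flows: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (source, target) flows whose endpoints sit in different boundaries."""
    return [(source, target) for source, target in flows
            if boundary_of[source] != boundary_of[target]]


# Each element is assigned to exactly one boundary (an assumption of this sketch).
boundary_of = {
    "browser": "internet",
    "web server": "datacenter",
    "database": "datacenter",
}
flows = [("browser", "web server"), ("web server", "database")]

print(crossing_flows(boundary_of, flows))  # [('browser', 'web server')]
```

Asymmetric trust relationships do not fit this representation and would need a per-flow annotation instead.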

An interface element, circled in Figure 1-15, is another nonstandard extension shape that indicates a defined connection point for an element or container. This is useful for showing ports or service endpoints exposed by the element. This is especially helpful when the specific use of the endpoint is undefined or indeterminate at design time, or in other words, when the clients of the endpoint are unknown ahead of time, which means drawing a particular data flow is difficult. While this may seem a trivial concern, an open listening endpoint on a service can be a major source of architectural risk, so having the ability to recognize this from the model is key.

Figure 1-15. Interface element symbol

Each interface should have a label and metadata describing its core characteristics:

  • If the interface represents a known port, indicate the port number.

  • Identify the communications channel or mechanism—e.g., PHY or Layer 1/Layer 2: Ethernet, VLAN, USB human interface device (HID), or a software-defined network—and whether the interface is exposed externally to the element.

  • Communication protocol(s) offered by the interface (e.g., protocols at Layer 3 and above, such as IP, TCP, or HTTP).

  • Access controls on incoming connections (or potentially outbound data flows), such as any type of authentication (passwords, or SSH keys, etc.) or if the interface is impacted by an external device such as a firewall.

Knowing this information can make analysis easier, in that all data flows connecting to the interface can inherit these characteristics. Your drawing will also be simpler and easier to understand as a result. If you don’t want to use this optional element, create a dummy entity and data flow to the open service endpoint, which can make the drawing appear more complex.
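The idea that flows inherit an interface's characteristics, and that an interface with no modeled clients is still worth noticing, can be sketched as follows. The `Interface` and `Flow` classes, their fields, and the two helper functions are hypothetical and only illustrate the approach.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interface:
    """Hypothetical metadata record for one interface element."""
    label: str
    port: Optional[int]
    channel: str                               # e.g. "Ethernet", "USB HID"
    protocols: list[str]
    access_controls: list[str] = field(default_factory=list)
    exposed_externally: bool = False


@dataclass
class Flow:
    label: str
    source: str
    interface: str    # label of the interface this flow connects to


def inherited(flow: Flow, interfaces: dict[str, Interface]) -> dict:
    """Characteristics a flow takes on from the interface it connects to."""
    iface = interfaces[flow.interface]
    return {"flow": flow.label, "port": iface.port,
            "protocols": iface.protocols,
            "access_controls": iface.access_controls}


def open_endpoints(interfaces: dict[str, Interface], flows: list[Flow]) -> list[str]:
    """Externally exposed interfaces that no modeled flow connects to."""
    used = {f.interface for f in flows}
    return sorted(label for label, iface in interfaces.items()
                  if iface.exposed_externally and label not in used)


interfaces = {
    "I1": Interface("I1", 443, "Ethernet", ["TCP", "HTTPS"], ["mutual TLS"], True),
    "I2": Interface("I2", 8080, "Ethernet", ["TCP", "HTTP"], [], True),
}
flows = [Flow("F1", "browser", "I1")]

print(inherited(flows[0], interfaces)["access_controls"])  # ['mutual TLS']
print(open_endpoints(interfaces, flows))                   # ['I2']
```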

Warning

The shape in Figure 1-16—the block—is not part of the accepted collection of DFD shapes. It is included here because Matt finds this useful and wanted to demonstrate that threat modeling need not be bound solely to the traditional stencil when there is an opportunity to add value and/or clarity to one’s models.

A block element, shown in Figure 1-16, represents an architectural element that selectively alters the data flow to which it is attached. Blocks can also modify a port at the connection of a data flow or process boundary. This shape highlights the presence of a host firewall, another physical device, or a logical mechanism that is part of the architecture and important for analysis. Block elements can also show optional or add-on equipment that may impact the system in a way outside the control of the project team, yet is not an external entity in the traditional sense.

Figure 1-16. Block symbols

Metadata that you should collect for blocks include the usual labels as well as the following:

The type of block

A physical device or logical unit, and whether the unit is an optional component for the system.

Behavior

What the block does and how it may modify the flow or access to the port or process. Use a sequence diagram or state machine to provide additional detail on the behavior modification supported by the unit the block represents.

Tip

When developing a model of your system, always be sure to decide whether you and the project team will use a particular symbol, and whether you decide to alter its meaning (effectively making your own house rules for threat modeling, which is perfectly acceptable!). Be consistent with the symbol’s use, which will result in the activity being effective and showing value.

Sequence Diagrams

While DFDs show interactions and interconnections among system components and how data moves between them, sequence diagrams show a time- or event-based sequence of actions. Sequence diagrams come from UML and are a specialization of the interaction diagram type in UML. Supplementing DFDs with sequence diagrams as part of modeling in preparation for threat analysis can be instrumental in providing necessary context about the way your system behaves and any temporal aspects required for proper analysis. For example, DFDs can show you that a client communicates with a server and passes some form of data to the server. A sequence diagram will show you the order of operations used in that communication flow. This may reveal important information, such as who initiates the communication and any steps in the process that may introduce security or privacy risk, such as a failure to implement a protocol correctly or some other weakness.

There has been some discussion in the security community as to whether the sequence diagram is actually more important for performing this activity than development of DFDs. This is because a properly constructed sequence diagram can provide significantly more useful data than DFDs. The sequence diagram not only shows what data is involved in a flow and which entities are involved, but also explains how the data flows through the system, and in what order. Flaws in business logic and protocol handling are therefore easier to find (and in some cases can only be found) with a sequence diagram.

Sequence diagrams also highlight critical design failures such as areas that lack exception handling, or failure points or other areas where security controls are not consistently applied. They can also expose controls that are suppressed or inadvertently defeated, or potential instances of race conditions (including the dreaded time-of-check to time-of-use, or TOCTOU, weakness) that you cannot identify by simply knowing that data flows without knowing the order in which it flows. Only time will tell if using sequence diagrams as an equal partner in threat modeling becomes popular.
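To see why ordering matters for a weakness such as TOCTOU, consider the same three events arranged in two different orders. The sketch below is a minimal illustration, not a detection tool: the `Event` tuple, the `toctou_windows` function, and the actors and resource named in the example are all hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    actor: str
    action: str     # "check", "modify", or "use"
    resource: str


def toctou_windows(events: list[Event]) -> list[tuple[int, int, int]]:
    """Return (check, modify, use) index triples where a different actor
    modifies a resource between one actor's check and that actor's next use."""
    findings = []
    for i, check in enumerate(events):
        if check.action != "check":
            continue
        for k in range(i + 1, len(events)):
            use = events[k]
            same_use = (use.action == "use" and use.actor == check.actor
                        and use.resource == check.resource)
            if not same_use:
                continue
            for j in range(i + 1, k):
                other = events[j]
                if (other.action == "modify" and other.actor != check.actor
                        and other.resource == check.resource):
                    findings.append((i, j, k))
            break  # only the first use after each check is considered
    return findings


racy = [
    Event("service", "check", "/tmp/report"),
    Event("attacker", "modify", "/tmp/report"),
    Event("service", "use", "/tmp/report"),
]
safe = [racy[1], racy[0], racy[2]]   # same events, modification happens first

print(toctou_windows(racy))  # [(0, 1, 2)]
print(toctou_windows(safe))  # []
```

A DFD would show the same entities and flows for both lists; only the ordering, which a sequence diagram captures, distinguishes them.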

The formal definition of a sequence diagram in UML includes a significant number of modeling elements, but for the purposes of creating a model suitable for threat analysis, you should be concerned only with the following subset.

Figure 1-17 shows a sample sequence diagram simulating a potential communication and call flow of a mythical system.

Figure 1-17. Sequence diagram shapes

The modeling elements shown in Figure 1-17 include the following:

Entities (objects A and B)

Within the scope of the system being considered, and their “lifeline” for connecting to interactions with other entities.

Actors (humans)

Not represented here, but they reside externally to the system components and interact with the various entities within the system.

Messages

Messages containing data (“Call A”, “Return B”) being passed from one entity to another. Messages may be synchronous or asynchronous between entities; synchronous messages (represented by solid arrowheads) block until the response is ready, while asynchronous messages (represented by open arrowheads, not shown) are nonblocking. Dashed lines ending in arrowheads represent return messages. Messages may also begin and end at the same entity without passing to another entity, which is represented by an arrow that circles back on the lifeline for the entity from which it initiated.

Conditional logic

This may be placed on message flows to provide constraints or preconditions, which help identify problems introduced by business logic flaws and their impact on data flows. This conditional logic (not shown in Figure 1-17) would have the form of [condition] and would be placed inline with the message label.

Time

In a sequence diagram, time flows from top to bottom: a message higher up in the diagram occurs sooner in time than the messages that follow.

Constructing a sequence diagram is fairly easy. The hard part is deciding how to draw one. We recommend that you find a good drawing tool that can handle straight lines (both solid and dashed), basic shapes, and arrows that can curve or bend. Microsoft Visio (or an alternative such as draw.io or Lucidchart) or a text-based UML modeling tool like PlantUML should do fine.

You will also need to decide what actions you plan to model as a sequence. Good choices include authentication or authorization flows, as these involve multiple entities (at least an actor and a process, or multiple processes) exchanging critical data in a predefined manner. You can successfully model interactions involving a data store or asynchronous processing module as well as any standard operating procedure involving multiple entities.

Once you have decided on the actions you want to model, identify the interaction and operation of elements within your system. Add each element to the diagram as a rectangle toward the top of the diagram (shown in Figure 1-17), and draw a long line straight downward from the lower center of the element’s rectangle. Finally, starting toward the top of the diagram (along the long vertical lines), use lines ending in arrows in one direction or another to show how the elements interact.

Continue to describe interactions, moving further down the model, until you reach the natural conclusion of interactions at the expected level of granularity. If you are using a physical whiteboard or similar medium to draw your model and take notes, you may need to continue your model across multiple boards, or take a picture of the incomplete model and erase it to continue going broader and deeper in your modeling. You would then need to stitch the pieces together later to form a complete model.
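If you prefer a text-based tool, the steps above can be sketched in a few lines of Python that emit PlantUML source. This is a minimal sketch, not a complete generator: the `Message` class and `to_plantuml` function are hypothetical names, the entities and labels mirror Figure 1-17, and we assume PlantUML's basic arrow notation (`->` for a solid call, `->>` for an open arrowhead, `-->` for a dashed return). The optional guard is written inline with the label in the [condition] form described earlier.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    receiver: str
    label: str
    kind: str = "sync"      # "sync", "async", or "return"
    condition: str = ""     # optional guard, rendered as [condition]


# Assumed mapping from message kind to PlantUML arrow notation.
ARROWS = {"sync": "->", "async": "->>", "return": "-->"}


def to_plantuml(participants, messages):
    """Render entities and messages as PlantUML text.

    Messages are emitted in list order, which becomes top-to-bottom
    (earliest-to-latest) order in the rendered diagram.
    """
    lines = ["@startuml"]
    lines += [f"participant {name}" for name in participants]
    for m in messages:
        guard = f"[{m.condition}] " if m.condition else ""
        lines.append(
            f"{m.sender} {ARROWS[m.kind]} {m.receiver}: {guard}{m.label}"
        )
    lines.append("@enduml")
    return "\n".join(lines)


diagram = to_plantuml(
    ["A", "B"],
    [
        Message("A", "B", "Call A"),
        Message("B", "A", "Return B", kind="return"),
    ],
)
print(diagram)
```

Keeping the messages in a plain list makes it easy to reorder, insert, or annotate steps as the team corrects the model, and the generated text can live in version control next to the code it describes.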

Process Flow Diagrams

Traditionally used in process design and chemical engineering, process flow diagrams (PFDs) show the sequence and directionality of flow of operations through a system. PFDs are similar to sequence diagrams, but are generally at a higher level, showing the activity chain of events in the system rather than the flow of specific messages and component state transitions.

We mention process flows here for completeness, but the use of PFDs in threat modeling is not common. The ThreatModeler tool uses PFDs as its primary model type, however, so some may find it of value.

PFDs may be complementary in nature to sequence diagrams. You can sometimes describe the activity chain from a PFD with a sequence diagram using labels that indicate which segments of message flow are bound to a specific activity or event. Figure 1-18 shows a PFD for the events of a simple web application.

Figure 1-18. Sample process flow diagram

Figure 1-19 shows the same PFD redrawn as a sequence diagram with activity frames added.

Figure 1-19. Sequence diagram as PFD

Attack Trees

Attack trees have been used in the field of computer science for more than 20 years. They are useful for understanding how a system is vulnerable by modeling how an attacker may influence a system. Attack trees are the primary model type in threat analysis when using an attacker-centric approach.

This type of model starts at the root node that represents the goal or desired outcome. Remember, in this model type the result is a negative outcome for the system owners, but a positive outcome for the attackers! The intermediate and leaf nodes represent possible ways of achieving the goal of the parent node. Each node is labeled with an action to be taken, and should include information such as the following:

  • The difficulty in performing the action to accomplish the parent node’s goal

  • The cost involved to do so

  • Any special knowledge or conditions required to allow the attacker to succeed

  • Any other relevant information to determine overall capability for success or failure

Figure 1-20 shows a generic attack tree with a goal and two actions and two subactions an attacker uses to reach the goal.

Figure 1-20. A generic attack tree diagram

Attack trees can be valuable for threat analysis and for understanding the actual level and degree of risk to a system from attackers. To be well constructed and to provide a correct analysis of impact, they need a couple of things:

  • Complete knowledge of how something can be compromised—favoring completeness and “what is possible” over “what is practical”

  • An understanding of the motivations, skills, and resources available to different types and groups of attackers

You can construct an attack tree relatively easily, using the following steps:

  1. Identify a target or goal for an attack.

  2. Identify actions to be taken to achieve the target or goal.

  3. Rinse and repeat.

Identify a target or goal for an attack

For this example, let’s say that an attacker wants to establish a persistent presence on a system via remote code execution (RCE) on an embedded device. Figure 1-21 shows what this might look like in an evolving attack tree.

Figure 1-21. Sample attack tree, step 1: identify high-level target or goal

Identify actions to be taken to achieve the target or goal

How do you get to RCE on this system? One way is to find an exploitable stack buffer overflow and use it to deliver an executable payload. Or you could find a heap overflow and use it in a similar fashion. At this point, you might be thinking, “But wait, we don’t know anything about the system to know if this is even feasible!” And you are right.

When performing this exercise in real life, you want to be realistic and make sure you identify only targets and actions that make sense to the system under evaluation. So for this example, let’s assume that this embedded device is running code written in C. Let’s also make the assumption that this device is running an embedded Linux-like operating system—either a real-time operating system (RTOS) or some other resource-constrained Linux variant.

So what might be another action needed to gain RCE capability? Does the system allow a remote shell? If we assume this device has flash memory and/or bootable media of some kind, and can accept over-the-air updates (OTAs), we can add file manipulation and OTA firmware spoofing or modification as actions to achieve RCE as well. Any possible actions you can identify should be added as elements to the attack tree, as shown in Figure 1-22.

Figure 1-22. Sample attack tree, step 2: identify actions required to achieve the goal

Rinse and repeat

Here is where it really gets interesting! Try to think of ways to achieve the next order of outcomes. Don’t worry about feasibility or likelihood; analysis and decisions made from such analysis will happen later. Think outside the box. Remember, you’re putting on your hacker hat, so think like they would. No matter how crazy your ideas are, someone might try something similar. At this stage, an exhaustive list of possibilities is better than a partial list of feasibilities.

Your tree is done when no additional substeps are needed to complete an action. Don’t worry if your tree looks lopsided; not all actions need the same level of complexity to achieve results. Also don’t worry if you have dangling nodes—it may not be easy to identify all possible scenarios for an attacker to achieve a goal (it’s good to think of as many scenarios as you can, but you might not be able to identify all of them). Figure 1-23 shows an evolved (and possibly complete) attack tree indicating the methods by which an attacker may reach their goal.

Figure 1-23. Sample attack tree, step 3 and beyond: identify subactions to achieve subtargets
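The same tree can also be captured as a small data structure, which makes it easy to list every route an attacker might take. The following Python sketch is one possible representation, not a standard format: `AttackNode` and `attack_paths` are hypothetical names, the first-level actions come from the example above, and the single subaction under the OTA node is an invented placeholder because the contents of Figure 1-23 are not reproduced here. Attributes that have not been assessed yet are left as "?".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttackNode:
    """One node in an attack tree; "?" marks attributes not yet assessed."""
    action: str
    difficulty: str = "?"
    cost: str = "?"
    prerequisites: str = ""   # special knowledge or conditions required
    children: List["AttackNode"] = field(default_factory=list)

    def add(self, child: "AttackNode") -> "AttackNode":
        self.children.append(child)
        return child


def attack_paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of action labels."""
    path = prefix + (node.action,)
    if not node.children:
        yield path
    for child in node.children:
        yield from attack_paths(child, path)


# Step 1: the goal. Step 2: actions named in the text.
root = AttackNode("Persistent presence via RCE on embedded device")
root.add(AttackNode("Exploit stack buffer overflow",
                    prerequisites="code written in C"))
root.add(AttackNode("Exploit heap overflow",
                    prerequisites="code written in C"))
root.add(AttackNode("Use remote shell"))
root.add(AttackNode("Manipulate files on flash or bootable media"))
ota = root.add(AttackNode("Spoof or modify OTA firmware"))

# Step 3: a hypothetical subaction, added only to show a deeper branch.
ota.add(AttackNode("Tamper with update in transit", difficulty="high"))

paths = list(attack_paths(root))
for path in paths:
    print(" -> ".join(path))
```

A lopsided tree is handled naturally here: leaves directly under the root produce two-step paths, while the OTA branch produces a three-step path.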

Learning how to break something or accomplish prerequisite goals is easier as a group brainstorming exercise. This lets individuals with technical and security knowledge add their expertise to the group so that you can identify all the attack tree’s possible nodes and leaves. Understanding your organization’s risk appetite, or the amount of risk your organization is willing to accept, will clarify how much time you should spend on the exercise and if the organization is willing to take the necessary actions to address any concerns identified.

Knowing how attackers behave is a significant challenge for most businesses and security practitioners, but community resources such as the MITRE ATT&CK framework make identification and characterization of threat actors’ techniques, skills, and motivations much easier. It is certainly not a panacea, as it is only as good as the community that supports it, but if you are unfamiliar with how attacker groups behave in the real world, this blog entry by Adam Shostack, summarizing a talk by Jonathan Marcil, is an excellent resource for you to consider.

Fishbone Diagrams

Fishbone diagrams, also known as cause-and-effect, or Ishikawa, diagrams, are used primarily for root cause analysis of a problem statement. Figure 1-24 shows an example of a fishbone diagram.

Similar to attack trees, the fishbone diagrams can help you identify weaknesses in the system for any given area. These diagrams are also useful for identifying pitfalls or weaknesses in processes such as those found in the supply chain for a system where you may need to analyze component delivery or manufacturing, configuration management, or protection of critical assets. This modeling process can also help you understand the chain of events that lead to exploitation of a weakness. Knowing this information allows you to construct better DFDs (by knowing what questions to ask or what data you seek), and identify new types of threats as well as security test cases.

Constructing a fishbone diagram is similar to creating attack trees, except instead of identifying a target goal and the actions to achieve the goal, you identify the effect you want to model. This example models the causes of data exposure.

First, define the effect you want to model; Figure 1-24 demonstrates the technique with data exposure as the effect to model.

Figure 1-24. Sample fishbone diagram, step 1: main effect

Then you want to identify a set of primary causes that lead to the effect. We’ve identified three: overly verbose logs, covert channels, and user error, as shown in Figure 1-25.

Figure 1-25. Sample fishbone diagram, step 2: primary causes

Finally, you identify the set of causes that drive the primary causes (and so on). We have identified that a primary cause for user error is a confusing UI. This example recognizes only three threats, but you will want to create larger and more expansive models, depending on how much time and effort you wish to expend versus the granularity of your results. Figure 1-26 shows the fishbone diagram in a complete state, with the expected effect, primary, and secondary causes.

Figure 1-26. Sample fishbone diagram, step 3: secondary causes
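A fishbone diagram also maps cleanly onto nested data. The sketch below, in Python, records the effect and causes from this example as nested dictionaries and lists each causal chain from the deepest cause to the effect. The `causal_chains` function and the dictionary layout are our own illustrative choices, not part of any fishbone notation.

```python
# Each key is a cause; its value holds the causes that drive it.
# An empty dictionary means no deeper cause has been identified yet.
causes_of_data_exposure = {
    "Overly verbose logs": {},
    "Covert channels": {},
    "User error": {
        "Confusing UI": {},
    },
}


def causal_chains(effect, causes):
    """Return chains ordered from the deepest cause to the effect."""
    chains = []
    for cause, subcauses in causes.items():
        if subcauses:
            # Recurse, treating this cause as the effect of its subcauses.
            for chain in causal_chains(cause, subcauses):
                chains.append(chain + [effect])
        else:
            chains.append([cause, effect])
    return chains


chains = causal_chains("Data exposure", causes_of_data_exposure)
for chain in chains:
    print(" -> ".join(chain))
```

Each printed chain reads as a candidate chain of events leading to the effect, which you can turn into questions for the design team or into security test cases.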

How to Build System Models

The basic process for creating system models starts by identifying the major building blocks in the system—these could be applications, servers, databases, or data stores. Then identify the connections to each major building block:

  • Does the application support an API or a user interface?

  • Does the server listen on any ports? If so, over what protocol?

  • What talks to the database? Does whatever communicates with it only read data, or does it write data too?

  • How does the database control access?

Keep following threads of conversation and iterate through every entity at this context layer in the model until you have completed all necessary connections, interfaces, protocols, and data streams.
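One way to keep track of what you have and have not learned at this context layer is to record elements and connections as data. The Python sketch below is a minimal illustration under our own assumptions: the `Element` and `Flow` classes, their fields, and the sample system are hypothetical, and `None` stands in for an answer you do not have yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    name: str
    kind: str   # e.g., "application", "server", "database", "data store"


@dataclass
class Flow:
    source: str
    target: str
    protocol: Optional[str] = None   # None means "not yet known"
    port: Optional[int] = None
    access: Optional[str] = None     # "read", "write", or "read/write"


def open_questions(flows):
    """List the connection details that still need an answer."""
    questions = []
    for f in flows:
        for attr in ("protocol", "port", "access"):
            if getattr(f, attr) is None:
                questions.append(f"{f.source} -> {f.target}: {attr}?")
    return questions


# Hypothetical system used only for illustration.
elements = [
    Element("Web app", "application"),
    Element("API server", "server"),
    Element("Database", "database"),
]
flows = [
    Flow("Web app", "API server", protocol="HTTPS", port=443,
         access="read/write"),
    Flow("API server", "Database", access="read"),
]

for question in open_questions(flows):
    print(question)
```

Leaving unknowns explicit, rather than guessing, gives you a ready-made list of questions to bring to the next conversation with the designers.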

Next, choose one of the entities—usually an application or server element—that may contain additional details you need to uncover in order to identify areas for concern, and break it down further. Focus on the entry and exit points to/from the application, and where these channels connect, when looking at the subparts that make up the application or server.

Also consider how the individual subparts may communicate with each other, including communication channels, protocols, and the type of data passed across the channels. You will want to add any relevant information based on the type of shape added to the model (later in the chapter you will learn about annotating the model with metadata).

When building models, you will need to leverage your judgment and knowledge of security principles and technology to gather information to enable a threat assessment. Ideally, you would perform this threat assessment immediately after your model is built.

Before you begin, decide which model types you may need and the symbol set for each model type you intend to use. For example, you may decide to use the DFD as your primary model type but use the default symbol set defined by whatever drawing package you are using. Or you may decide to also include sequence diagrams, which would be appropriate if your system uses nonstandard interactions between components where exploitable weaknesses can hide.

As the leader of a modeling exercise (which, for the purposes of this chapter, we assume is you—lucky you), you need to make sure you include the right stakeholders. Invite the lead architect, the other designers, and the development lead(s) to the modeling session. You should also consider inviting the quality assurance (QA) lead. Encourage all members of the project team to provide their input to the construction of the model, although as a practical matter, we recommend keeping the attendee list to a manageable set to maximize the time and attention of those who do attend.

If this is the first time you or your development team are creating a system model, start slowly. Explain the goals or expected outcomes of the exercise to the team. You should also indicate how long you expect the exercise to take, and the process that you will follow, as well as your role in the exercise and the role of each stakeholder. In the unlikely event that team members are not all familiar with each other, go around the room to make introductions before you begin the session.

You should also decide who is responsible for any drawing and note-taking required during the session. We recommend you do the drawing yourself because it puts you in the center of the conversation at all times and provides attendees an opportunity to focus on the task at hand.

A few points are worth mentioning as you explore the system:

Timing of the exercise is important

If you meet too early, the design will not be formed sufficiently, and a lot of churn will occur as designers with differing viewpoints challenge each other and take the discussion off on tangents. But if you meet too late, the design will be fixed, and any issues identified during threat analysis may not be resolved in a timely fashion, making your meeting a documentation exercise rather than an analysis for threats.

Different stakeholders will see things differently

We have found it common, especially as the attendee count increases, that stakeholders are not always on the same page when it comes to how the system was designed or implemented; you need to guide the conversation to identify the correct path for the design. You may also need to moderate the discussion to avoid rabbit holes and circling conversation threads, and be wary of sidebar conversations, as they provide an unnecessary and time-consuming distraction. A well-moderated conversation among stakeholders in the system-modeling process often leads to “eureka!” moments, as the discussion reveals that the expectation from the design and the reality of the implementation clash, and the team can identify the spots where constraints modified the initial design without control.

Loose ends are OK

As we mentioned previously, while you may strive for perfection, be comfortable with missing information. Just make sure to avoid or minimize knowingly incorrect information. It is better to have a data flow or element in the model that is filled with question marks than it is to have everything complete but some known inaccuracies. Garbage in, garbage out; in this case, the inaccuracies will result in poor analysis, which may mean multiple false findings, or worse, a lack of findings in a potentially critical region of the system.

We recommend that you present system modeling as a guided exercise. This is especially important if the project team is unfamiliar with the model construction process. It is often beneficial for someone external to the product development team to facilitate the modeling exercise because this avoids a conflict of interest with respect to the system design and its potential impact on delivery requirements.

This is not to say that someone facilitating the construction of a model should be totally impartial. The leader will be responsible for gathering the necessary participants and working with this team to define the system the team intends to build with sufficient detail to support later analysis. As such, the leader should be an enabler of outcomes, not a disinterested third party. They do need to be removed enough from the design (and assumptions or shortcuts made or risks ignored) to provide a critical look at the system and be able to tease out tidbits of information that will be useful for the threat analysis.

As a leader, it’s important that you have accurate and complete information, as much as possible, when analyzing your model; your analysis may lead to changes to the system design, and the more accurate the information you start with, the better analysis and recommendations you can make. Keep an eye on the detail and be willing and able to overturn “rocks” to find the right information at the right time. You should also be familiar with the technologies under consideration, the purpose of the system being designed, and the people involved in the exercise.

While you don’t need to be a security expert to build a good system model, model building is usually conducted as a prerequisite to the threat analysis phase. This usually happens in rapid succession, which suggests you should probably be the security leader for that part of the project as well. The reality is that, with modern development projects, you may not be an expert in everything involved with a system. You have to rely on your teammates to shore up your knowledge gaps, and act more as a facilitator to ensure that the team efficiently develops a representative and accurate model. Remember, you don’t need to be an auto mechanic to drive a car, but you do need to know how to drive your car and the rules of the road.

Tip

If you are the leader charged with delivering a system model for analysis, you should be OK with imperfection, especially when starting a new model of a system. You will have an opportunity to improve the model over successive iterations.

No matter how skilled you are at drawing models, or interrogating designers about the systems they present to you, it is highly likely that the information you need in its entirety will be missing or unavailable, at least initially. And that is fine. System models represent the system under consideration and do not need to be 100% accurate to be of value. You must know some basic facts about the system and each element in the system for you to be effective in your analysis, but do not try for perfection or you will be discouraged (unfortunately, we know this from experience).

You can improve your chances of success in leading this activity by keeping in mind a few simple things:

Establish a blame-free zone

Individuals with a strong attachment to a system being analyzed will have opinions and feelings. While you should expect professionalism from attendees, contention and heated arguments may create sour working relationships if you don’t avoid getting personal in a system-modeling session. Be prepared to moderate the discussion to prevent singling out individuals for mistakes, and redirect the conversation to recognizing the great learning opportunity you now have.

No surprises

Be up front about what you intend to accomplish, document your process, and give your development teams plenty of notice.

Training

Help your team help you by showing them what needs to be done and what information will be required of them so that they can be successful. Hands-on training is especially effective (e.g., “show one, do one”), but in this age of video logs (vlogs) and live-streaming, you may also consider recording a live modeling session Critical Role-style and making the video available for your development teams to review. This could be the best two to three hours of time spent in training.

Be prepared

Ask for information about the target system ahead of your system-modeling exercise, such as system requirements, functional specifications, or user stories. This will give you a sense of where the designers might go when considering a set of modules, and help you to frame questions that can help obtain the necessary level of information for a good model.

Motivate attendees with food and drink

Bring donuts or pizza (depending on the time of day) and coffee or other snacks. Food and drink go a long way toward building trust and getting attendees to discuss hard topics (like that big security hole that was introduced by accident!).

Gain buy-in from leadership

Attendees will feel more comfortable being present and sharing their thoughts and ideas (and uncovering skeletons in the closet, so to speak) if they know their management team is on-board with this activity.

Note

At the time of this writing, the COVID-19 pandemic is making us think creatively about how to meet safely and build virtual camaraderie with shipped (or locally sourced) snacks and group video calls. These are lessons you can apply to distributed team collaboration any time.

When you create a system model, regardless of the type, you may choose to draw it out on a whiteboard or in a virtual whiteboard application and translate it into your favorite drawing package. But you don’t have to always do it by hand. Know that online and offline utilities are available today10 that enable you to create models without manually drawing them first.

If you use any of these drawing packages, you should come up with your own method of adding metadata annotations for each element as described earlier. You can do this within the diagram itself as a textbox or callout, which might clutter the diagram. Some drawing applications perform automatic layout of objects and connections, which on complex diagrams can look like spaghetti. You can also create a separate document, in your favorite text editor, and provide the necessary metadata for each element shown in the diagram. The combination of diagram(s) and text documents becomes the “model” that allows a human to perform analysis that can identify threats and weaknesses.
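If you keep metadata in a separate text document, a small script can confirm that every element in the diagram has an entry. The following Python sketch assumes, purely for illustration, that the metadata is stored as JSON keyed by element name; the element names, the metadata fields, and the `missing_metadata` function are all hypothetical.

```python
import json

# Element names as they appear in the diagram (hypothetical example).
diagram_elements = ["Web app", "API server", "Database"]

# Contents of the separate metadata document; "?" marks open questions.
metadata_document = """
{
  "Web app": {"interface": "user interface", "language": "?"},
  "API server": {"listens_on": "?", "protocol": "?"}
}
"""


def missing_metadata(elements, document_text):
    """Return diagram elements that have no entry in the metadata document."""
    metadata = json.loads(document_text)
    return [name for name in elements if name not in metadata]


print(missing_metadata(diagram_elements, metadata_document))
```

A check like this keeps the diagram and its companion document from drifting apart as the model is revised.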

What Does a Good System Model Look Like?

Despite your best efforts, complexity may occur because you have too much information, or worse, incorrect information. Sometimes, the potential level of detail in the model itself and subsequent amount of effort you need to perform analysis on the model is a welcome diversion from all the fires you’re fighting. Alternatively, an extreme level of detail might be a requirement of your environment or market segment. For example, some industries, such as transportation or medical devices, require a higher degree of analysis to address a higher degree of assurance. For most of us, however, threat modeling is often seen as unfamiliar, unnerving, or an unwelcome “distraction” from other seemingly more critical tasks. But by now you already know: a good threat model will pay for itself.

But what makes a good model? It depends on various factors, including the methodology you use, your goals, and how much time and energy you can devote to building out the model. While a good model is difficult to describe, we can highlight key points that form a good system model. Good models at a minimum have the following properties:

Accurate

Keep your models free of inaccurate or misleading information that will result in an imperfect threat analysis. This is hard to do alone, so it is critical to have support from the system designers, developers, and others on the project. If the project team wonders aloud “What is that?” when everything is said and done, something bad happened during the system model’s construction and should be revisited.

Meaningful

Models should contain information, not just data. Remember that you are trying to capture information that points to “conditions for potential compromise” within your system. Identifying those conditions depends on the threat modeling methodology that you ultimately select. The methodology you use identifies whether you are looking for only exploitable weaknesses (aka vulnerabilities) or want to identify different parts of the system that have the potential to contain weaknesses, exploitable or not (because a weakness that does not look exploitable on paper may well become exploitable in practice).

Sometimes people want to capture as much metadata about the system as possible. But the point of modeling is to create a representation of the system without re-creating it, providing sufficient data to make inferences and direct judgments on the characteristics of the system.

Representative

The model should attempt to be representative of either the design intentions of the architect or the realized implementation by the development teams. The model can tell us what to expect from the system’s security posture as designed or as implemented, but usually not both. Either way, the conversation around the conference room table will be the corporate equivalent of “he said, she said.” The team should clearly recognize their system in the model created.

Living

Your system isn’t static. Your development team is always making changes, upgrades, and fixes. Because your systems are always changing, your models need to be living documents. Revisit the model on a regular basis to ensure it remains accurate. It should reflect the currently expected system design or the current system implementation. Model “what is” instead of “what should be.”

Deciding when your model is “good” is not easy. To determine the quality and “goodness” of a system model, you should develop guidelines and make them available to all participants. These guidelines should spell out which modeling constructs (i.e., shapes, methods) to use and for what purposes. They should also identify the granularity level to strive for and how much information is too much. Include stylistic guidance such as how to record annotations or use color in the model diagram.

The guidelines are not rules, per se. They are there to provide consistency in the modeling exercise. However, if team members deviate from the guidelines but are effective in developing a quality model, take them all out for a drink. Declare success for the first model created by a team when the participants—the designers and other stakeholders of the system, and yourself—agree that the model is a good representation of what you want to build. Challenges may remain, and the stakeholders may have reservations about their creation (the system, not the model), but the team has cleared the first hurdle and should be congratulated.

Summary

In this chapter, you learned a brief history of creating models of complex systems and the types of models commonly used in threat modeling. We also highlighted techniques that will help you and your team get the right amount of information into your models. This will help you find the needles (data) in the haystack of information while also avoiding analysis paralysis.

Up next, in Chapter 2, we present a generalized approach to threat modeling. In Chapter 3, we’ll cover a collection of industry-accepted methodologies for identifying and prioritizing threats.

1 “The Plan of St. Gall,” Carolingian Culture at Reichenau and St. Gall, https://oreil.ly/-NoHD.

2 A. E. Dien, Six Dynasties Civilization (New Haven: Yale University Press, 2007), 214.

3 A. Smith, Architectural Model as Machine (Burlington, MA: Architectural Press, 2004).

4 There are other methods of producing graphical models suitable for analysis, such as using other UML model types, or the System Modeling Language (SysML), and other model types that may be useful for performing an effective analysis, such as control flow graphs and state machines. But those methodologies are beyond the scope of this book.

5 “Data Flow Diagrams (DFDs): An Agile Introduction,” Agile Modeling Site, https://oreil.ly/h7Uls.

6 For an extensive discussion of the subject, see Brook S.E. Schoenfield, Securing Systems: Applied Security Architecture and Threat Models (Boca Raton, FL: CRC Press, 2015).

7 Common flags include those for ASLR or DEP support or stack canaries.

8 As in Apache Tomcat’s use of this mechanism.

9 Adam Shostack, “DFD3,” GitHub, https://oreil.ly/OMVKu.

10 draw.io, Lucidchart, Microsoft Visio, OWASP Threat Dragon, and Dia, to name a few.
