Chapter 4. Securing Traffic

A comprehensive API traffic management system includes robust security features. This means a reliable authentication system as well as a scalable authorization strategy. Each aspect of security (authentication and authorization) is essential for a healthy API ecosystem. In this chapter, we cover API security basics such as API keys, authentication, authorization, and encryption.

Access control (or authorization) is a particularly important security element in API systems that rely on microservices. As your service collection grows and becomes more adaptable at runtime, it becomes increasingly difficult to know ahead of time just which services your request is likely to encounter. We devote some additional time in this chapter to designing and implementing a scalable and reliable authorization system based on access tokens.

Security Basics

The basics of API security (see Figure 4-1) center on authentication (the requesting identity) and authorization (the identity’s access controls for this request). API keys are another important element of API security because they help identify API usage independent of the requesting identity. There is also the matter of data encryption for messages in transit.

Also, a robust security implementation is able to deal with identity and access control between separate systems. For example, when APIs from your own system need to access services from an external API ecosystem such as Salesforce, SAP, and other so-called third-party services, your API traffic management needs to be able to negotiate identity and support access controls between your own API ecosystem and that of other, external systems.

Figure 4-1. API traffic security basics

API Keys

API keys are a simple, low-level way to track and control how an API is used. All good API traffic management systems have a way to generate API keys and track their usage. The actual format of the API key is not very important; often it is just a universally unique identifier (UUID).

Warning

It is important not to use API keys as authentication or authorization keys. API keys are just a way to control access to the API, not proof of identity or access rights. To manage identity and access, you need other elements of your API security platform.

No matter how they are created, API keys need to be passed as part of any API request. That allows API proxies and gateways to track the appearance of these keys, validate the key against a list of authorized keys, and log them for monitoring and analysis. Typically, if you don’t have a valid API key, your request is rejected by the API traffic management platform. Good API platforms allow you to cancel or revoke an API key if you discover any sign of abuse, too.
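The gateway-side handling described here (generate, validate, log, revoke) can be sketched in a few lines. This is a minimal illustration only, assuming an in-memory store; the ApiKeyRegistry class and its method names are hypothetical and do not come from any particular API traffic platform.

```python
import uuid
from collections import Counter


class ApiKeyRegistry:
    """In-memory stand-in for a gateway's API key store (hypothetical)."""

    def __init__(self):
        self._active = set()
        self.usage = Counter()  # attempts seen per key, for monitoring

    def issue(self) -> str:
        """Generate a new key; here it is simply a random UUID string."""
        key = str(uuid.uuid4())
        self._active.add(key)
        return key

    def revoke(self, key: str) -> None:
        """Cancel a key, for example after signs of abuse."""
        self._active.discard(key)

    def check(self, key: str) -> bool:
        """Log the attempt and report whether the request may proceed."""
        self.usage[key] += 1
        return key in self._active


registry = ApiKeyRegistry()
key = registry.issue()
print(registry.check(key))      # True: key is on the authorized list
registry.revoke(key)
print(registry.check(key))      # False: revoked keys are rejected
print(registry.check("bogus"))  # False: unknown keys are rejected
```

Note that the check says nothing about who is calling or what they may do; it only decides whether the request is allowed through to the next step.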

It is also important to keep in mind that API keys are not authentication or access control tokens. Because API keys are usually static strings that contain no identifying information themselves, they are not the same as authentication or authorization tokens. We cover authentication and authorization next.

Identity/Authentication

After your system validates a request using an API key, the next step is to confirm the requester’s identity. That might be a human operating an application that calls the API, or it might be another internal service calling the API on its own behalf or as part of a more involved set of calls to solve a particular problem.

End-user API calls are often authenticated using some form of mutual authentication such as a security certificate or a three-legged authentication protocol (more on this in a moment) such as OpenID and OAuth. In all cases, both the client and the server need to share some identifying information (usernames/passwords, certificates, etc.). This means that there is some setup needed ahead of time—before the first API request—to make sure both parties trust each other at runtime.

The OAuth protocol is a common API authentication protocol because it offers quite a number of authentication flows (called Grant Types) to fit various levels of security and interaction needs. An important feature of OAuth is that it supports what is called three-legged authentication. This means that the actual authentication event happens at a provider service, which provides an authentication token that is then used by the client (application or API) to present to the actual service. Although this is a bit more complicated than simple username/password or certificate authentication, three-legged models make it possible for people to build API consumer applications that do not ever see the requester’s authentication credentials.

Your API traffic platform should make it possible to manage and support multiple authentication schemes without any change to the related API or the services behind that API. It should also make it easy to collect logs and related information at the authentication level given that this is your first line of defense when it comes to recognizing intrusions.

Identifying the API requester is just part of the job of completing a secure API transaction. It is also important to understand the access control limits each request has for any services it attempts to contact.

Access Control/Authorization

Knowing the identity of the entity initiating the request is just the start of the process when engaging in secure API transactions. The next step is establishing the access control rights that identity has for the life of the API request. This is usually called authorization.

The act of authorizing a request is, essentially, associating access rights with the request. Access control can be applied and validated in a couple of different ways. For example, you can associate identities with roles (e.g., admin, manager, user, guest, etc.). You can apply access control directly to identities, too (e.g., margaret_hamilton, frederick_jones, etc.).
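As a rough sketch of the two options just described, the following combines role-based grants with grants applied directly to an identity. The tables, the action names (read, write, delete), and the function names are all assumptions made for illustration.

```python
# Hypothetical role and grant tables; a real platform would load these
# from its identity and policy stores.
ROLE_ACTIONS = {
    "admin": {"read", "write", "delete"},
    "manager": {"read", "write"},
    "user": {"read"},
    "guest": set(),
}
IDENTITY_ROLES = {
    "margaret_hamilton": {"admin"},
    "frederick_jones": {"user"},
}
# Access control applied directly to an identity, independent of roles.
IDENTITY_ACTIONS = {
    "frederick_jones": {"write"},
}


def allowed_actions(identity: str) -> set:
    """Union of direct grants and the grants implied by each role."""
    actions = set(IDENTITY_ACTIONS.get(identity, set()))
    for role in IDENTITY_ROLES.get(identity, set()):
        actions |= ROLE_ACTIONS.get(role, set())
    return actions


def is_allowed(identity: str, action: str) -> bool:
    return action in allowed_actions(identity)


print(is_allowed("margaret_hamilton", "delete"))  # True, via the admin role
print(is_allowed("frederick_jones", "write"))     # True, via a direct grant
print(is_allowed("frederick_jones", "delete"))    # False
print(is_allowed("unknown_person", "read"))       # False
```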

A reliable API traffic platform is able to quickly and easily associate validated identities with the proper roles for existing services. This work is usually not a problem for ecosystems that have a limited number of services and in which all the users are managed within the ecosystem (e.g., local user accounts). However, as the number of services and/or the number of roles within an ecosystem increases, it becomes difficult to scale access control. In the next section (“Managing Access with Tokens”), we look deeper into how to deal with this scaling challenge to your API ecosystem.

Encryption Considerations

Encrypting traffic offers a level of security for messages as they pass from ecosystem to ecosystem and between components within the same ecosystem. A good API traffic program includes the ability to employ message-level encryption and, if needed, field-level encryption.

The most common message-level encryption implementation is to use Transport Layer Security (TLS). The goal of TLS is to provide what is called a “secure channel” between two parties. TLS requires a bit of setup or handshaking to establish the identities of each party and a mutual encryption scheme. When that is done, the two parties use the same encryption details for each message they pass back and forth.
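On the client side, most of this setup is handled by the TLS library. As a small sketch using Python's standard ssl module, the following prepares a client context that verifies the server's certificate and hostname during the handshake. The minimum-version setting is an illustrative policy choice, not a requirement stated in this chapter.

```python
import ssl

# Client-side context: verifies the server certificate and hostname by default.
context = ssl.create_default_context()

# Refuse protocol versions older than TLS 1.2 (an illustrative policy choice).
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

A context like this can then be handed to an HTTPS client, for example through the context argument of http.client.HTTPSConnection.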

It is also possible to send messages “in the clear” and include field-level encryption. In this case, the data in sensitive fields such as personally identifying information (PII) is encrypted using a shared key (one that both the sender and the receiver already know ahead of time). Field-level encryption requires additional setup between parties and is challenging to scale. Often field-level encryption is handled by data storage systems (databases, filesystems, etc.) and is not something API traffic platforms need to deal with directly.

Managing Authentication Risk

Another important aspect of authentication is quantifying the risk associated with a particular authenticated identity. To do that, you need to know not just the identity of the requesting party (“Hi, my name is Mike!”) but also several other things:

  • How the identity was authenticated (username/password, Active Directory, Facebook, etc.)

  • The device used to log in (mobile app, desktop, VM in the head end, etc.)

  • The location of the authentication point (e.g., geo-code information)

You can use these factors (method, device, location, along with actual identity) to create a risk score associated with the authentication. For example, “Mike logged in using a valid certificate from a building on the company campus using his company laptop.” This probably merits a relatively low risk score. However, if the scenario were “Mike logged in using a social media account from Singapore using a mobile device we’ve never seen before,” this deserves a high risk score and might mean requiring two-factor authentication (2FA) to continue or might simply mean denying the login, alerting the security team, and locking the account to prevent further attempts to log in.
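One way to picture this is as a simple additive score over the three factors. The sketch below is only an illustration: the weights, the thresholds, and names such as AuthEvent, risk_score, and decide are invented for this example, and a real platform would tune or replace all of them.

```python
from dataclasses import dataclass

# Hypothetical weights per authentication method; higher means riskier.
METHOD_RISK = {"certificate": 0, "password": 2, "social": 4}


@dataclass
class AuthEvent:
    identity: str
    method: str         # how the identity was authenticated
    known_device: bool  # has this device been seen before?
    on_campus: bool     # did the login come from a trusted location?


def risk_score(event: AuthEvent) -> int:
    """Add up the risk contributed by method, device, and location."""
    score = METHOD_RISK.get(event.method, 5)  # unknown methods score highest
    score += 0 if event.known_device else 3
    score += 0 if event.on_campus else 2
    return score


def decide(score: int) -> str:
    """Map a score to one of the mitigations mentioned in the text."""
    if score <= 2:
        return "allow"
    if score <= 6:
        return "require_2fa"
    return "deny_and_alert"


low = AuthEvent("mike", "certificate", known_device=True, on_campus=True)
high = AuthEvent("mike", "social", known_device=False, on_campus=False)
print(risk_score(low), decide(risk_score(low)))    # 0 allow
print(risk_score(high), decide(risk_score(high)))  # 9 deny_and_alert
```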

Your API traffic management program should support some form of risk scoring and mitigation to protect your system, your data, and your users.

Mediating External Security Systems

One more topic worth discussing when reviewing API security basics is the work of mediating security details between separate systems. As APIs become more ubiquitous in enterprises, it is increasingly likely that your API platform depends on the services of other, third-party APIs over which you have no control. And many of these external third-party APIs have their own security details including authentication and authorization requirements.

When an API client is making a call to another service that resides outside your ecosystem, it needs to supply the proper security credentials. In many cases, these credentials are not the same ones used within your own API ecosystem. Thus, there is a need for a mediating layer to handle the security “hand-off” between systems.

The simplest approach is to use a single, shared set of credentials when making calls to a third-party API. The advantage is that only the component that makes the third-party call needs to know that third party’s credentials. The downside is that this sharing of credentials results in a built-in privilege escalation. To avoid this, a good API traffic platform will provide a way to associate each API call with the appropriate level of security when reaching out to third-party APIs. This improves the overall security of your platform and provides more accurate usage and monitoring data when reporting on your API traffic both within and beyond your own ecosystem.
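To make the alternative to a single shared credential concrete, here is a minimal sketch of a mediating layer that picks a third-party credential based on the caller's internal role. The role names, the token strings, and the outbound_headers function are placeholders invented for this example; they do not reflect any real third-party API.

```python
# Hypothetical credential table: one third-party credential per internal role,
# instead of a single shared credential for every caller.
THIRD_PARTY_TOKENS = {
    "admin": "crm-token-read-write",
    "user": "crm-token-read-only",
}


def outbound_headers(internal_role: str) -> dict:
    """Pick the third-party credential that matches the caller's own rights."""
    try:
        token = THIRD_PARTY_TOKENS[internal_role]
    except KeyError:
        # No mapping means no third-party access, rather than a shared fallback.
        raise PermissionError(f"no third-party access for role {internal_role!r}")
    return {"Authorization": f"Bearer {token}"}


print(outbound_headers("user"))  # {'Authorization': 'Bearer crm-token-read-only'}
```

Because a caller with no mapping is refused outright, a low-privilege request cannot borrow a more powerful credential on its way out of your ecosystem.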

Managing Access with Tokens

The work of dealing with access control (or authorization) can be a challenge. This is especially true for organizations that 1) implement a microservice style behind the APIs, 2) have lots of external users (e.g., identities not managed within your own Lightweight Directory Access Protocol [LDAP] or Active Directory), or 3) depend on third-party/external APIs in order to complete requests.

In a monolithic environment in which all user identities are managed by the local user directory, where there is a small number of services, and where there are no external services to access, it can be sufficient to assign every identity a single set of roles (“guest,” “payroll,” “sales,” “sysadmin,” etc.) for the life of all requests for that identity. However, as the number of services increases, as you add external users (e.g., partners, end users), and as you add third-party API usage, assigning one fixed access control profile to each identity becomes difficult to manage and scale.

In this section, we talk about the importance of adopting an access token approach for authorization as well as the two ways to grant an identity access control: grants by value and grants by reference. A good API traffic management program supports both approaches, and with just a bit of planning and design of your actual tokens, you can easily switch between implementations at runtime.

The JWT Specification

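JWT, or JSON Web Token, is a specification designed to make it easy to transfer access control rights between parties. It is part of a collection of standards for representing security elements in the JSON format. Following is the full set of related specifications:

  • JSON Web Token (JWT), RFC 7519

  • JSON Web Signature (JWS), RFC 7515

  • JSON Web Encryption (JWE), RFC 7516

  • JSON Web Key (JWK), RFC 7517

  • JSON Web Algorithms (JWA), RFC 7518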

For now, we’ll focus only on JWT.

Warning

The JWT series and the JSON Object Signing and Encryption (JOSE) specifications are not the only way to handle access tokens. There are some similar specifications including Branca, PASETO, and Macaroons.

For our purposes here, I review the basics of token-based access control using JWT as the example. Your API traffic management platform should guide you through the details of implementing a secure, reliable, and scalable JSON-related token support process, and you should check out how your platform approaches token-based access control and how you can observe and manage token lifetimes.

The JWT

JWTs are compact ways to share data between parties. Each token has three distinct parts: header, body, and signature.

Header

The header holds preamble information: the information needed to understand the rest of the token. The JWT header shown here has two fields. The typ field identifies the token type, and the alg field identifies the algorithm used to sign the token (here HS256, which is HMAC with SHA-256):

{
    "typ": "JWT",
    "alg": "HS256"
}

Body

The body holds the claims information: the data that is to be passed between parties. There is a series of predefined properties for JWT claims outlined in the specification. In the example that follows, the three predefined properties, called registered claims, are iss (the issuer), iat (the time at which the token was issued), and sub (the token subject). Then there is a series of private claim names. These are not standard and are understood only by the two parties sharing the message:

{
  "iss" : "bigco",
  "iat" : 1516239022 ,
  "sub" : "authorization",
  "http://users.bigco.org/" : "admin",
  "http://accounts.bigco.org/" : "user",
  "http://products.bigco.org/" : "guest",
  "http://identity.bigco.org" : "q1w2e3r4t5y6"
}

Signature

The signature holds the check-value: the information that you can use to verify that the header and body you received are actually the ones that were originally sent. This signature string becomes part of the token sent between parties in the form of header.body.signature and is computed using the algorithm identified in the header. Following is an example using the string "big-co-is-awesome" as the secret shared between both the sender and the receiver:

HMACSHA256(
  base64UrlEncode(header) + "." +
  base64UrlEncode(payload),
  "big-co-is-awesome"
)

The resulting token, with the base64url-encoded header, body, and signature joined by periods, would look something like this:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJqdWxlcyIsImlhdC
I6MTUxNjIzOTAyMiwic3ViamVjdCI6IkJpZ0NvIiwiaHR0cDovL3VzZXJzLmJpZ
2NvLm9yZy8iOiJhZG1pbiIsImh0dH6Ly9hY2NvdW50cy5iaWdjby5vcmcvIjoid
XNlciIsImh0dHA6Ly9wcm9kdWN0cy5iaWdjby5vcmcvIjoiZ3Vlc3QifQ.RnMdw
Bv3t4_tK5ZZy0nDQWQ7hKFtkGn5rmchZSDzuN8
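The HMACSHA256 pseudocode above maps fairly directly onto Python's standard library. The following is a minimal sketch of signing and verifying an HS256 token, written for illustration rather than production use; the helper names (b64url, sign, verify) are my own. A real verifier would typically also check the alg header and any time-based claims, which this sketch leaves out.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without the trailing '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(header: dict, body: dict, secret: bytes) -> str:
    """Build header.body.signature using HMAC with SHA-256."""
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, body)
    )
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return signing_input + "." + b64url(digest)


def verify(token: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    if token.count(".") != 2:
        return False
    signing_input, signature = token.rsplit(".", 1)
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(digest).encode("utf-8"), signature.encode("utf-8"))


secret = b"big-co-is-awesome"
token = sign({"typ": "JWT", "alg": "HS256"}, {"iss": "bigco", "sub": "authorization"}, secret)
print(verify(token, secret))           # True
print(verify(token, b"wrong-secret"))  # False
```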

It is important to keep in mind that the JWT specification is designed as a way to reliably encode messages that are shared between two parties to ensure data integrity and prevent tampering. This encoding, however, is not at the level of encryption. The contents are not encrypted in a way that secures the data from prying eyes (see “Encryption Considerations”).
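To see why signing is not encryption, note that anyone holding a token can read its header and body without knowing the shared secret. The short sketch below decodes a token without checking its signature at all; the peek function name is an assumption, and the body is built inline purely for demonstration.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Restore the '=' padding that JWTs strip, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek(token: str):
    """Read the header and body of a token without verifying the signature."""
    header_b64, body_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(body_b64))


# Build a demonstration token; the signature segment is never examined.
body_b64 = base64.urlsafe_b64encode(
    json.dumps({"iss": "bigco", "http://users.bigco.org/": "admin"}).encode("utf-8")
).rstrip(b"=").decode("ascii")
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9." + body_b64 + ".signature-not-checked"

header, claims = peek(token)
print(header)                             # {'alg': 'HS256', 'typ': 'JWT'}
print(claims["http://users.bigco.org/"])  # admin
```

This is why sensitive values should not be placed in a signed-only token unless the channel or the fields themselves are encrypted.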

Grants by Value

The most common way to ship authorization information via JWTs is to use the grants by value method, in which you load the JWT with all the grant information. The previous example (see “Body”) illustrates this approach. This approach works well for cases in which your API traffic platform knows ahead of time all of the services to which you’ll need access. For example, when you log in to a system, that platform is able to locate all of your access control grants and load them into a single JWT, which is then carried with the request as it runs through the system “talking” to services along the way. As the request reaches a service point (e.g., a function in the code), that endpoint can check the JWT for the appropriate grant property and act accordingly.

In monolith-style architectures, this usually works quite well. There is a fixed number of possible services (often just one big one) and a fixed number of possible access control grants for each user. When the system doing the authentication work (e.g., handling the user login) is the same system hosting all of the components, the access control data is usually easy to find and load for each user request.
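The endpoint-side check in a grants by value model can be very small, because everything needed is already in the token body. The sketch below assumes the token's signature has already been verified and uses the claims from the “Body” example; the require_role function and the idea of passing a set of acceptable roles are illustrative assumptions.

```python
# Claims as they would appear in an already-verified token body.
claims = {
    "iss": "bigco",
    "iat": 1516239022,
    "sub": "authorization",
    "http://users.bigco.org/": "admin",
    "http://accounts.bigco.org/": "user",
    "http://products.bigco.org/": "guest",
    "http://identity.bigco.org": "q1w2e3r4t5y6",
}


def require_role(claims: dict, service: str, allowed: set) -> str:
    """Look up the grant for this service in the token and enforce it."""
    role = claims.get(service)  # None when the token carries no grant
    if role not in allowed:
        raise PermissionError(f"{service}: role {role!r} is not permitted")
    return role


print(require_role(claims, "http://users.bigco.org/", {"admin"}))  # admin

try:
    require_role(claims, "http://products.bigco.org/", {"admin", "user"})
except PermissionError as err:
    print("denied:", err)
```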

Grants by Reference

In the grants by reference model, the JWT does not contain the actual access control information. Instead, it contains a single pointer to a claims store that contains the list of all possible grants for this authenticated identity. Following is an example of a JWT that relies on the grants by reference model:

{
  "iss" : "bigco",
  "iat" : 1516239022 ,
  "sub" : "authorization",
  "http://identity.bigco.org" : "q1w2e3r4t5y6",
  "http://claims.bigco.org" :
  "http://api.bigco.org/claims-store/"
}

In this example, you can see that instead of a series of rights claims, this token contains two private name claims. One represents the authenticated identity for this request (q1w2e3r4t5y6), and the other points to an endpoint for the claims store service (http://api.bigco.org/claims-store/). Now, when a request reaches a service endpoint (e.g., a microservice entry point or monolith function), that endpoint can use the supplied identity and some local information (e.g., the function name, action associated with the request, etc.) and send all that to the claims service identified by the http://claims.bigco.org property. The claims service can then evaluate the request and return a response indicating the actual access control rights this request should be granted.
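The exchange with the claims service might be sketched as follows. To keep the example self-contained, the claims store is an in-memory stand-in rather than a real network call, and the shape of the request and response (service name, action, an "allowed" flag) is entirely an assumption; the chapter does not define that interface.

```python
# Token body from the grants by reference example.
token_body = {
    "iss": "bigco",
    "iat": 1516239022,
    "sub": "authorization",
    "http://identity.bigco.org": "q1w2e3r4t5y6",
    "http://claims.bigco.org": "http://api.bigco.org/claims-store/",
}

# Hypothetical data held by the claims store, keyed by (identity, service).
_GRANTS = {
    ("q1w2e3r4t5y6", "users"): {"read", "write"},
    ("q1w2e3r4t5y6", "products"): {"read"},
}


def claims_store_lookup(store_url: str, identity: str, service: str, action: str) -> dict:
    """Stand-in for an HTTP call to store_url; no network traffic happens here."""
    return {"allowed": action in _GRANTS.get((identity, service), set())}


def authorize(body: dict, service: str, action: str, lookup=claims_store_lookup) -> bool:
    """Send the identity plus local context to the claims store named in the token."""
    identity = body["http://identity.bigco.org"]
    store_url = body["http://claims.bigco.org"]
    decision = lookup(store_url, identity, service, action)
    return bool(decision.get("allowed", False))


print(authorize(token_body, "users", "write"))     # True
print(authorize(token_body, "products", "write"))  # False
```

Passing the lookup function as a parameter keeps the endpoint code unchanged when the stand-in is swapped for a real client.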

The key advantage of this approach is that as the number of services and/or grants grows, the size of the JWT does not need to change. This is especially handy in a microservice-style architecture in which the number of grants is often changing and can be difficult to predict at runtime. It is also helpful when the party authenticating the request identity is not the same as the party managing access control rights.

There are, however, downsides to this approach. First, each individual service endpoint needs to take on the responsibility of making a call to the claims store for each request. This will add traffic to your internal system. Also, because rights-checking now includes another network call, your access control system is more vulnerable to system outages than when you use a grants by value approach. You can use cached responses and other techniques to address this problem. I talk more about dealing with the network in Chapter 5.
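As one possible shape for the caching idea, the sketch below wraps a claims lookup in a small time-to-live cache and falls back to the last known answer if the claims store cannot be reached. The CachedClaims class, the 30-second default, and the choice to serve stale answers during an outage are all assumptions for illustration. Serving stale grants trades freshness for availability, so a revoked right could linger until the store is reachable again.

```python
import time


class CachedClaims:
    """Time-limited cache in front of a claims lookup function (hypothetical)."""

    def __init__(self, lookup, ttl_seconds=30.0, clock=time.monotonic):
        self._lookup = lookup      # callable(identity, resource) -> grant
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # (identity, resource) -> (stored_at, grant)

    def get(self, identity, resource):
        key = (identity, resource)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fresh entry: no network call needed
        try:
            value = self._lookup(identity, resource)
        except ConnectionError:
            if hit is not None:
                return hit[1]      # store unreachable: serve the stale answer
            raise
        self._entries[key] = (now, value)
        return value


calls = []


def slow_lookup(identity, resource):
    calls.append((identity, resource))  # count trips to the claims store
    return "admin"


cache = CachedClaims(slow_lookup)
cache.get("q1w2e3r4t5y6", "users")
cache.get("q1w2e3r4t5y6", "users")
print(len(calls))  # 1: the second request was answered from the cache
```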

Summary

In this chapter, we covered security basics, including API keys, authentication, authorization, and encryption. We also covered the notion of authentication risk scoring and mediating identity between systems. Finally, we focused on using JWTs to document and manage access control via two approaches: the grants by value and grants by reference models.

Now that we have the basic elements of securing API traffic covered, we can look at the ways in which you can use your API traffic platform to improve the scalability and reliability of all your services. And that’s the topic of the next chapter.
