Chapter 4. The Laws of Digital Identity

Solving the problems of digital identity discussed in the last chapter requires building something more abstract and general than the one-off, context-specific identity systems (the ones that give you an “ID”) that we find on the internet today. Almost every identity system you use is administrative, meaning it was created for the operator’s own administrative purposes. Being administrative, every one of these is different from every other one—giving you as many “IDs” as there are administrative systems to deal with. The world had a similar situation with networks in the 1980s, and we solved it with a metasystem, a system of other systems, that unified and transcended all the world’s separate, independent, and exclusive networks. This metasystem is called the internet.

The internet is a monument to abstraction and generality. Rather than being a communications system, like the telephone, the internet is a communications metasystem: that is, a system for building communications systems. Using encapsulating protocols—TCP and IP—that give everybody and everything a single and simple way to communicate across all those separate networks, the internet provides a unifying structure. This allows anyone to create whatever system they need by defining protocols. These new protocols, riding on top of TCP/IP, may be proprietary or open, special- or general-purpose. Every new protocol adds a new kind of message that the internet can communicate, changing and enriching its nature. Yet because they are built on a common protocol, they can serve a specific niche without sacrificing the underlying interoperability or modularity.

Similarly, an identity layer for the internet must also be a metasystem. An identity metasystem is a collection of interoperable identity systems. The identity metasystem provides the necessary building blocks and protocols for anyone to build an identity system meeting their specific needs that is interoperable with other identity systems similarly built. An identity metasystem is a prerequisite for an online world where identity is as natural as it is in the physical world. As we’ll see in later chapters, an identity metasystem removes friction, decreases cognitive overload, and makes online interactions more private and secure.

An Identity Metasystem

In 2005, Kim Cameron, Microsoft’s chief identity architect, published “The Laws of Identity,” a landmark paper laying out seven important principles for how digital identity should work. Cameron describes an identity metasystem that can provide the missing identity layer:

Different identity systems must exist in a metasystem. It implies we need a simple encapsulating protocol (a way of agreeing on and transporting things). We also need a way to surface information through a unified user experience that allows individuals and organizations to select appropriate identity providers and features as they go about their daily activities. The universal identity metasystem must not be another monolith. It must be polycentric (federation implies this) and also polymorphic (existing in different forms). This will allow the identity ecology to emerge, evolve and self-organize.

From Cameron’s description, we can identify six important features:

Encapsulating protocol
Protocols describe the rules for a set of interactions. Protocols are the foundation of interoperability and allow for scale. By defining how interactions happen, they mitigate the proximity problem. An encapsulating protocol allows other protocols to be defined on top of it. For example, the Internet Protocol (IP) is a protocol that encapsulates the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP). Thus, the encapsulating protocol enables a flexible set of interactions that can be adapted for specific contexts and needs and take place at a distance.
Unified user experience
Part of the beauty of the tacit nature of identity in the physical world is that we don’t have to switch apps or learn an entirely new way of interacting for each context. Traditionally, digital identity systems have not offered this kind of consistency. A unified user experience doesn’t mean a single user interface. Rather the focus is on the experience. Unified user experiences let people know what to expect. As a result, they can intuitively understand how to interact in any given context. Unified user experiences increase user autonomy, increase privacy, and support consent since users better understand the interaction.
User choice
By allowing people to select appropriate service providers and features, a metasystem allows for autonomy, anonymity, and flexibility. No single system can anticipate all the scenarios that come up as people live their lives. A metasystem allows context-specific scenarios to be built and can even support ad hoc interactions that no one anticipated.
Modular
An identity metasystem can’t be a single, centralized system with limited pieces and parts. Rather, the metasystem will have interchangeable components that are built and operated by various parties. Protocols and standards enable this. Modularity is a prerequisite for substitutability and choice, as is interoperability.
Polycentric (decentralized)
An identity metasystem is decentralized to enable autonomy and flexibility and to support better privacy. No single system can anticipate all the various relationships. And no single actor should be allowed to determine who uses the system or for what purposes. Furthermore, decentralization gives the metasystem the ability to scale as needed.
Polymorphic (different data schema)
The information human beings and systems need to recognize, remember, and react to various people, organizations, places, and things is context-dependent and varies widely from one situation to the next. The content carried by an identity metasystem must be flexible to support these varied interactions and support user autonomy.
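
To make the encapsulating and polymorphic features concrete, here is a minimal sketch in Python. It is not drawn from Cameron's paper or from any real identity protocol: the envelope format, the `encapsulate` and `deliver` functions, and the two inner protocol names are all hypothetical, chosen only to show one common envelope carrying payloads with different schemas.

```python
import json

def encapsulate(protocol: str, payload: dict) -> bytes:
    """Wrap a protocol-specific payload in a common envelope (hypothetical format)."""
    return json.dumps({"protocol": protocol, "payload": payload}).encode("utf-8")

def deliver(message: bytes, handlers: dict) -> object:
    """Route an envelope to whichever handler understands its inner protocol."""
    envelope = json.loads(message.decode("utf-8"))
    handler = handlers[envelope["protocol"]]
    return handler(envelope["payload"])

# Two made-up, context-specific protocols with different data schemas.
handlers = {
    "age-check/1": lambda p: "admit" if p["over_21"] else "refuse",
    "ticket/1": lambda p: f"seat {p['seat']} at {p['showtime']}",
}

age_result = deliver(encapsulate("age-check/1", {"over_21": True}), handlers)
ticket_result = deliver(
    encapsulate("ticket/1", {"seat": "F12", "showtime": "19:30"}), handlers
)
```

The transport code (`encapsulate` and `deliver`) never inspects the payload, so a new context-specific protocol can be added by registering another handler without changing the envelope.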

The internet is full of identity systems for specific contexts, designed to administer identity for a specific service or application. Over the past two decades, developers, security researchers, and identity experts have made numerous attempts to share identity information between different contexts. The success of these efforts has been limited to federating identity data between close partners or simple authentication efforts (such as Google Sign-in). None of them has developed into a unifying metasystem with these properties, because none of them can. Cameron’s Laws of Identity help us explore why that is so and give us important design principles to follow in building an identity metasystem.

The Laws of Identity

The Laws of Identity describe seven objectives that an identity system must meet to function as a metasystem. A system meeting these seven laws can be widely accepted and used in many contexts. Each law gives rise to architectural principles that guide the construction of the metasystem. Understanding the laws can help eliminate a lot of bad designs before architects and engineers waste too much time on them. They can also be used to evaluate real-world identity systems. Chapter 22 will discuss the concept of legitimacy and how the Laws of Identity, along with governance and policy for an identity system, provide a basis for how broadly the system is adopted.

In writing about these, Cameron uses the word law in the scientific sense of a hypothesis about the world resulting from observation that can be tested.1 Testing the laws involves using them to evaluate the successes and failures of identity systems. They are not propositions (proven from first principles) or axioms (self-evident truths). They are not legal or moral laws. Neither are they philosophical.

The sections that follow discuss the seven laws individually. Each starts with a statement of the law in italics, as defined in Cameron’s paper.

Minimal Disclosure for a Constrained Use

The solution which discloses the least amount of identifying information and best limits its use is the most stable long-term solution.

—Kim Cameron, “The Laws of Identity”

Hardly a week goes by without news of a data loss at a major organization that threatens the identity information of hundreds of thousands, even millions, of people. The bad news is that data breaches are a fact of life. We’ll likely never get rid of them completely. The good news is that the amount and sensitivity of the data lost is something we can control.

Organizations often overcollect data about people and store it on the premise that it might be needed at some point. Even the data they do need could be reduced with an identity metasystem that makes the “just in time” transfer of data less onerous.

An identity metasystem can’t stop organizations from overcollecting data or storing it beyond when it’s needed. That’s an organizational policy and, increasingly, a regulatory issue. But the metasystem can support minimal disclosure and make it simple to get data when it’s needed, making it easier to argue for collecting less data.

Different types of information can be more identifying or less identifying. For example, a Social Security number (SSN) is more identifying than a one-off, unique identifier. The less identifying a piece of information is, the less likely it is to identify an individual across several contexts. Identifying information need not be a single, highly correlatable identifier like an SSN; it could also be a collection of information that, taken together, can be used to identify an individual.

A good example is a system that needs to know a person’s age. Traditionally such systems ask the person for their birthday, but they really only need to know the age. A birthday, when combined with information like zip code and gender, is more likely to be uniquely identifying than an age. The identity metasystem can make answering questions even less identifying by supporting the ability to answer questions like “Is this person over 21?” instead of “What is this person’s age?” Asking questions in the form of predicates on attributes can significantly reduce the amount of information that is disclosed and how identifying that information is. There are orders of magnitude more people who are over 21 than there are who were born on any specific day.
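
The difference between these three levels of disclosure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample birthday, and the dictionary shapes are hypothetical, and the sketch says nothing about how a verifier would come to trust the answer (a real metasystem would need the predicate to be attested by a party the verifier trusts).

```python
from datetime import date

def age_on(birthday: date, today: date) -> int:
    """Whole years elapsed between birthday and today."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)

def is_over(birthday: date, years: int, today: date) -> bool:
    """Answer the predicate without revealing the birthday or the exact age."""
    return age_on(birthday, today) >= years

# Hypothetical data held by the person (or their agent), not by the verifier.
alice_birthday = date(1990, 6, 15)
today = date(2024, 6, 14)

full_disclosure = {"birthday": alice_birthday.isoformat()}         # most identifying
age_only = {"age": age_on(alice_birthday, today)}                  # less identifying
predicate_only = {"over_21": is_over(alice_birthday, 21, today)}   # least identifying
```

Only `predicate_only` would need to leave the person's control to satisfy an age check; the birthday and the exact age stay behind.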

Justifiable Parties

Digital identity systems must be designed so the disclosure of identifying information is limited to parties having a necessary and justifiable place in a given identity relationship.

—Kim Cameron, “The Laws of Identity”

Digital identity matters because people, organizations, and things need to have digital relationships with other people, organizations, and things. Clearly, everyone who is party to the relationship has a justifiable reason to know some things about others in the relationship. But not everything.

The law of minimal disclosure says that any information shared should be just what is needed and nothing more. The law of justifiable parties says these disclosures should be made only to entities who have a need to know. For example, suppose a dozen people are planning a party for Bob. If Alice needs to know Bob’s age, minimal disclosure says she shouldn’t ask him for his birthday. Justifiable parties says she should ask on a direct message channel, not in the group chat. Similarly, identity systems should be built so that only the parties who are involved in the transaction and have a need to know see the data that the system transmits.

With this law in mind, consider social login. When Alice logs into Bravo Corp’s site using Google, Facebook, Apple, or some other service, she visits the website for Bravo Corp, the relying party (RP), and is redirected to, say, Google, the identity provider (IdP), where she logs in. (For a refresher on these terms, see Chapter 2. For more on social login, see Chapter 13.) The IdP sends a cryptographic token back to the RP, indicating that Alice has provided the right credentials; it might also send back other identity data it has about Alice.

Do social logins break the law of justifiable parties? You could argue that the IdP is a justifiable party, since its authentication service is needed to complete the transaction. Clearly, Alice and the RP have consented to this arrangement, so we might consider that evidence that they see the IdP as a justifiable party.

Remember that one of the purposes of these laws is to inform identity system architectures and help us analyze where they might be effective and where they might be exploited, cause harm, or fail. In the social login scenario, the IdP doesn’t just see that Alice is logging into Bravo Corp: it also knows about everyone else who logs into Bravo Corp using its login service, and it sees the many other places where Alice logs in. Bravo Corp also learns something about Alice that it doesn’t necessarily need to know: that she has an account at Google. As a result, many people avoid social login and continue to use usernames and passwords where they can. They don’t want the social login companies surveilling them.
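
A toy model makes the side effect visible. The sketch below is an assumption-laden simplification, not how any particular social login service works: the class and function names are hypothetical, and the token is protected with an HMAC over a key shared by the IdP and the RP, whereas deployed systems typically use public-key signatures. What matters is the `login_log`, which stands in for what the IdP learns simply by being in the middle.

```python
import base64
import hashlib
import hmac
import json

class IdentityProvider:
    """Toy IdP: issues tokens and, as a side effect, sees every login."""

    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self.login_log = []  # (user, relying party) pairs the IdP observes

    def login(self, user: str, relying_party: str) -> str:
        self.login_log.append((user, relying_party))
        body = json.dumps({"sub": user, "aud": relying_party}, sort_keys=True).encode()
        tag = hmac.new(self._key, body, hashlib.sha256).digest()
        return (base64.urlsafe_b64encode(body).decode() + "."
                + base64.urlsafe_b64encode(tag).decode())

def verify(token: str, key: bytes, expected_rp: str) -> dict:
    """RP-side check: the token is intact and was issued for this RP."""
    body_b64, tag_b64 = token.split(".")
    body = base64.urlsafe_b64decode(body_b64)
    tag = base64.urlsafe_b64decode(tag_b64)
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad signature")
    claims = json.loads(body)
    if claims["aud"] != expected_rp:
        raise ValueError("token issued for a different relying party")
    return claims

key = b"demo key shared by IdP and RP (assumption)"
idp = IdentityProvider(key)
claims = verify(idp.login("alice", "bravo.example"), key, "bravo.example")
idp.login("alice", "acme.example")
# idp.login_log now records every site where Alice used this IdP.
```

Bravo Corp receives only the claims in the token, but the IdP accumulates a record of Alice's logins across every relying party that uses it.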

Still, given its popularity, it would seem that social login has succeeded. But its use is far from universal. Regulated financial services companies, for example, do not use social login. I don’t know all the reasons they might not want or be able to use it, but the foundational reason is that they don’t consider the social login companies to be justifiable parties to the relationship they want to have with their customers.

An important implication of justifiable parties is that identity systems should make participants aware of the parties in any identity exchange. Social login does that—clearly—by redirecting the subject to the IdP. Some federated identity systems do not. Instead, they transfer information about the subject behind the scenes. This is sometimes called the “phone home” problem because the RP connects to the IdP directly. Meaningful user control requires the transparency implicit in the law of justifiable parties.

You also can’t talk about justifiable parties without discussing ad networks, which, like almost everything online, are based on identity systems. At the heart of the ad network identity system is the cookie, a simple correlation identifier built into HTTP. A correlation identifier is a unique string that can be used to link requests. HTTP cookies are generated by the server and stored by the browser. Whenever the browser makes a request to the server, it sends back the cookie, allowing the server to correlate all requests from that browser. (Chapter 11 discusses cookies and correlation in greater depth.)

Consider how (simple) ad tracking works. When you see an ad on Acme Corp’s website, it’s being served from a server owned by an ad company that Acme Corp has an agreement with. The ad server plants a cookie in your browser. Now you visit Bravo Corp’s website, which includes ads from the same ad server. Your browser dutifully reports the ad server cookie back to the ad server along with the information that the ad was on Bravo’s website. The company running the ad server now knows you were on both sites (along with lots of other metadata). Rather than correlating requests on a single site, they are using cookies to correlate your activity across the web.
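
The mechanics described above can be simulated in a few lines of Python. This is a toy model under stated assumptions, not real browser or ad server behavior: the `AdServer` and `Browser` classes, the `ads.example` server name, and the site names are all hypothetical, and a return value stands in for the `Set-Cookie` response header.

```python
import secrets
from collections import defaultdict

class AdServer:
    """Toy third-party ad server that correlates visits using a cookie."""

    def __init__(self):
        self.visits = defaultdict(list)  # cookie value -> sites where ads were shown

    def serve_ad(self, cookie, embedding_site):
        # First contact: mint a unique correlation identifier for this browser.
        if cookie is None:
            cookie = secrets.token_hex(8)
        self.visits[cookie].append(embedding_site)
        return cookie  # stands in for a Set-Cookie response header

class Browser:
    def __init__(self):
        self.cookie_jar = {}  # server name -> cookie value

    def visit(self, site, ad_server, ad_server_name="ads.example"):
        # Loading the page triggers a request to the ad server, and the
        # browser sends back any cookie that server set earlier.
        cookie = self.cookie_jar.get(ad_server_name)
        self.cookie_jar[ad_server_name] = ad_server.serve_ad(cookie, site)

ads = AdServer()
browser = Browser()
browser.visit("acme.example", ads)
browser.visit("bravo.example", ads)

# One cookie now links this browser's activity on both sites.
tracked = list(ads.visits.values())
```

Neither Acme nor Bravo shared anything with the other; the correlation happens entirely at the ad server, keyed by the cookie the browser returns.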

To get a feel for how pervasive ad tracking is, I recommend spending a few minutes with the Fou Analytics Page X-Ray. Page X-Ray follows the cookies and trackers in a page to tell you more about how you’re being tracked. If you x-ray wired.com, for example, you’ll see a massive amount of data sharing by trackers, cascading and fanning out across at least five layers consisting of hundreds of other parties—nearly all of which are involved in showing ads to that visitor. All these parties likely believe that their involvement is justifiable and involves minimal disclosure for a constrained use, but many people would disagree and are increasingly concerned with the impact ad networks have on online privacy.

Directed Identity

A universal identity system must support both “omni-directional” identifiers for use by public entities and “unidirectional” identifiers for use by private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.

—Kim Cameron, “The Laws of Identity”

Identity systems depend on identifiers (discussed further in Chapter 10). Identifiers take many different forms, but the law of directed identity categorizes them into two types: omnidirectional and unidirectional. More commonly, we call these public identifiers and peer or private identifiers, respectively.

The value of a public identifier is that it is easily resolvable by anyone.2 Public identifiers should be invariant and well known. In fact, their permanence is a feature. Public identifiers are designed to make it easy to discover information about the entity to which the identifier is bound.

URLs are the most common form of public identifier. They are based on DNS domain names, another common type of public identifier. Phone numbers and email addresses, alas, are also public identifiers. They too are relatively permanent, and most people like that they’re invariant because of the huge hassle of informing all your contacts when they change.

On the other hand, the value of a peer identifier is that it is not public. Like any identifier, it still must be resolved to be used, but that resolution happens using some nonpublic system or method. Peer identifiers should not be reassignable, to avoid the confusion that can occur if they are reused, but they needn’t be permanent. Many will be ephemeral.

A username (if it’s not an email address) is an example of a peer identifier. You use it with a single site. Nothing requires that you use the same one everywhere; with a good password manager you could have a different username everywhere you go online. Of course, many sites will only let you use an email for a username.

Peer identifiers have significantly better privacy protection than public identifiers, since they don’t leak correlatable information in every transaction. Many of the biggest privacy problems the world faces are rooted in universal identifiers like SSNs, phone numbers, national ID numbers, and so on.

Using public-key infrastructure (PKI) certificates to secure web connections is another example of an identity system with public identifiers (further discussed in Chapter 9). PKI certificates link an identifier to a public key. In this case, the identifier is the domain name in the certificate. Used in this way, certificates have proven to be a very successful identity system for organizations and websites.

On the other hand, PKI-based certificates for people have been a miserable failure. The person possessing the private key can use the PKI certificate to authenticate at websites or log into remote machines. Early web standards envisioned using certificates for authenticating at websites. The Netscape browser and web server supported this functionality. The expense of getting PKI certificates, which was quite high at the time, likely contributed significantly to the failure of this effort. But there was also significant concern over the privacy implications of people having one permanent identifier that they used all over the web.

Browser cookies are an interesting case. As we saw above, ad networks exploit cookies to surveil people as they use the web. An unintended consequence of the way HTTP cross-domain references and cookies work is that while they were intended to be peer identifiers, they ended up serving as public identifiers.

Cryptographic identifiers, like public keys, can function as peer identifiers if a new key pair is created for every relationship. This might seem daunting, but software can manage the keys, and recent developments in decentralized identifiers make management of large numbers of peer cryptographic identifiers easier. This makes directed identity much more feasible. Chapter 9 discusses public-private keys in detail and Chapter 15 discusses cryptographic identifiers.
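
The idea of one identifier per relationship can be sketched without any key management at all. The example below deliberately swaps in a simpler technique than the one the text describes: instead of generating a new key pair for every relationship, it derives a pseudonymous identifier per relationship with an HMAC over a private master secret, using only Python's standard library. The function name and the relationship labels are hypothetical. The sketch illustrates unidirectionality only; unlike a public key, these identifiers cannot be used to verify signatures.

```python
import hashlib
import hmac
import secrets

def peer_identifier(master_secret: bytes, relationship: str) -> str:
    """Derive a per-relationship identifier.

    The result is stable within one relationship, but identifiers for
    different relationships cannot be linked without master_secret.
    """
    digest = hmac.new(master_secret, relationship.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:32]

# The master secret never leaves the person's own software (an assumption).
master_secret = secrets.token_bytes(32)

id_for_acme = peer_identifier(master_secret, "acme.example")
id_for_bravo = peer_identifier(master_secret, "bravo.example")
```

Acme and Bravo each see a stable identifier for the same person, yet comparing notes gives them no shared correlation handle.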

Identifiers seem simple at first, but implementing them correctly can be difficult, as the case of cookies shows. Identifier design can have a big impact on the usability, privacy, and flexibility of an identity system. An identity metasystem must support both public (omnidirectional) and peer (unidirectional) identifiers. In short, the law of directed identity tells us that the identity metasystem can’t use a single, universal identifier.

Pluralism of Operators and Technologies

A universal identity system must channel and enable the inter-working of multiple identity technologies run by multiple identity providers.

—Kim Cameron, “The Laws of Identity”

The law of pluralism of operators and technologies tells us we need more than one identity system. The world is full of identity systems, each built for a specific context and purpose: Cameron refers to this as the “identity ecology.”

At first this law might seem inconsistent with other laws, especially the law of consistent experience across contexts, which will be introduced later in this chapter. If users must have control and a consistent experience, irrespective of the identity context, doesn’t that imply a ubiquitous and pervasive system? Resolving this dilemma requires that you understand the relationship between the identity metasystem and the identity systems built on it.

Recall that a metasystem has an encapsulating protocol upon which other protocols can be built. Further, the metasystem is decentralized and polymorphic, meaning it can carry different kinds of data. The identity metasystem provides a stable, universal base for building identity systems. Because it satisfies the Laws of Identity, the systems built on top of it satisfy them as well.

Passports, driver’s licenses, national ID cards, employee badges, business licenses, credit cards, and professional licenses are all unique identity systems designed for a specific context to achieve a specific purpose. Thinking you could replace all of these with a single universal identity system would be ridiculous. But they all achieve user control and a consistent user experience because they use an underlying metasystem of sorts: the way credentials work in the physical world.

But we can go further than that. A movie ticket is an identity system. No, it doesn’t identify who you are, but it does identify what you are: one of N people allowed to occupy a seat in a specific theater at a specific time. In this view, any venue ticketing system is an identity system. So are prescriptions, invoices, receipts, and systems for titling cars and land. Each is designed to identify someone or something and convey some right or record some transaction. And all of them use a common, underlying pattern to provide a consistent experience and user control.

Most organizations, even small businesses, design and deploy identity systems—even if they don’t recognize that’s what they’re doing. Most of them are not digital. But as the internet comes to mediate more and more of our lives, many of them will be. An identity metasystem must support them all.

Human Integration

The universal identity metasystem must define the human user to be a component of the distributed system integrated through unambiguous human-machine communication mechanisms offering protection against identity attacks.

—Kim Cameron, “The Laws of Identity”

When I started the Internet Identity Workshop with Kaliya Hamlin and Doc Searls in 2005, the theme we chose was “user-centric identity.” This was a shift; in the preceding years, identity discourse had primarily focused on enterprises and their internal needs, as organizations of all stripes felt the need to build identity solutions for their specific contexts. The term user-centric indicated a design philosophy that would swing the pendulum back in the other direction, integrating people and their needs with the identity systems they used.

Phishing attacks, fraud, complexity, and friction are the results of not considering how humans participate in an identity solution. Take phishing, for example. In a phishing attack, the intruder poses as a legitimate organization, application, or website to steal authentication factors such as usernames and passwords (see Chapter 11). Phishing doesn’t attack the technical infrastructure of the identity system; it attacks the people using it. Phishing can happen over email, voice, Short Message Service (SMS), page hijacking, and even calendars. Quick response (QR) codes are sometimes used too: in these attacks, a legitimate QR code is simply stickered over with a fraudulent one.

QR code phishing is a good example of a common phishing technique: link manipulation. The manipulated links might be in web pages, emails, or SMS messages, but the idea is always to make a link look legitimate so that the target is fooled into clicking on it. The link usually leads to a page designed to look like the real thing but built with some nefarious intent, like stealing a password, credit card numbers, or other personal information. Another kind of attack is social engineering, where phishers trick targets into thinking they need to take some action like revealing a password, passing on an access code, or even transferring funds.

Bad links, fake web pages, and con artists may not seem like the stuff of identity, but they happen because designs for identity systems often end at the computer screen and ignore the human component. The law of human integration says that designers need to extend their designs to consider how, when, and where people use identity.

As an example of where human-integrated design can mitigate this problem, consider web authentication. The usernames and passwords used for authentication are a primary attack vector in phishing. Web authentication reestablishes a session between the user’s browser and the site over and over again. This constant need to reestablish sessions is confusing; most people view it as a complex process standing between them and what they want to do. That makes it a weak point that phishers can exploit. Identity systems could be designed to counter this by creating a mutually authenticated connection that is difficult for attackers to intercept. This takes a tricky and error-prone task away from human users and replaces it with a task that is easier to understand.

Human integration requires profoundly changing how people experience identity systems, making those systems predictable and unambiguous enough to support informed decisions. In short, the design of identity systems must take people into account to provide good experiences.

Consistent Experience Across Contexts

The unifying identity metasystem must guarantee its users a simple, consistent experience while enabling separation of contexts through multiple operators and technologies.

—Kim Cameron, “The Laws of Identity”

The law of consistent experience across contexts says that people’s experiences should be consistent from context to context. Providing a consistent user experience makes up for the digital world’s lack of tacit knowledge by allowing people to build up routines and muscle memory in its place. Designing a great user experience for one identity context is insufficient if you’re then using a good but completely different one for a different identity context.

One of the most familiar examples of consistent user experience is the automobile. My grandfather, who died in 1955, could get in a 2022-model car and safely drive it with only a little instruction. The user experience (not just the interface) for a car is largely the same as it was 70 years ago. There are other examples, including email, the windowed user interface, and even the venerable QWERTY keyboard.

One of the underappreciated features of web browsers is the consistent user experience that they provide. Tabs, address bars, the back button, reloading, and other features are largely the same regardless of which browser you use. There’s a reason why “Don’t break the back button!” has been common advice for web designers over the years. People depend on the web’s consistent user experience.

Alas, apps have changed all that by freeing developers from the strictures of the web. No doubt there have been some excellent uses of this freedom, but what we’ve lost is consistency in core user experiences. That’s unfortunate. Moreover, the web—and the internet, for that matter—has never had a consistent user experience for authentication. (At least not one that has caught on.) Consequently, the user experience is very fragmented.

Anyone familiar with the modern world of websites and applications knows the subtle frustration of performing a slightly different authentication ceremony for each site or application you use. The username and password input boxes are in different places, perhaps behind a “Log In” button. The password box might not appear until the username is input. The rules around acceptable password length and characters can be maddeningly complex. The site might use multifactor authentication (MFA), but there’s no consistency there: my phone has five MFA apps installed that I use regularly, in addition to sites that use SMS or email for MFA. And this is just for authentication.

You may not have thought about it as an identity-system design issue before, but anytime a website or application asks for information like your personal profile information, addresses, or even credit card information, you are transferring attributes—identity data. Each of these interactions is different for every website and application. Even different applications from the same company often do it differently. Password managers have taken some of the sting out of these problems, but they are still frustrating. Worse, inconsistent user experience is the source of much of the fraud that is rife online.

The law of consistent experience across contexts is closely tied to the other laws and the metasystem, which must provide a single way for people to establish safe channels with other people, organizations, and things. The metasystem’s encapsulating protocol provides a consistent method for requesting, selecting, and proffering identity information. Even though millions of individual identity systems might be built on top of the identity metasystem, the user experience in each is consistent because the metasystem is responsible for establishing safe channels where any kind of identity information can be exchanged.

There’s a saying in security: “Don’t roll your own crypto.” I think we need a corollary in identity: “Don’t roll your own interface.” A consistent user experience helps ensure that consent is unambiguous and that the user knows which parties are participating in the exchange.

Fixing the Problems of Identity

An identity metasystem provides three primary capabilities that allow it to be used as the basis for building context-specific identity systems:

Relationships
The metasystem provides a means for people, organizations, and things to have relationships with each other. These relationships are mutually authenticated, secure, and as private as possible for the use case.
Secure messaging
The metasystem supports secure messaging between the parties to support relationships and allow them to confidently conduct identity transactions.
Trustworthy claim exchange
The metasystem provides the means for parties that have relationships in the metasystem to use messaging to exchange polymorphic claims (messages about attributes) reliably, confidently, and securely.

Appropriately designed, a metasystem with these properties can conform to the seven laws and ensure that any identity system built on it does as well. An identity metasystem with the properties described above and designed to be consistent with the Laws of Identity provides the means to fix the problems of identity described in Chapter 3.

Let’s look at how:

Proximity
Secure claim exchange over a mutually authenticated channel provides digital relationships that mitigate the problems caused when connections are distant.
Autonomy
A metasystem that conforms to the laws of user control, minimal disclosure, and justifiable parties gives participants autonomy. It establishes boundaries, allows each participant to create and manage secure relationships with other participants in the metasystem, and ensures that data is shared by choice.
Flexibility
A metasystem that is decentralized, polymorphic, and modular ensures that people and organizations can use the metasystem to build whatever context-specific identity system they need.
Consent
A metasystem that conforms to the laws of user control and consent, justifiable parties, human integration, and consistent experience ensures that people unambiguously know what they are sharing and with whom.
Privacy
A metasystem that provides secure, mutually authenticated relationships and conforms to the laws of minimal disclosure and directed identity provides the means for reducing correlation across contexts and minimizing the amount of data that is shared.
Anonymity
A metasystem that supports trustworthy claim exchange and conforms to the laws of minimal disclosure and directed identity can create ephemeral relationships and share needed data without revealing who is participating in a permanent way.
Interoperability
A metasystem with an encapsulating protocol and conforming to the laws of pluralism of operators and technologies and consistent experience across contexts allows people to share claims outside of a specific use case. Identity systems built on the metasystem interoperate through consistent technology and user experience.
Scale
A metasystem that is decentralized and built on an encapsulating protocol scales by supporting millions of identity systems for different contexts and allowing anyone to build the identity system they need without sacrificing security, privacy, or user experience.

The coming chapters will explore concepts that lie at the core of digital identity, technologies that underpin the implementation of identity systems, and architectures that conform to the laws of identity. Along the way, I’ll also discuss existing identity protocols, standards, and systems and evaluate them with respect to the laws.

1 He also sometimes joked that by calling them “laws” he could outmaneuver lawyers and risk managers, who had an inherent respect for something called a law.

2 Resolve may seem like a funny word to use with identifiers. I’m using it as a general term because what you do with any given identifier depends a great deal on the context in which it is used.
