Designing Evolvable Web APIs with ASP.NET

Chapter 6. Media Type Selection and Design

A good contract lets friends stay friends.

It is common to hear developers explain how REST-based systems are better than alternatives for building Web APIs because they are simpler. The lack of contracts, such as WSDL (Web Service Description Language), is often cited as one of the reasons for the simplicity.

However, when building distributed systems, you cannot avoid coming to some kind of prearranged agreement between components. Without some form of shared knowledge, those components cannot interact in a meaningful way.

This chapter is about the types of contracts used in web architecture, the process of selecting the best contracts to meet our needs, and identifying where it’s necessary to create new contracts.

Self-Description

One of the key concepts around designing contracts is self-description. Ideally, a message should contain all the information the recipient needs to understand the intent of the sender, or at least provide references to where the necessary information can be found.

Imagine you receive a letter in the mail that has the numbers 43.03384,–71.07338 written on it. That letter provides you with all the data that you need to achieve a very specific goal, but you are missing all of the context required to do anything useful with it. If I were to tell you that the pair of numbers were latitude and longitude coordinates, then you would have some understanding of their meaning. Obviously, I am assuming that you either already understand that coordinate system or are capable of searching for information on how to use it. Self-descriptive does not mean that the message needs to actually include a description of the longitude/latitude system. It simply needs to reference it in some way.

Knowing that the information included in the letter is coordinates is only half the story. What are you supposed to do with those coordinates? More context is needed for you to know what you should do with that information. If you see that the letter’s return address is “Great Places To Drink, PO Box 2000, Nevada,” you have a hint that maybe the coordinates are the location of a recommended pub.

Types of Contracts

In the web world, media types are used to convey what a resource represents, and a link relation suggests why you should care about that resource. These are the contracts that we can use when building evolvable systems. These pieces of the architecture represent the shared knowledge. These are the parts of the system that we have to be careful to design well, because when they change, they could break components that depend on them.

Media Types

Media types are platform-independent types designed for communication between distributed systems. Media types carry information. How that information is represented is defined in a written specification.

Unfortunately, the potential for media types is woefully underused in the world of Web APIs. The vast majority of Web APIs limit their support to application/xml and application/json. These two media types have very little capability to carry meaningful semantics and often lead people to use out-of-band knowledge to interpret them. Out-of-band knowledge is the opposite of self-descriptive. To return to our letter example, if the “Great Places To Drink” company were to tell you ahead of your receiving the letter that the numbers written on your letter would be geographic coordinates, that would be considered out-of-band knowledge. The information needed to interpret the data is communicated in some way separate from the message itself. In the case of generic types like application/xml and application/json, it requires us to communicate the semantics of the message in some other way. Depending on how we do that, it can make evolving a system much more difficult because it requires communicating changes to the system in that out-of-band manner. When out-of-band knowledge is used, clients assume that the server will send certain content and therefore they are unable to automatically adapt if a server returns something different. The result is that server behavior becomes locked down by the existence of clients. This can become a major inhibitor of change.

Primitive Formats

This section includes examples that show how the same set of data is communicated through different media types.

The media type application/octet-stream, shown in Example 6-1, is about as basic a media type as you can get. It is simply a stream of bytes. User agents that receive this media type usually cannot do anything with the payload other than allow the user to save the bytes in a file. There are no application semantics defined at all.

Example 6-1. A stream of bytes

 GET /some-mystery-resource
 200 OK
 Content-Type: application/octet-stream
 Content-Length: 20

 00 3b 00 00 00 0d 00 01 00 11 00 1e 00 08 01 6d 00 03 FF FF

In Example 6-2, the media type text/plain tells us that the content can be safely rendered directly to an end user, who will be able to read the data. In its current form, the example body does not provide any hints as to what the data is for; however, there is nothing to stop a server from including a paragraph of prose in the body with an explanation of the information.

Example 6-2. Human readable

 GET /some-mystery-resource
 200 OK
 Content-Type: text/plain
 Content-Length: 29

 59,0,13,1,17,30,8,365,3,65535
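
The gap between Examples 6-1 and 6-2 can be made concrete with a short sketch. It assumes, purely as out-of-band knowledge that neither media type supplies, that the payload of Example 6-1 is a sequence of big-endian unsigned 16-bit integers. Under that assumption the bytes decode to the same numbers that Example 6-2 carries as text.

```python
import struct

# Payload of Example 6-1, copied from the hex dump.
payload = bytes.fromhex("00 3b 00 00 00 0d 00 01 00 11 00 1e 00 08 01 6d 00 03 FF FF")

# Assumption (not stated by application/octet-stream):
# the bytes are big-endian unsigned 16-bit integers.
count = len(payload) // 2
values = struct.unpack(f">{count}H", payload)

# Render the numbers the way Example 6-2 does.
as_text = ",".join(str(v) for v in values)
print(as_text)
```

A client that guessed a different byte order or integer width would still decode the payload without any error and simply get different numbers, which is why an undocumented assumption of this kind is a form of hidden coupling.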

In the last 50 years of computing, we have played distributed ping-pong with the semantics of our business applications. In the days of the mainframe, the server knew everything about your data and the client was a terminal that knew nothing more than how to display characters and detect keypresses.

In the 80s and early 90s, with the rise of personal computers and local area networks, the clients became king. Shared data was still stored on a file server, but as far as the server was concerned it was just dealing with files, rows, columns, and indexes. All the intelligence was on the client.

Toward the end of the 90s client/server databases gained popularity due to the fact that PC-based networked applications were being stretched to the limits of their extremely chatty architecture, and client/server databases promised huge scalability improvements.

Client/server databases had limited success in driving rich client applications—not because of any technical problem but because many developers had insufficient training and tried to apply the techniques used for ISAM (indexed sequential access method) databases on client/server databases. This problem was exacerbated by vendors pushing client/server databases as a “drop-in replacement” to achieve scalability.

The new millennium saw the rise of the web application. Web applications worked well because they moved the application workflow and business logic onto a server that lives close to the data. This addressed the problem of chatty PC-based networks and avoided some of the concurrency problems that were tricky to handle using client/server databases.

In recent years, JavaScript has gone from augmenting HTML-based web experiences to creating and controlling web experiences. We are seeing a trend of moving application workflow and logic back onto the client but within the runtime environment of a web browser.

If we are to move logic back to the client, it is important we understand why this has failed in the past, so as to avoid repeating the mistakes of our predecessors.

Hypermedia-enabled media types are a very important part of this critical architectural decision because they are able to carry both workflow and application semantics across the wire and allow a more intelligent distribution of workload between client and server.

In Example 6-3, the media type text/csv provides some structure to the information being returned. The data model is defined as a set of comma-separated values that are then broken down into lines of (usually) structurally similar data. We still have no idea what the data is, but we could at least format the data for presentation to a user, assuming the user knows what she is looking at.

Example 6-3. Simply structured data

 GET /some-mystery-resource
 200 OK
 Content-Type: text/csv
 Content-Length: 29

 59,0
 13,1
 17,30
 8,365
 3,65535
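
As a rough sketch of what text/csv buys a generic client, the body of Example 6-3 can be split into rows and cells without knowing what the numbers mean. Converting the cells to integers is an assumption made here for illustration; the format itself says nothing about data types.

```python
import csv
import io

# Body of Example 6-3.
body = "59,0\n13,1\n17,30\n8,365\n3,65535"

# The media type defines rows and comma-separated cells, nothing more.
# Treating every cell as an integer is this sketch's own assumption.
rows = [[int(cell) for cell in row] for row in csv.reader(io.StringIO(body))]

# A generic client can at least lay the data out as a table.
for row in rows:
    print("\t".join(str(cell) for cell in row))
```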

Popular Formats

Consider Example 6-4.

Example 6-4. Markup

 GET /some-mystery-resource
 200 OK
 Content-Type: application/xml
 Content-Length: ??

 <root>
        <element attribute1="59" attribute2="0"/>
        <element attribute1="13" attribute2="1"/>
        <element attribute1="17" attribute2="30"/>
        <element attribute1="8" attribute2="365"/>
        <element attribute1="3" attribute2="65535"/>
 </root>

In this particular case, returning the content as XML did not add any more semantics than the text/csv format did. We still just have five pairs of numbers, although XML does provide a place to name the pieces of data. However, the meaning of those names is not defined in the specification for application/xml, so any client that tries to assign meaning to those names is depending on out-of-band knowledge and therefore introducing hidden coupling. Later in this chapter, we will discuss other ways to layer semantics on top of generic media types without creating hidden coupling.
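
The hidden coupling can be sketched in a few lines. The client below reads the attribute names used in Example 6-4, names whose meaning application/xml does not define. The renamed document at the end is a hypothetical server change, included only to show how quietly such a client breaks.

```python
import xml.etree.ElementTree as ET

# Body of Example 6-4.
body = """<root>
  <element attribute1="59" attribute2="0"/>
  <element attribute1="13" attribute2="1"/>
  <element attribute1="17" attribute2="30"/>
  <element attribute1="8" attribute2="365"/>
  <element attribute1="3" attribute2="65535"/>
</root>"""

root = ET.fromstring(body)

# Hidden coupling: the names "element", "attribute1", and "attribute2"
# are known to this client only through out-of-band agreement.
pairs = [
    (int(e.get("attribute1")), int(e.get("attribute2")))
    for e in root.findall("element")
]
print(pairs)

# Hypothetical server change: the attributes are renamed.
# The document is still valid application/xml, but the client's lookup
# now returns None instead of a value.
renamed = ET.fromstring('<root><element percent="59" days="0"/></root>')
missing = renamed.find("element").get("attribute1")
print(missing)
```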

For more complex scenarios, application/xml can be useful to represent hierarchies of data and allow blocks of text to be marked up with additional data. However, we still have the problem that application/xml provides limited ways to assign semantics.

As far as communicating semantics, application/json (shown in Example 6-5) has even less capability than application/xml. The advantage of consuming JSON within a web browser environment is that we can download JavaScript code that can apply semantics to the document. This allows clients and servers to evolve simultaneously, but it has the disadvantage of limiting clients to those that can support a JavaScript runtime. This also impacts the ability for intermediary components to interact with the message, thereby limiting the benefits of the HTTP layered architecture.

Example 6-5. Object serialization

 GET /some-mystery-resource
 200 OK
 Content-Type: application/json
 Content-Length: ??

 { "objects" : [
        {"property1" : "59", "property2" : "0"},
        {"property1" : "13", "property2" : "1"},
        {"property1" : "17", "property2" : "30"},
        {"property1" : "8", "property2" : "365"},
        {"property1" : "3", "property2" : "65535"}
    ]
 }

If generic media types are at one end of a continuum of media types, then the next example is at the opposite end. In this case, we have defined a new media type that is specific to our particular application and has exactly the semantics that the server understands. To deliver this content, we need to write a specification for the media type, make it publicly available on the Internet, and preferably register the media type with IANA, so that it can easily be found by a developer who wishes to understand the meaning of the representation he just received.

New Formats

Now let’s consider Example 6-6.

Example 6-6. Service-specific format

 GET /some-mystery-resource
 200 OK
 Content-Type: application/vnd.acme.cache-stats+xml
 Content-Length: ??

 <cacheStats>
        <cacheMaxAge percent="59" daysLowerLimit="0" daysUpperLimit="0"/>
        <cacheMaxAge percent="13" daysLowerLimit="0" daysUpperLimit="1"/>
        <cacheMaxAge percent="17" daysLowerLimit="1" daysUpperLimit="30"/>
        <cacheMaxAge percent="8" daysLowerLimit="30" daysUpperLimit="365"/>
        <cacheMaxAge percent="3" daysLowerLimit="365" daysUpperLimit="65535"/>
 </cacheStats>

This media type finally conveys that the data we have been dealing with is a series of data points for a graph that shows the frequency distribution of max-age values in the Cache-Control header of requests on the Internet. This media type provides all the information a client needs for rendering a graph of this information. However, the applicability of this media type is fairly specific. How often is someone going to write an application that needs to render a graph of caching statistics? The idea of writing a specification and submitting that specification to IANA for registration seems like overkill. The vast majority of today’s Web APIs create these narrowly focused payloads but just don’t bother with the specification and registration part of the process. There are alternatives, though, that provide all the information needed by the client, and yet can be applicable to far more scenarios.

Consider the scenario in Example 6-7.

Example 6-7. Domain-specific format

 GET /some-mystery-resource
 200 OK
 Content-Type: application/data-series+xml
 Content-Length: ??

 <series        xAxisType="range"
                        yAxisType="percent"
                        title="% of requests with their max-age value in days">
        <dataPoint yValue="59" xLowerValue="0" xUpperValue="0"/>
        <dataPoint yValue="13" xLowerValue="0" xUpperValue="1"/>
        <dataPoint yValue="17" xLowerValue="1" xUpperValue="30"/>
        <dataPoint yValue="8" xLowerValue="30" xUpperValue="365"/>
        <dataPoint yValue="3" xLowerValue="365" xUpperValue="65535"/>
 </series>

In this case, we have created a media type whose purpose is to deliver a series of data points that can be used to plot a graph. It could be a line graph, a pie chart, a histogram, or even just a table of data. The client can understand the semantics of the data points from the perspective of drawing a graph. It doesn’t know what the graph is about; that is left for the human consumer to appreciate. However, the additional semantics allow the client to do things like overlay graphs, switch axes, and zoom to portions of the graph.

The reusability of this media type is vastly higher than that of application/vnd.acme.cache-stats+xml. Any application scenario where there is some data to be graphed could make use of this media type. The time and effort put into writing a specification to completely describe this format could quickly pay off.
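
One way a client might take advantage of such a type is to select its parser from the Content-Type header alone. The following is a minimal sketch, assuming the illustrative application/data-series+xml format of Example 6-7 (which is not a registered media type); the function names and the handler table are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_data_series(body):
    """Turn a data-series document into a plain dictionary a charting layer could use."""
    root = ET.fromstring(body)
    points = [
        {
            "y": float(p.get("yValue")),
            "x_low": float(p.get("xLowerValue")),
            "x_high": float(p.get("xUpperValue")),
        }
        for p in root.findall("dataPoint")
    ]
    return {
        "title": root.get("title"),
        "x_axis": root.get("xAxisType"),
        "y_axis": root.get("yAxisType"),
        "points": points,
    }


# Hypothetical handler table: media type -> parser.
HANDLERS = {"application/data-series+xml": parse_data_series}


def handle(content_type, body):
    # Ignore parameters such as charset when selecting a handler.
    media_type = content_type.split(";")[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        raise ValueError(f"no handler registered for {media_type}")
    return handler(body)


body = """<series xAxisType="range" yAxisType="percent"
        title="% of requests with their max-age value in days">
  <dataPoint yValue="59" xLowerValue="0" xUpperValue="0"/>
  <dataPoint yValue="13" xLowerValue="0" xUpperValue="1"/>
  <dataPoint yValue="17" xLowerValue="1" xUpperValue="30"/>
  <dataPoint yValue="8" xLowerValue="30" xUpperValue="365"/>
  <dataPoint yValue="3" xLowerValue="365" xUpperValue="65535"/>
</series>"""

series = handle("application/data-series+xml; charset=utf-8", body)
print(series["title"], len(series["points"]))
```

Nothing in the handler knows about cache statistics, so the same code could serve any resource that returns this media type.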

It is my opinion that this kind of domain-specific, but not service-specific, media type is the optimal balance of semantics that media types should convey. There are a few examples of this sort of media type that have proven quite successful:

HTML was conceived as a way of communicating hyperlinked textual documents.
Atom was designed as a way to enable syndication of web-based blogs.
ActivityStream is a way of representing streams of events.
Json-home is designed to enable discovery of resources made available in an API.
Json-problem is a media type designed to provide details on errors returned from an API.

All of these media type examples have several things in common. They have semantics that are intended to solve a specific problem, but they are not specific to any particular application. Every API needs to return error information; most applications have areas where there is some stream of events. These media types are defined in a completely platform- and language-agnostic way, which means that they can be used by every developer in any type of application. There remain many opportunities for defining new media types that would be reusable across many applications.

Hypermedia Types

Hypermedia types are a class of media types that are usually text based and contain links to other resources. By providing links within representations, user agents are able to navigate from one representation to another based on their understanding of the meaning of the link.

Hypermedia types play a huge role in decoupling clients from servers. Through hypermedia, clients no longer need to have preexisting knowledge of resources exposed on the Web. Resources can be discovered at runtime.

Despite the obvious benefits of hypermedia in HTML for web applications, hypermedia has so far played a very minor role in the development of Web APIs. Web application developers have tended to avoid hypermedia due to lack of tooling, a perception that links create unnecessary bloat in the size of representations, and a general lack of appreciation of its benefits.

There are scenarios where the cost of hypermedia cannot be justified—for example, when performance is absolutely critical. For performance-critical systems, the HTTP protocol is probably not the best choice either. When evolvability is a key goal in an HTTP-based system, hypermedia cannot be ignored.

Media Type Explosion

So far we have seen how generic media types require out-of-band knowledge to provide semantics, and we have seen examples of more specific media types that carry domain semantics. Some members of the web development community are reluctant to encourage the creation of new media types. There is a fear of an “explosion of media types”—that the creation of a large number of new media types would produce badly designed specifications, duplicated efforts, and service-specific types, and any possibility of serendipitous reuse would be severely hindered. It is not an unfounded fear, but it’s likely that, just as with evolution, the strong would survive and the weak would have little impact.

Generic Media Types and Profiles

There is another approach to media types that is favored by some. Its basic premise is to use a more generic media type and use a secondary mechanism to layer semantics onto the representation.

One example of this is a media type called Resource Description Framework (RDF). To dramatically oversimplify, RDF is a media type that allows you to make statements about things using triples, where a triple consists of a subject, an object, and a predicate that describes the relationship between the subject and object. A significant portion of the semantics of an RDF representation is provided by standardized vocabularies of predicates that have documented meanings. The RDF specification provides the way to relate pieces of data together but does not actually define any domain semantics itself.

In Example 6-8, taken from the RDF Wikipedia entry, the URI http://purl.org/dc/elements/1.1/ refers to a vocabulary defined by the Dublin Core Metadata Initiative.

Example 6-8. RDF Example

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
        <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
                <dc:title>Tony Benn</dc:title>
                <dc:publisher>Wikipedia</dc:publisher>
        </rdf:Description>
</rdf:RDF>
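
To make the triple structure visible, the sketch below pulls subject, predicate, and object out of Example 6-8. It is a deliberately naive reading of this one document using a general XML parser, not an RDF parser, and it assumes every child of rdf:Description is a simple literal-valued property.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Document from Example 6-8.
doc = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(doc)
triples = []
for description in root.findall(f"{{{RDF_NS}}}Description"):
    subject = description.get(f"{{{RDF_NS}}}about")
    for prop in description:
        # ElementTree reports tags as "{namespace}local"; joining the two
        # parts gives the predicate URI from the vocabulary.
        namespace, local = prop.tag[1:].split("}", 1)
        triples.append((subject, namespace + local, prop.text))

for triple in triples:
    print(triple)
```

The predicates come out as URIs in the Dublin Core vocabulary, which is where their documented meaning lives.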

Another example of layering semantics is the use of application-level profile semantics (ALPS). ALPS is a method of specifying domain semantics that can then be applied to a base media type like XHTML, as shown in Example 6-9. The recently ratified link relation profile is a way of attaching these kinds of additional semantic specifications to existing media types.

Example 6-9. ALPS over XHTML

 GET /some-mystery-resource
 200 OK
 Content-Type: application/xhtml+xml
 Content-Length: ??

 <html>
        <head>
                <title>% of requests with their cache-control: max-age value in days</title>
                <link rel="profile" href="http://example.org/profiles/stats" />
        </head>
        <body>
                <table class="data-series">
                        <thead>
                                <tr>
                                        <td>from</td>
                                        <td>to (days)</td>
                                        <td>percent</td>
                                </tr>
                        </thead>
                        <tr class="data-point">
                                <td class="xLowerValue"></td>
                                <td class="xUpperValue">0</td>
                                <td class="yValue">59</td>
                        </tr>
                        <tr class="data-point">
                                <td class="xLowerValue">0</td>
                                <td class="xUpperValue">1</td>
                                <td class="yValue">13</td>
                        </tr>
                        <tr class="data-point">
                                <td class="xLowerValue">1</td>
                                <td class="xUpperValue">30</td>
                                <td class="yValue">17</td>
                        </tr>
                        <tr class="data-point">
                                <td class="xLowerValue">30</td>
                                <td class="xUpperValue">365</td>
                                <td class="yValue">8</td>
                        </tr>
                        <tr class="data-point">
                                <td class="xLowerValue">365</td>
                                <td class="xUpperValue"></td>
                                <td class="yValue">3</td>
                        </tr>
                </table>
        </body>
 </html>


GET http://example.org/profiles/stats
200 OK
Content-Type: application/alps+xml

<alps version="1.0">
    <doc format="text">
        Types to support the domain of statistical data
    </doc>

   <descriptor id="data-series" type="semantic">
        <descriptor id="data-point" type="semantic">
         <doc>A data point</doc>
                <descriptor id="xValue" type="semantic">
                        <doc>X value on graph</doc>
                </descriptor>
                <descriptor id="xLowerValue" type="semantic">
                        <doc>Lower bound on X range of values</doc>
                </descriptor>
                <descriptor id="xUpperValue" type="semantic">
                        <doc>Upper bound on X range of values</doc>
                </descriptor>
                <descriptor id="yValue" type="semantic" >
                        <doc>Y value on graph</doc>
                </descriptor>
                <descriptor id="yLowerValue" type="semantic">
                        <doc>Lower bound on Y range of values</doc>
                </descriptor>
                <descriptor id="yUpperValue" type="semantic">
                        <doc>Upper bound on Y range of values</doc>
                </descriptor>
                </descriptor>
   </descriptor>


</alps>
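
A client that understands this profile might use it along the following lines. This is a sketch under several assumptions: the XHTML is well formed and carries no XML namespace, class attributes hold a single value, and the client has hardcoded the profile URI it recognizes. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Profile URI from Example 6-9 that this client claims to understand.
STATS_PROFILE = "http://example.org/profiles/stats"

# Shortened, well-formed variant of the XHTML body in Example 6-9.
body = """<html>
  <head>
    <title>% of requests with their cache-control: max-age value in days</title>
    <link rel="profile" href="http://example.org/profiles/stats" />
  </head>
  <body>
    <table class="data-series">
      <tr class="data-point">
        <td class="xLowerValue">0</td>
        <td class="xUpperValue">1</td>
        <td class="yValue">13</td>
      </tr>
      <tr class="data-point">
        <td class="xLowerValue">1</td>
        <td class="xUpperValue">30</td>
        <td class="yValue">17</td>
      </tr>
    </table>
  </body>
</html>"""


def declared_profiles(root):
    """Collect the targets of every link with rel="profile"."""
    return [link.get("href") for link in root.iter("link") if link.get("rel") == "profile"]


def read_data_points(root):
    """Map each data-point row to {descriptor id: text}, keyed by the class names."""
    points = []
    for row in root.iter("tr"):
        if row.get("class") != "data-point":
            continue
        points.append({td.get("class"): (td.text or "") for td in row.iter("td")})
    return points


root = ET.fromstring(body)
# Only apply the profile's semantics when the document declares the profile.
points = read_data_points(root) if STATS_PROFILE in declared_profiles(root) else []
print(points)
```

A client that does not know the profile can still render the page as ordinary XHTML, which is the appeal of layering semantics on a widely supported base type.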

The Hypermedia Application Language (HAL), demonstrated in Example 6-10, is a generic media type that uses link relations as a way to apply domain semantics.

Example 6-10. HAL in both application/hal+xml and application/hal+json

<resource       xAxisType="range"
                        yAxisType="percent"
                        title="% of requests with their max-age value in days">
        <resource       rel="http://example.org/stats/data-point"
                                yValue="59"
                                xLowerValue="0"
                                xUpperValue="0"/>
        <resource       rel="http://example.org/stats/data-point"
                                yValue="13"
                                xLowerValue="0"
                                xUpperValue="1"/>
        <resource       rel="http://example.org/stats/data-point"
                                yValue="17"
                                xLowerValue="1"
                                xUpperValue="30"/>
        <resource       rel="http://example.org/stats/data-point"
                                yValue="8"
                                xLowerValue="30"
                                xUpperValue="365"/>
        <resource       rel="http://example.org/stats/data-point"
                                yValue="3"
                                xLowerValue="365"
                                xUpperValue="65535"/>
 </resource>

 {
        "xAxisType" : "range",
        "yAxisType" : "percent",
        "title" : "% of requests with their max-age value in days",
        "_embedded" : {
                "http://example.org/stats/data-point" : [
                        { "yValue" : "59", "xLowerValue" : "0", "xUpperValue" : "0"},
                        { "yValue" : "13", "xLowerValue" : "0", "xUpperValue" : "1"},
                        { "yValue" : "17", "xLowerValue" : "1", "xUpperValue" : "30"},
                        { "yValue" : "8", "xLowerValue" : "30", "xUpperValue" : "365"},
                        { "yValue" : "3", "xLowerValue" : "365", "xUpperValue" : "65535"}
                ]
        }
 }

HAL relies on link relations to provide the semantics needed to interpret the non-HAL parts of the representation. This means that in the documentation for the link relation type http://example.org/stats/data-point, the meaning of yValue, xLowerValue, and xUpperValue needs to be defined. HAL doesn’t care whether these values are specified as attributes or elements; it is up to the consumer of the HAL document to discover where the information is stored.
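
A consumer of the JSON variant might read the embedded data points as sketched below. The sketch assumes that resources sharing a relation are carried as an array under that relation's URI inside _embedded, and that the property names mean what the (hypothetical) documentation for http://example.org/stats/data-point says they mean. The body is shortened to two points.

```python
import json

DATA_POINT_REL = "http://example.org/stats/data-point"

body = """{
  "xAxisType": "range",
  "yAxisType": "percent",
  "title": "% of requests with their max-age value in days",
  "_embedded": {
    "http://example.org/stats/data-point": [
      {"yValue": "59", "xLowerValue": "0", "xUpperValue": "0"},
      {"yValue": "13", "xLowerValue": "0", "xUpperValue": "1"}
    ]
  }
}"""


def embedded(resource, rel):
    """Return the resources embedded under a link relation, always as a list."""
    items = resource.get("_embedded", {}).get(rel, [])
    # A relation may hold a single object or an array; normalize to a list.
    return items if isinstance(items, list) else [items]


doc = json.loads(body)

# The property names are interpreted according to the link relation's
# documentation, not according to anything HAL itself defines.
points = [
    (int(p["xLowerValue"]), int(p["xUpperValue"]), int(p["yValue"]))
    for p in embedded(doc, DATA_POINT_REL)
]
print(points)
```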

One challenge with using the link relation to communicate semantics is that entry point URIs often do not have link relations. When you type a link into a browser address bar, there is no link relation. There are a couple of workarounds for this. You can keep the root resource limited to just embedded resources and links, or you can use a link with the link relation "type" to associate semantics with the root resource.

The advantage of using a generic media type is that tooling to generate, parse, and render those formats likely already exists and can be reused. It also means that you can define semantic profiles that have mappings to multiple different base media types. This can be advantageous when one particular media type is more suitable for a particular platform. When defining domain-specific media types, if it is desirable to support both XML and JSON variants, you must write two specifications because the format and semantics are both defined by the media type.

Efforts are under way to try to formalize the process of describing a semantic profile, and it is likely that there will be multiple viable approaches.

The disadvantage of using the generic media type combined with a secondary semantic profile is that the semantics of the message are less visible to intermediary components. Media types specified in the Content-Type header can easily be processed by any layer in the HTTP architecture. Using techniques like link relations and profiles to attach semantics makes it more difficult for intermediaries to discover that information and make processing decisions based on it. As is commonly the case in web architecture, there is not only one way to do things. There are advantages and disadvantages, and systems must be engineered to meet the requirements.

The use of a secondary semantic profile is definitely an interesting space to watch, and I look forward to a future where there are more prescriptive solutions to defining the semantics of a message.

From these examples, you can see there are many ways you can communicate the same data between the client and server. Some media types carry more semantics, some less. How much application-specific semantics you use to drive your client depends on the availability of existing standard media types that suit your needs and your tolerance for coupling.

Other Hypermedia Types

The last few years have seen a number of new hypermedia types introduced. The following sections summarize a few examples.

Collection+Json

The Collection+Json media type addresses the domain of lists of things. It is interesting in that it is generic and specific at the same time. It specifically supports only lists of things, but it does not care what the list of things is. It also has interesting semantic affordances that can describe how to query the list and add new items to it. Although it has minimal semantics for the items in the list, it does support the notion of profiles for describing the list items.
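
The query and add-item affordances can be sketched as follows. The document is a hand-written illustration in the shape of a Collection+JSON response (a collection with items, queries, and a template); the URLs, field names, and helper functions are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical collection document, already parsed from JSON.
collection = {
    "collection": {
        "version": "1.0",
        "href": "http://example.org/friends/",
        "items": [
            {
                "href": "http://example.org/friends/jdoe",
                "data": [{"name": "full-name", "value": "J. Doe", "prompt": "Full Name"}],
            }
        ],
        "queries": [
            {
                "rel": "search",
                "href": "http://example.org/friends/search",
                "prompt": "Search",
                "data": [{"name": "search", "value": ""}],
            }
        ],
        "template": {"data": [{"name": "full-name", "value": "", "prompt": "Full Name"}]},
    }
}


def build_query_url(doc, rel, values):
    """Fill in the query identified by rel and return the URL to GET."""
    for query in doc["collection"].get("queries", []):
        if query["rel"] == rel:
            params = {
                d["name"]: values.get(d["name"], d.get("value", ""))
                for d in query.get("data", [])
            }
            return query["href"] + "?" + urlencode(params)
    raise KeyError(rel)


def fill_template(doc, values):
    """Build the body a client would send to add a new item to the list."""
    fields = doc["collection"]["template"]["data"]
    return {
        "template": {
            "data": [
                {"name": f["name"], "value": values.get(f["name"], f.get("value", ""))}
                for f in fields
            ]
        }
    }


search_url = build_query_url(collection, "search", {"search": "doe"})
new_item = fill_template(collection, {"full-name": "Jane Roe"})
print(search_url)
print(new_item)
```

The client code never mentions friends; it only knows about lists, queries, and templates, which is the sense in which the type is generic and specific at once.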

Siren

Siren is another fairly new hypermedia type that is similar to HAL in that it is effective in representing nested data structures and embedded resources. It differs from HAL in the way it attaches semantics to data elements. Siren borrows the notion of a class from HTML as a way to identify semantic information. It also makes a distinction between links that are for navigating between resources and those that represent behaviors. The action links also have their own style of link hints that describe to the client how to invoke the action.
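
The distinction between navigation links and actions might look like this in client code. The entity below is a hypothetical order resource written in the general shape of a Siren document (class, properties, actions, links); the URLs, action name, and helper functions are assumptions for illustration.

```python
from urllib.parse import urlencode

# Hypothetical Siren-style entity, already parsed from JSON.
entity = {
    "class": ["order"],
    "properties": {"orderNumber": 42, "status": "pending"},
    "actions": [
        {
            "name": "add-item",
            "method": "POST",
            "href": "http://example.org/orders/42/items",
            "type": "application/x-www-form-urlencoded",
            "fields": [
                {"name": "productCode", "type": "text"},
                {"name": "quantity", "type": "number"},
            ],
        }
    ],
    "links": [{"rel": ["self"], "href": "http://example.org/orders/42"}],
}


def find_link(entity, rel):
    """Navigation: return the target of the first link carrying the relation."""
    for link in entity.get("links", []):
        if rel in link["rel"]:
            return link["href"]
    return None


def prepare_action(entity, name, values):
    """Behavior: describe the request the named action asks the client to make."""
    for action in entity.get("actions", []):
        if action["name"] == name:
            allowed = {field["name"] for field in action.get("fields", [])}
            unknown = set(values) - allowed
            if unknown:
                raise ValueError(f"fields not offered by the action: {sorted(unknown)}")
            return {
                "method": action.get("method", "GET"),
                "url": action["href"],
                "content_type": action.get("type", "application/x-www-form-urlencoded"),
                "body": urlencode(values),
            }
    raise KeyError(name)


self_url = find_link(entity, "self")
request = prepare_action(entity, "add-item", {"productCode": "ABC123", "quantity": 2})
print(self_url)
print(request)
```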

Although some people argue that we should all just standardize on a single format, I would rather let natural selection take its course than try to force-fit a single hypermedia format into every scenario. HTTP makes it easy for APIs to support multiple representations of a resource, and clients can pick the one they prefer, so having multiple formats in use is not a major hindrance to progress.
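
Server-side selection among several hypermedia formats can be sketched with a small Accept header parser. This is a simplified illustration, not a complete implementation of HTTP content negotiation: it honors q values and wildcards but ignores media type parameters and specificity rules. The function names are hypothetical.

```python
def parse_accept(header):
    """Return (media_type, q) pairs, highest preference first."""
    preferences = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        preferences.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(preferences, key=lambda pref: pref[1], reverse=True)


def choose(accept_header, available):
    """Pick the first available representation the client is willing to accept."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in available:
            if media_type in ("*/*", candidate):
                return candidate
            if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
                return candidate
    return None


available = [
    "application/vnd.collection+json",
    "application/hal+json",
    "application/vnd.siren+json",
]

chosen = choose("application/vnd.siren+json;q=0.8, application/hal+json", available)
print(chosen)
```

When nothing matches, the sketch returns None; a real server would typically answer with 406 Not Acceptable or fall back to a default representation.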

Up to this point, we have talked primarily about media types as a way of communicating semantics, but as we hinted when discussing HAL, semantics can also be communicated via link relations. The next section will discuss this further.

Link Relation Types

In the introduction to this section on contract types, I noted that link relation types suggest why you might be interested in a particular resource.

Link relation types were first introduced in HTML. The most common link relation type is probably the rel="stylesheet" value, which connects an HTML page to the stylesheet used to help render it:

<link href="..." media="all" rel="stylesheet" type="text/css" />

More recently, there has been an effort to fully specify what a link relation type really is. The results of this effort can be found in RFC 5988.
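
RFC 5988 also defines a Link HTTP header, so link relations are not limited to HTML markup. The sketch below reads a simplified subset of that header's syntax with regular expressions; it is not a full implementation of the grammar, and the example URLs are hypothetical.

```python
import re

# One link-value: <target> followed by any number of ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w\-*]+\s*=\s*(?:"[^"]*"|[^;,\s]*))*)')
# One parameter, with either a quoted or an unquoted value.
PARAM_RE = re.compile(r';\s*([\w\-*]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')


def parse_link_header(value):
    """Return a list of dictionaries, one per link in the header value."""
    links = []
    for target, params in LINK_RE.findall(value):
        link = {"href": target}
        for name, quoted, bare in PARAM_RE.findall(params):
            link[name.lower()] = quoted or bare
        # A single rel parameter may list several space-separated relation types.
        link["rel"] = link.get("rel", "").split()
        links.append(link)
    return links


header = (
    '<http://example.org/journal?page=2>; rel="next", '
    '<http://example.org/journal?page=1>; rel="prev first"; title="Page one"'
)
links = parse_link_header(header)
for link in links:
    print(link)
```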

Semantics

In the same way that media types can be very generic about the semantics they communicate, so can link relations. There are some standardized link relation types—like next, previous, current, item, collection, first, and last—that are very generic. However, they do provide important contextual information about the related resource. On the other hand, there are some standardized link relations that have very specific usages. Examples of these are help, monitor, payment, license, copyright, and terms-of-service.

Reviewing the standard registry of link relations only hints at the power of link relations. Most of the standard link relations do not define any behavior or constraints; they simply identify the target resource or describe the relationship between the context resource and the target resource.

If you spend any amount of time reading about the world of web and Internet specifications, you will quickly learn there are lots of politics involved in the process. I will warn you that I am biased in favor of the Internet Engineering Task Force (IETF) as the keeper of Internet-related specifications. The IETF defers to IANA for registries, which is why I refer to the IANA media type registry and IANA link relation registry. However, there are other organizations that are unhappy with the IETF/IANA registration procedures and have chosen to use an alternate registry for link relations.

A few link relations have begun to suggest we can do more with them than just identify a relation. Consider noreferrer and prefetch. The noreferrer relation tells the user agent that if it follows the link, it should not send the Referer header. The prefetch relation tells the user agent that it should retrieve the target resource prior to the user requesting to follow the link so that the representation will already be in the cache. In these cases, the link relation is actually instructing the user agent about how the server would like it to interact with the link. These instructions can go much further. A link relation type specification could indicate that only the POST method should be used with a particular link, or that when the link is followed, the returned representation will always be application/json.

Instead of defining interaction mechanisms in a specification, some people prefer to embed metadata into links to describe to the client how to interact with the link. For example, an HTML <form> element has a method attribute that indicates to the browser whether to use the GET or POST method. A variation on this is the use of link hints, which allow a server to embed metadata that suggests how a link might be used. It does not preclude there being other valid ways of using the link.

All these approaches are valid ways to communicate link mechanics to a client. Using embedded metadata with a fairly general link relation is the most reusable approach but requires the most bytes over the wire. It also potentially reduces the number of link relations of which a client needs to be aware. Using a precise link relation with all the details of the interaction specified in the documentation is more bandwidth-friendly but puts more requirements on the client application.
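The trade-off between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the REL_RULES table, the method_for function, and the "method" property on a link are hypothetical names, not part of any specification discussed here.

```python
# Interaction rules a client author might take from link relation
# documentation. The table and its single entry are hypothetical.
REL_RULES = {
    "http://example.org/rels/open": {"method": "POST"},
}


def method_for(link):
    """Pick the HTTP method for a link.

    Order of precedence: a hint embedded in the link itself, then a rule
    documented for the relation type, then GET as the default.
    """
    hinted = link.get("method")
    if hinted:
        return hinted.upper()
    return REL_RULES.get(link["rel"], {}).get("method", "GET")


print(method_for({"rel": "item", "href": "/orders/1"}))
print(method_for({"rel": "http://example.org/rels/open",
                  "href": "/deviceApi/sessions"}))
print(method_for({"rel": "edit", "href": "/orders/1", "method": "put"}))
```

With embedded metadata the client needs no entry in its rules table but every link carries extra bytes; with a precise relation the link stays small but the client must ship with the rule.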

If we take this idea of a precise link relation to an extreme, a link relation address could require the use of the GET method and return application/json with properties street, city, state, country, and zipcode. The more constrained the link relation the less reusable it becomes, and it is highly unlikely that such a link relation would be accepted by the subject matter experts who manage the registry. However, there is the notion of extended link relation types, where you use a URI as an identifier to uniquely identify the relation. When using an extended link relation type, you are not required to register the link relation with IANA, and you are free to do what makes sense to you.

The interesting effect of using a link relation to precisely describe the expected response is that it allows the use of generic types like application/xml and application/json to convey data without depending on out-of-band knowledge.

However, my personal experience has been that dividing the semantics between link relations and media types more evenly produces more reusable contract types.

Link relations and media types work together in a manner similar to how adjectives and nouns work in language. The adjective happy can be used in combination with many nouns. By combining the independent adjective and noun, we can avoid the need for an explosion of nouns such as happydog, happycow, happyfish, and so on.

{ "collection" :
  {
    "version" : "1.0",
    "href" : "http://example.org/journal/?fromDate=20130921&toDate=20130922",
    "items" : [
      {
        "href" : "http://example.org/transaction/794",
        "data" : [
          {"amount" : "14576", "currency" : "USD", "date" : "20130921"}
        ],
        "links" : [
          {"rel" : "origin", "href" : "http://examples.org/account/bank1000"},
          {"rel" : "destination",
           "href" : "http://examples.org/account/payables/HawaiiTravel"}
        ]
      },
      {
        "href" : "http://example.org/transaction/794",
        "data" : [
          {"amount" : "150000", "currency" : "USD", "date" : "20130922"}
        ],
        "links" : [
          {"rel" : "origin",
           "href" : "http://examples.org/account/receivables/acme"},
          {"rel" : "destination",
           "href" : "http://examples.org/account/bank1000"}
        ]
      }
    ]
  }
}

As an example, imagine that we are to register two link relation types, origin and destination. There are many scenarios in which we need to represent where something has come from and where it is going—whether it is a file copy, a bank transfer, or a route on a map. The same link relations can be reused in many different scenarios. Sometimes the semantics of these reusable relations are sufficient to implement generic functionality on the client regardless of what the links may be pointing to. This is very similar to the way polymorphism works in object-oriented development.
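To make the polymorphism analogy concrete, here is a minimal Python sketch of client logic that depends only on the origin and destination relations, not on what the links point to. The helper names and the file copy item are hypothetical; the bank transfer item is taken from the journal example.

```python
def links_by_rel(item):
    """Index an item's links by relation type (one href per rel, for brevity)."""
    return {link["rel"]: link["href"] for link in item.get("links", [])}


def describe_movement(item):
    """Return (origin, destination) hrefs, or None if the pair is incomplete."""
    rels = links_by_rel(item)
    if "origin" in rels and "destination" in rels:
        return rels["origin"], rels["destination"]
    return None


# Two unrelated domains handled by one piece of client logic.
bank_transfer = {
    "href": "http://example.org/transaction/794",
    "links": [
        {"rel": "origin", "href": "http://examples.org/account/bank1000"},
        {"rel": "destination",
         "href": "http://examples.org/account/payables/HawaiiTravel"},
    ],
}
file_copy = {  # hypothetical item from a different API
    "href": "http://example.org/copy/1",
    "links": [
        {"rel": "origin", "href": "http://example.org/files/a.txt"},
        {"rel": "destination", "href": "http://example.org/files/b.txt"},
    ],
}

for item in (bank_transfer, file_copy):
    print(describe_movement(item))
```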

Usually when people think of links in hypermedia documents, they think about defining relationships between domain concepts. This is the primary use of links in the field of linked data. However, it is possible to use links for much more than making declarations about the relationships in our domain.

Replacing Embedded Resources

In the world of HTML, we are used to creating links to images, scripts, and stylesheets; however, in APIs it is uncommon to see these kinds of static resources exposed. With careful use of client-side private caching, an API can efficiently expose static resources for all kinds of information that would normally be embedded into a client application.

Indirection Layer

Links are used as a way of providing a layer of indirection. By creating discovery documents at API entry points, we can enable clients to dynamically identify the location of certain resources without having to hardcode URIs into the client application (see Example 6-11).

Example 6-11. A GitHub discovery resource

GET https://api.github.com/
{
        "current_user_url":"https://api.github.com/user",
        "authorizations_url":"https://api.github.com/authorizations",
        "emails_url":"https://api.github.com/user/emails",
        "emojis_url":"https://api.github.com/emojis",
        "events_url":"https://api.github.com/events",
        ...
        "user_search_url":"https://api.github.com/legacy/user/search/{keyword}"
}

This layer of indirection allows a server to reorganize its URI structure without clients needing to make changes. Imagine that GitHub wanted to allow searching for users by location. It could make the following change:

"user_search_url":"https://api.github.com/legacy/user/search{?keyword,country}"

Assuming the client was following the rules of URI Template (RFC 6570) token replacement, there would be no client changes necessary to adopt this new URI format.
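A minimal sketch of that token replacement, assuming a client that needs only the simple {var} form and the {?a,b} query form of RFC 6570, might look like the following. The expand function is a hypothetical helper that ignores the other operators in the specification, so it is not a substitute for a full URI Template library.

```python
import re
from urllib.parse import quote

_EXPR = re.compile(r"\{([^{}]+)\}")


def expand(template, **values):
    """Expand the {var} and {?a,b} forms of RFC 6570; nothing else is handled."""

    def replace(match):
        expr = match.group(1)
        if expr.startswith("?"):
            # Form-style query expansion: undefined variables are skipped.
            names = expr[1:].split(",")
            pairs = [f"{n}={quote(str(values[n]), safe='')}"
                     for n in names if values.get(n) is not None]
            return "?" + "&".join(pairs) if pairs else ""
        # Simple string expansion.
        names = expr.split(",")
        return ",".join(quote(str(values[n]), safe="")
                        for n in names if values.get(n) is not None)

    return _EXPR.sub(replace, template)


old = "https://api.github.com/legacy/user/search/{keyword}"
new = "https://api.github.com/legacy/user/search{?keyword,country}"

# The calling code is identical for both versions of the template.
print(expand(old, keyword="octocat"))
print(expand(new, keyword="octocat"))
```

Because the client supplies the values it knows and lets the template decide where they go, the move from a path segment to a query string requires no client change, and the country variable is simply left out until the client learns to provide it.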

We can also use indirection to provide a type of intelligent load balancing. If certain resources are putting a disproportionate amount of load on a server, the URIs could be changed to point to alternative servers with additional capacity. This can be useful when different types of requests create very different types of workloads.

Indirection can also be useful for finding geolocated resources. Using client location based on IP address, response representations can contain links to servers that are geographically close. This can be very important because network latency can be a significant factor over long distances. A round trip from a client on the West Coast of the US to a server on the East Coast will likely take on the order of 80 milliseconds. When numerous resources must be retrieved to render a user interface, this delay can quickly become very noticeable to an end user.

Reference Data

It is common in data entry user interfaces to provide the user a list of options to select from. These lists don’t need to be predefined in a client application, and in fact a client doesn’t even need to know in advance what list needs to be associated with a particular input field. By annotating an input field with a link to a list of options, a client application can generically identify the valid list of items without any knowledge of the input domain.

For example:

<InputForm>
  <Street></Street>
  <City></City>
  <Province domainUrl="http://api.example.org/lists/provinces?country=CAN"/>
  <Country>Canada</Country>
</InputForm>

HTML forms achieve a similar goal by embedding the entire list into the input element, but that is not a particularly efficient approach. Links can be used as a way to reduce payload size. Sometimes certain pieces of information are required less often and can be moved off into a distinct resource. You can often offset the extra cost of the second roundtrip with the savings from not retrieving the additional information when it is not needed.

A second reason for splitting resources is when the volatility of the data is very different. Retrieving a representation of sensor readings from a device that includes all the device configuration details is wasteful because the sensor readings would likely change far more often than the device configuration. Adding a link to the device configuration enables us to retrieve device configuration information only when needed.

The third reason for splitting resources is to enable reuse. This is the case in our address/province list example. The list of provinces is the same for any Canadian address. If a user is entering many addresses, then being able to use a client cached copy of the list can be very effective.
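The reuse argument can be sketched as follows, assuming a form shaped like the address example and a client that keeps fetched lists in memory. OptionListCache and fake_fetch are hypothetical names, and the province list is abbreviated, illustrative data; a real client would issue an HTTP GET and honor the caching headers on the response.

```python
import xml.etree.ElementTree as ET

FORM = """
<InputForm>
  <Street></Street>
  <City></City>
  <Province domainUrl="http://api.example.org/lists/provinces?country=CAN"/>
  <Country>Canada</Country>
</InputForm>
"""


class OptionListCache:
    """Fetch each option list at most once, keyed by its URL."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable: url -> list of options
        self._cache = {}

    def options_for(self, field):
        url = field.get("domainUrl")
        if url is None:
            return None              # free-text field, no list to offer
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


calls = []


def fake_fetch(url):
    """Stand-in for an HTTP GET; records each call so reuse is visible."""
    calls.append(url)
    return ["AB", "BC", "ON", "QC"]


cache = OptionListCache(fake_fetch)
for _ in range(3):                   # three addresses entered in a row
    form = ET.fromstring(FORM)
    for field in form:
        options = cache.options_for(field)
        if options is not None:
            print(field.tag, options)

print(len(calls))                    # the list was fetched only once
```

Note that the client never mentions provinces: it reacts to any field that carries a link to a list.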

Workflow

Probably one of the most distinctive features of REST-based systems is the way that application workflow is defined and communicated. In RPC-based systems, the client must understand the application interaction protocol. For example, it must know that it has to call open before it calls send and close when it has finished. These rules have to be baked into the clients, and any dynamic detection of state must be built ad hoc.

Links, embedded into representations, can be used to instruct clients of the valid interactions based on state. A client must still be aware of the types of interactions that exist in order to use them, but it no longer has the burden of knowing when it is allowed to make a certain type of request.

Consider Example 6-12, a version of the same scenario using hypermedia.

Example 6-12. Using hypermedia to define workflow

GET /deviceApi
200 OK
Content-Type:  application/hal+xml

<resource>
        <link rel="http://example.org/rels/open" href="/deviceApi/sessions"/>
</resource>

POST /deviceApi/sessions
Content-Length: 0

201 Created Session
Content-Type:  application/hal+xml
Location: http://example.org/deviceApi/session/1435

<resource>
  <link rel="http://example.org/rels/send" href="/deviceApi/session/1435{?message}"/>
  <link rel="http://example.org/rels/close" href="/deviceApi/session/1435"/>
</resource>

DELETE /deviceApi/session/1435
200 OK

The client still needs to understand the open/send/close link relations, but the server can guide the client through the process. The client must know that to activate the open link, it should send a POST with no body, and to activate the close link it must use the DELETE method. In this example the responses use HAL, but there is no reason the server could not return more than one hypermedia type. The link relations do not need to constrain the returned media type. It is necessary, however, that the client understand at least one of the media types returned by the server.

If a client is designed to understand that links may be dynamic, then it can adapt to changes in the workflow. For example, it could use an algorithm that first looks for the send link; if it does not find one, it looks for an open link, follows it, and looks once again for a send link. With this approach, if a later version of the API does not require the open/close dance, then the initial /deviceApi representation can be changed to immediately include the send link. The client would automatically adapt to the new protocol and continue working without change.
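That lookup algorithm can be sketched in Python. To keep the sketch self-contained, representations are modeled as plain dictionaries from relation type to href and the HTTP calls are passed in as functions; find_send_link, get_links, and follow_open are hypothetical names, and the version 2 URI is invented for illustration.

```python
OPEN = "http://example.org/rels/open"
SEND = "http://example.org/rels/send"


def find_send_link(get_links, follow_open, entry="/deviceApi"):
    """Return the href of a send link, opening a session only if required.

    get_links(url)    -> dict mapping rel to href for the representation at url
    follow_open(href) -> dict of links in the response to activating open
    """
    links = get_links(entry)
    if SEND in links:                 # send is offered immediately
        return links[SEND]
    if OPEN in links:                 # a session must be opened first
        links = follow_open(links[OPEN])
        if SEND in links:
            return links[SEND]
    raise LookupError("no send link available in the current state")


# Version 1 of the API requires open before send.
v1_entry = {OPEN: "/deviceApi/sessions"}
v1_session = {SEND: "/deviceApi/session/1435{?message}"}
# A later version drops the open/close dance (hypothetical URI).
v2_entry = {SEND: "/deviceApi/messages{?message}"}

print(find_send_link(lambda url: v1_entry, lambda href: v1_session))
print(find_send_link(lambda url: v2_entry, lambda href: {}))
```

The same function serves both versions because it asks the representation what is possible instead of assuming a fixed call sequence.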

This is an extremely simple example. More complex applications have more complex interaction protocols and more opportunities to take advantage of this dynamic workflow capability.

Syntax

RFC 5988 also specifies the format for embedding links into HTTP headers so that even with binary content like images and video, you can still include hypermedia with the returned representation. What is not specified, however, is how links should be embedded into other media types. The way links are serialized must be specified by the media type specification itself. This is another reason why using media types like application/json and application/xml can be problematic, as they don’t define how links should be represented. There have been a number of different conventions used, but without a hard spec it is difficult to write reusable code to parse links. It would seem like a trivial thing to define, but I’ve seen people debate for hours over whether a JSON object should be called links or _links.

Here are some examples of link syntax:

application/hal+json

 "_links": {
        "self": { "href": "/orders" },
        "next": { "href": "/orders?page=2" },
        "find": {
            "href": "/orders{?id}",
            "templated": true
        },
        "admin": [{
            "href": "/admins/2",
            "title": "Fred"
        }]
    },

application/collection+json
  "links" : [
          {"rel" : "blog", "href" : "http://examples.org/blogs/jdoe",
                "prompt" : "Blog"},
          {"rel" : "avatar", "href" : "http://examples.org/images/jdoe",
                "prompt" : "Avatar", "render" : "image"}
        ]

application/vnd.github.v3+json

"assignee": {
      "login": "octocat",
      "id": 1,
      "avatar_url": "https://github.com/images/error/octocat_happy.gif",
      "gravatar_id": "somehexcode",
      "url": "https://api.github.com/users/octocat"
    }

application/hal+xml
<link rel="admin" href="/admins/5" title="Kate" />

application/atom+xml
<link href="http://www.example.org/data/q1w2e3r4" rel="related" hreflang="en" />
<collection href="http://example.org/blog/main" />
<content src="http://www.example.org/blog-posts/123" />
<icon>http://www.example.org/images/icon</icon>


text/html
 <link  rel="stylesheet" type="text/css"
        href="http://cdn2.sstatic.net/stackoverflow/all.css?v=c9b143e6d693">

 <a href="/faq">faq</a>

<form id="search" action="/search" method="get" autocomplete="off">
        <div>
            <input      autocomplete="off" name="q" class="textbox"
                        placeholder="search" tabindex="1" type="text"
                        maxlength="240" size="28" value="">
        </div>
</form>

As you can see from these examples, links can take on many shapes in hypermedia representations. In the coming years, I hope we will see some more convergence on styles to remove some of the cosmetic differences. It is worth noting that although many of these examples do not have a rel attribute, they have the notion of a link relation type. In the case of HTML, an <a> tag could just as easily have been represented as:

  <link rel="a" href="/faq"/>

The same argument can be made for representing a <FORM> tag as:

<link rel="form" id="search" action="/search" method="get" autocomplete="off">
        <div>
            <input      autocomplete="off" name="q" class="textbox"
                        placeholder="search" tabindex="1" type="text"
                        maxlength="240" size="28" value="">
        </div>
</link>

The two styles simply demonstrate two different ways of identifying the link relation semantics. They also introduce an interesting workaround to one problem some people have with link relations as defined by RFC 5988. In order to use a simple token like rel="destination", you must register this relationship with IANA, which means there will be a review by domain experts. The intent of this registry is to encourage the development of link relations that are suitable for use across many different media types. As mentioned earlier, you can use the notion of extended link relation types and create a link relation that uses a URI. However, URIs can be long and noisy in representations. If you want to create a link relation that is specific to your particular media type, then you can choose to use a serialization such as:

<family>
  <mother href="/people/bob"/>
  <father href="/people/mary"/>
</family>

By making the link relation type an integral part of the media type syntax, you are explicitly stating that the link relation is defined only within this media type and you can avoid the URI naming requirement of extended link relation types.

There is another interesting property of link relations. Links can have multiple relations assigned to them. For example:

<link rel="first previous" href="/foo" />
<link rel="nofollow noreferrer" href="/bar" />

Be aware that this capability is simply a serialization optimization; it does not allow the behavior of the links to be combined. RFC 5988 says:

Relation types SHOULD NOT infer any additional semantics based upon the presence or absence of another link relation type, or its own cardinality of occurrence.

Semantically, there is no difference between the previous example and the following:

<link rel="first" href="/foo" />
<link rel="previous" href="/foo" />

<link rel="nofollow" href="/bar" />
<link rel="noreferrer" href="/bar" />
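The equivalence can be illustrated with a small parser for the Link header syntax that RFC 5988 defines, which yields one entry per relation type no matter how the relations were grouped on the wire. This is a simplified sketch: parse_link_header is a hypothetical helper built on regular expressions, and it does not attempt to cover every corner of the header grammar.

```python
import re

# One link: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def parse_link_header(value):
    """Return (href, rel, params) triples, one per relation type."""
    result = []
    for href, raw_params in _LINK.findall(value):
        params = {}
        for name, quoted, bare in _PARAM.findall(raw_params):
            # Only the first occurrence of a parameter is kept.
            params.setdefault(name.lower(), quoted or bare)
        for rel in params.pop("rel", "").split():
            result.append((href, rel, dict(params)))
    return result


combined = '</foo>; rel="first previous", </bar>; rel="nofollow noreferrer"'
separate = ('</foo>; rel="first", </foo>; rel="previous", '
            '</bar>; rel="nofollow", </bar>; rel="noreferrer"')

print(parse_link_header(combined) == parse_link_header(separate))
```

Both serializations produce the same four (target, relation) pairs, which is all a client should rely on.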

RFC 5988 also defines a number of other metadata properties of a link that can provide information to the user agent on how links should be processed. Instead of being specified in a written document, the instructions can be embedded in the representation of the link:

<link href="..." rel="related" title="More info...." hreflang="en"
      type="text/plain" >

These attributes are simply hints to the user agent; they do not guarantee the server will provide compliant representations.

A Perfect Combination

Link relation types and media types are the peanut butter and jelly of the distributed application world. Link relation types bind together resource representations to create a complete application that allows users to achieve their goals. These contract types work best when your semantics are evenly spread between them to encourage serendipitous reuse.

Designing a New Media Type Contract

When trying to identify the best media type to use for a particular resource, you should always first look for standard media types. Creating media types can be challenging, and although standard media types may not always be an exact fit, they may be capable of carrying enough semantics to allow the client to achieve the users’ goals.

In the case where you determine that there is no existing media type or link relation that carries the needed semantics, then it may be worth considering creating one. When creating a media type, aim for the following characteristics:

The captured semantics could be used by more than one application.
The required syntax is minimal and makes an open world assumption. In other words, the absence of information does not make any statement about that information.
Unrecognized information should be ignored unless it specifically violates other rules.

Selecting a Format

Most of the time when developers think of selecting a format for their media type, they think of using XML or JSON as the base format. The advantage is that there is so much tooling available that can process these formats, and they provide a flexible structure on which to define additional semantics. Both formats have varying strengths and weaknesses, and in many cases it is simply a matter of preference. However, the types of clients that you wish to support do have an influence on the decision. If you expect that JavaScript clients will be the primary consumers of your media type, then JSON is the obvious choice. For systems that are integrating with large enterprise applications, XML may be more appropriate. Either way, as web developers we need to become comfortable with both formats and use whichever one makes the most sense.

However, I would caution against doing both. XML and JSON are quite different in their approach to data representation, and you risk creating a lowest-common-denominator format that doesn’t really take advantage of either format. If you really do feel you need to support both, then recognize that managing two different spec formats is double the work and the community using the media type will end up being fragmented. For generic formats like HAL, it may make sense to support both variants, but it is a decision that you should not take lightly: having two variants that work slightly differently may end up causing more confusion than it is worth to attract the larger audience that refuses to adopt the single chosen format.

Sometimes, neither XML nor JSON may be the right choice. I’ve seen numerous occasions where people have been using a JSON document for the purposes of updating a single property value. In some cases, a plain-text format representation is the simplest choice. All languages have libraries that allow converting from simple text into native data types. If all you want to do is transfer a simple value, then consider text/plain or some derivative of it as an option.

Tip

The use of text/plain format is a good example of why it is smart to keep metadata out of the body of your HTTP representation. Many APIs have taken the opposite approach, including status codes and other metadata in the body of their responses. However, if you do this, you are limiting the media types that you can use and duplicating the intent of the HTTP headers. If you are forced to support clients that cannot access HTTP headers, then define special media types just for those clients. Try not to constrain your API for other, more capable clients.

Creative use of media types does not need to be limited to text-based types. In a blog post, Roy Fielding shows how to use a monochrome image format as a sparse array to identify resources that have changed in a single representation to avoid polling large numbers of them.

Enabling Hypermedia

As covered in Chapter 1, for media types that are not text based, the best option for hypermedia is to use link headers as defined in RFC 5988. For text-based formats, we discussed the variety of link syntaxes that are currently in use when covering link relations. However, there are a few other concerns that we need to address at the media type level. Should the media type allow two links with the same relation? If so, how can a user agent distinguish between those two links? In HAL, a link can have a name attribute associated with the link to identify it. HTML allows tags to have an id attribute associated for the purposes of identification.

Is there a need for hyperlinks to reference a particular element within an instance of the media type? Should a syntax be defined for identifying fragments?

What are the rules for resolving relative URIs?

Optional, Mandatory, Omitted, Applicable

When designing media types, especially ones used to represent writeable resources, I have found it necessary to convey semantics about the presence or absence of a particular piece of information. The most obvious scenario is that of a mandatory element. In the case of our Issue item media type, the title attribute is the one property that is mandatory.

For properties that are not mandatory, there are a number of reasons why a property may not be present in a representation. A resource may have chosen to include only a subset of properties in a representation for performance reasons, and therefore certain properties may be omitted. Another possibility is that the property is considered nonapplicable.

Applicability refers to when the relevance of one property is dependent on the value of another property of the resource. For example, in an employee record, there may be a TerminatedDate field. If the employee status is Current, then it is likely that the TerminatedDate field is not applicable. Database tables and classes don’t have the flexibility to change their shape dynamically per entity instance, so we often end up using null values to indicate that a property does not have any meaningful value. Unfortunately, a null value is also used to indicate that a property has not yet been supplied a value. This is not the same as a property being not applicable.

With media type representations, we can completely omit any syntax relating to a nonapplicable property, and we can include the property syntax with a null or empty value when the property applies but no value has been defined yet.

The policy of removing nonapplicable properties from representations can simplify client code and reduce coupling. There is often business logic that correlates the controlling property and the dependent properties. If the client assumes that the presence of a property implies applicability, then the client never needs to be aware of that business logic, and the logic can evolve on the server without impacting the client.
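A client following this policy might look like the sketch below. It assumes a JSON-based media type that always sends complete representations, so that an absent property can only mean nonapplicable; the describe function, the employee documents, and the terminatedDate property name are hypothetical.

```python
import json

_MISSING = object()   # sentinel: distinguishes "absent" from JSON null


def describe(representation, name):
    """Classify a property as not applicable, not yet supplied, or present."""
    value = representation.get(name, _MISSING)
    if value is _MISSING:
        return "not applicable"
    if value is None or value == "":
        return "not yet supplied"
    return "present"


current = json.loads('{"name": "Ann", "status": "Current"}')
leaving = json.loads(
    '{"name": "Bob", "status": "Terminated", "terminatedDate": null}')
gone = json.loads(
    '{"name": "Cy", "status": "Terminated", "terminatedDate": "2013-09-21"}')

for employee in (current, leaving, gone):
    print(employee["name"], describe(employee, "terminatedDate"))
```

The client never inspects the status property to decide whether to show the termination date, so the rule that links the two can change on the server alone.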

Tip

The ability to distinguish between mandatory, applicable, and omitted properties is one example of how explicitly defined media types are more expressive than object serialization formats, as those formats are limited to the semantics that can be expressed by a class.

When defining representations that contain only a subset of the resource properties, you need to avoid ambiguity between information that is being omitted and information that’s nonapplicable. Attribute groups can sometimes help in these cases in the same way they allow partitioning of mandatory fields.

Embedded Versus External Metadata

Annotating representations is one way to include metadata such as a mandatory flag, type definitions, and range conditions. For example:

<foo>
    <fooDate required="true" type="Date" minValue="2001/01/01"
                 maxValue="2020/12/31">2010/04/12</fooDate>
</foo>

This approach makes it easy for the client to access the metadata because it will be parsed at the same time as the actual data. However, as the amount of metadata increases it can significantly increase the size of the representation, and usually the metadata changes far less frequently than the data does. Also, the same metadata can usually be reused for many resources that belong to a single resource class.

When a resource contains two distinct sets of data that have different lifetimes, often the best option is to break it into two resources and link the resource with the shorter lifetime to the resource with the longer lifetime. This allows caching layers to reduce the data transmitted across the network.

One challenging aspect of using external metadata is that we must correlate which pieces of metadata apply to which pieces of representation data. Some media types define a selection syntax that allows us to point to a specific fragment of data within a document. For example, CSS stylesheets use selectors, and XML-based representations can use XPath queries to identify nodes within a document.
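As a rough sketch of that correlation, the following Python code keeps the rules from the fooDate example in a separate structure keyed by a path selector and applies them to the data document. The shape of METADATA and the validate function are assumptions made for illustration, and the selectors use only the limited XPath subset supported by Python's ElementTree.

```python
import xml.etree.ElementTree as ET

# External metadata (hypothetical shape): selector -> rules.
METADATA = {
    "./fooDate": {"required": True,
                  "minValue": "2001/01/01",
                  "maxValue": "2020/12/31"},
}

DATA = "<foo><fooDate>2010/04/12</fooDate></foo>"


def validate(xml_text, metadata):
    """Return a list of rule violations found by applying each selector."""
    root = ET.fromstring(xml_text)
    problems = []
    for selector, rules in metadata.items():
        node = root.find(selector)
        if node is None or not (node.text or "").strip():
            if rules.get("required"):
                problems.append(f"{selector}: required value is missing")
            continue
        value = node.text.strip()
        # Zero-padded YYYY/MM/DD strings sort in date order, so plain
        # string comparison is enough for this sketch.
        if "minValue" in rules and value < rules["minValue"]:
            problems.append(f"{selector}: below minimum")
        if "maxValue" in rules and value > rules["maxValue"]:
            problems.append(f"{selector}: above maximum")
    return problems


print(validate(DATA, METADATA))
```

The data document stays small, and the metadata can be fetched once and cached for every resource of the same class.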

Extensibility

Media types are the point of coupling between the client and the server. A breaking change to a media type specification will potentially break clients. Ensuring that media types are designed to be extensible will help to minimize breaking changes while accommodating changing requirements. Writing client code to handle extensible formats can be trickier, as you can’t make as many assumptions about the format. However, the up-front cost of writing more tolerant parsing code will quickly pay off.

One common tactic to achieve extensibility is to ignore unknown content. This may seem like counterintuitive advice to someone with lots of experience with schemas like XSD. However, this enables existing parsers to continue to process new versions of the media type that contain additional information. Returning to our minimal information model of an Issue, it could be represented in XML by:

<Issue>
   <Title>This is a bug</Title>
   <Description>Here are the details of the bug.</Description>
</Issue>

Assuming we wrote a parser that looked for the elements using the XPath queries /Issue/Title and /Issue/Description, then if the media type were enhanced to allow:

<Issue>
   <Title>This is a bug</Title>
   <FoundBy href='http://issueapi.example.org/user/342'/>
   <Description>Here are the details of the bug.</Description>
</Issue>

the existing parser would still be able to process this document, even though it contains information the parser does not understand. What happens to that extra information is very much dependent on the use case. In some scenarios, it can be safely ignored. In others, the user should be warned about the fact that some information cannot be processed. Some services may choose to refuse to accept additional information that is not understood. All these are valid options, but there is no need to constrain the media type to support only one of the scenarios.
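A tolerant parser of this kind can be sketched with Python's ElementTree. The parse_issue function is a hypothetical helper; it reads the two elements it knows by name and returns the names of anything else, leaving the caller free to ignore, warn about, or reject the extra content.

```python
import xml.etree.ElementTree as ET

V1 = """<Issue>
   <Title>This is a bug</Title>
   <Description>Here are the details of the bug.</Description>
</Issue>"""

V2 = """<Issue>
   <Title>This is a bug</Title>
   <FoundBy href='http://issueapi.example.org/user/342'/>
   <Description>Here are the details of the bug.</Description>
</Issue>"""

KNOWN = {"Title", "Description"}


def parse_issue(xml_text):
    """Read the elements this client understands; report the rest by name."""
    root = ET.fromstring(xml_text)
    issue = {
        "title": root.findtext("Title"),
        "description": root.findtext("Description"),
    }
    unknown = [child.tag for child in root if child.tag not in KNOWN]
    return issue, unknown


print(parse_issue(V1))
print(parse_issue(V2))
```

Because the elements are looked up by name, the parser is also indifferent to the order in which they appear.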

Another constraint that is often applied by XSD schemas is the ordering of elements. Unless the order of properties has some semantic significance, there is no need for a media type specification to enforce the order. I can imagine there are some simplicity and performance benefits for the parsing logic when properties appear in a specific order; however, once you allow unknown properties, those benefits are minimal. Facilitating extensibility is as much about avoiding unnecessary constraints as it is about applying constraints.

A media type specification should try to limit itself to constraints defined by the domain and not be limited by the implementation constraints of the service. For example, when a service is backed by a database, it is common to define field lengths. Field lengths are a constraint of the database used by the service; other implementations that use the media type will likely have different physical constraints. Arbitrarily enforcing a lowest-common-denominator constraint due to current implementation limitations is both unnecessary and unwise.

There is an interesting ongoing debate about JSON as a standard. Most JSON implementations limit the range of a numeric value in a JSON document to the same limit as those defined by JavaScript (64-bit floating-point value). However, Douglas Crockford, the author of JSON, has argued that JSON should exist independent of JavaScript and that there should be no constraint on the numeric value that can be represented in a JSON document. This is a forward-looking perspective that acknowledges the fact that JSON will most likely outlast the current JavaScript implementations. Admittedly, this decision makes life more difficult for the parser implementors, but I believe it is a worthwhile price to pay.
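Python's standard json module is one example of a parser that is not bound by the JavaScript limits, which makes it a convenient way to see what the unconstrained position means in practice. The numeric literals below are arbitrary illustrative values.

```python
import json
from decimal import Decimal

# Integers are parsed into Python's arbitrary-precision int type.
big = json.loads("123456789012345678901234567890")
print(big + 1)

# By default, non-integer numbers become 64-bit floats and may lose digits.
lossy = json.loads("0.12345678901234567890123")

# Supplying parse_float keeps every digit that was in the document.
exact = json.loads("0.12345678901234567890123", parse_float=Decimal)
print(lossy, exact)
```

The cost falls on the parser, which must offer a way to choose the numeric type, but the document format itself stays free of an implementation limit.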

As a general rule of thumb, when it comes to defining a constraint in a media type, I ask myself if it is possible to parse the representation without the constraint and still convey the same meaning. If so, then I drop the constraint. The fewer the rules, the more likely it is for any extension to the media type to be harmless to existing code.

Registering the Media Type

In order for this “distributed type system” of media types to work, there needs to be a way for people to discover what valid types exist. The IANA registry is that central location. However, we have to be honest and say the current state of the IANA media type registry is pretty dismal. Currently, it consists of a few web pages with a bunch of links. Many of those links point to little more than an email message from 20 years ago. However, the fact that these entries still exist highlights the permanence of deploying types on the Internet. Once a new type has been let out onto the Internet, there is no guaranteed way of removing it. This is another reason why versioning media types can be problematic. There is no easy way to say “don’t use that version anymore.”

Many of the IANA registries have been updated to a newer XML/XSLT/XHTML format that makes them easier to consume by crawlers. However, the media type registry has not yet had this makeover and remains a pain to harvest.

The registration process is also fairly barbaric. It suggests you read six different RFCs and then asks you to submit an HTML form. It is recommended that before you submit the application form, you publish the proposed specification on the Internet and send an announcement of intent to submit to the IETF Types mailing list. The experts who moderate this list will likely provide feedback to help resolve any perceived problems with the proposal. Be aware that these experts provide this guidance for free and are not there to educate people on media type design. When participating in these lists, try not to mistake bluntness and brevity for hostility!

There definitely is a need for some community pressure on IANA to improve the media type registry and its registration process. This is a critical piece of Internet infrastructure, and yet many visitors get the impression it is obsolete and abandoned because of its unkempt appearance.

Designing New Link Relations

After searching the link relation registry and failing to find a link relation that describes the type of resource that you want to link to, you have the option to create your own link relation type. There are three distinct paths that you could take for this:

Define the specifications for a new standard link relation type and submit that for approval.
Create an “extended” link relation type for your own purposes.
Integrate the link specification into your media type specification if you are already creating one.

Standard Link Relations

Creating a standard link relation has the benefit of letting you use a short name for the rel value. The effort required to specify and register a link relation type does not have to be significant. The specification for the license link relation (RFC 4946) is a good example to review.

It is interesting to note that the RFC 4946 specification discusses only the use of the license relation within the context of Atom documents. In the IANA link relation registry, there is also a pointer to a discussion in the HTML specification of its use within HTML documents. A similar example is the specification of the monitor link relation, which implies it is for use only with SIP. It is unfortunate that these specifications suggest that the relations are tied to particular media types or protocols. I’m quite sure that we need to be able to assign licenses to more than just HTML and Atom feeds, and I know that SIP is not the only way to monitor the status of resources.

One challenge of working in any complex discipline is knowing when you must follow the rules and knowing when you can break them. In these scenarios, I believe that these link relations have value beyond the context within which they have been defined and I am prepared to use them in other scenarios. I hope that as more people adopt the use of link relations in new scenarios, more awareness will follow about their reusability and new specifications will avoid tying them to specific media types.

The guidance for creating new link relations is very similar to that for media types. You want the relation to be as generic as possible while still providing sufficient semantics to address a particular domain issue. Some examples that are currently not standardized link relations but would be good candidates are:

Owner: A link to a resource that is responsible for the current resource.
Home: A link to an API entry point or root resource.
Like: An unsafe link to indicate a user’s appreciation of a resource.
Favorite: An unsafe link to request that the resource be stored as a favorite of a user.

The Microformats website documents many other proposed link relations and ones discovered in use in the wild. One interesting example is sitemap, which is widely used on the Web. In its specification, it prescribes the exact formats of the responses that are expected. This is an example of putting all the semantics in the link relation and none in the media type.

Extension Link Relations

Extension link relations are defined in RFC 5988 and allow you to create link relations that are not registered with IANA. To avoid naming conflicts, you must make the relation name a URI. This enables the use of the domain naming system to ensure uniqueness. Unfortunately, URIs used as link relations can become large and ugly within a representation. You can use CURIEs to abbreviate link relations, but some people don’t like them because they look like XML namespaces but don’t behave in the same way.
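
To make the abbreviation idea concrete, here is a minimal sketch of expanding and compacting relation names with a prefix table. The prefix iss and both function names are assumptions made for illustration; the base URI is the one used in this chapter's examples, and the simple prefix:reference form shown here is only one of the ways CURIEs are written in practice.

```python
# Hypothetical prefix table: "iss" is an illustrative prefix, not one the
# chapter defines. The base URI is taken from the chapter's examples.
CURIE_PREFIXES = {"iss": "http://webapibook.net/rels#"}


def expand_rel(rel, prefixes=CURIE_PREFIXES):
    """Expand 'prefix:reference' to a full URI; pass other values through."""
    prefix, separator, reference = rel.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix] + reference
    # Registered short names ("self") and full URIs are returned unchanged.
    return rel


def compact_rel(rel, prefixes=CURIE_PREFIXES):
    """Abbreviate a full URI relation when a known prefix matches."""
    for prefix, base in prefixes.items():
        if rel.startswith(base):
            return f"{prefix}:{rel[len(base):]}"
    return rel


print(expand_rel("iss:issue-processor"))
print(compact_rel("http://webapibook.net/rels#search"))
```

A client that compares relations should always compare the expanded form, because the prefix is local to one document and carries no meaning of its own.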

Having this capability is extremely useful, but it does lead to the possibility of link relation abuse. Once developers realize the power of link relations, they tend to go overboard and start creating service-specific link relations. Although this will work in the short term, it is not the best choice for the system’s evolution or the overall web ecosystem due to the service-specific coupling that it introduces.

Embedded Link Relations

If the link relation is closely related to the semantics of a media type, it may make sense to make the link relation part of the media type specification and valid only within the media type itself. The HTML <FORM> tag is an example of a link relation that is defined within the media type itself. However, it is unfortunate that it was defined this way because there are currently efforts to duplicate very similar functionality in all the other hypermedia media types. If <FORM> had been defined as an independent link relation, it would have been easier to reuse in other media types.

Registering the Link Relation

The process for registering link relations is fairly straightforward and is documented fully in RFC 5988.

Media Types in the Issue Tracking Domain

In Chapter 4, we identified several different categories of resources: list resources, item resources, discovery resources, and search resources. For each of these resource categories, we need to identify which media types we believe are the most appropriate to carry the required semantics.

Some developers tend to try to pick a single media type and reuse it across an entire API. There is a perception that delivering just one media type will reduce the effort of the client developer. In many cases, it does exactly the opposite; trying to package all the semantics of a nontrivial API into a single media type means either that the specification is going to be complex, or that some semantics are going to be communicated out of band. Limiting the client to processing only a single media type becomes problematic when the API starts to integrate links with external systems. If the client is designed to process only the dedicated API media type, then it may be difficult to build in support for other media types from other APIs.

Building clients that can easily process many different media types encourages serendipitous reuse and facilitates system evolution and integration.
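
One way to keep a client open to many media types is to dispatch on the Content-Type of each response instead of assuming a single format. The sketch below is an assumption about how such a client might be organized, not code from the sample application; the registry, the decorator, and the handler names are all hypothetical, and the two media type names are the ones that appear in this chapter's examples.

```python
import json

# Hypothetical registry mapping a media type name to the function that reads it.
HANDLERS = {}


def handles(media_type):
    """Decorator that registers a handler for one media type."""
    def register(func):
        HANDLERS[media_type] = func
        return func
    return register


@handles("application/vnd.collection+json")
def read_collection(body):
    # Return the href of every item in the list.
    return [item["href"] for item in json.loads(body)["collection"]["items"]]


@handles("application/vnd.issue+json")
def read_issue(body):
    return json.loads(body)["title"]


def process(content_type, body):
    """Pick a handler from the Content-Type header value, ignoring parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        raise LookupError(f"no handler registered for {media_type}")
    return handler(body)


print(process("application/vnd.issue+json; charset=utf-8", '{"title": "An issue"}'))
```

Supporting a media type from another API then means registering one more handler, with no change to the code that follows links.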

List Resources

For resources that return a list of items, we will be using the media type called collection+json. This is a hypermedia-enabled type designed explicitly to support lists of items. This media type supports associating an arbitrary set of data with each item in the list. It includes queries to enable searching for various subsets of items as well as a template property to facilitate creating a new item in the list.

We could have used HAL or even XHTML, as both are capable of representing a list of items; however, as collection+json is specifically designed for the purpose of representing lists, it seems a more natural fit. Example 6-13 demonstrates how collection+json can be used to represent a list of issues.

Example 6-13. Sample issue list

{
  "collection": {
    "href": "http://localhost:8080/issue",
    "links": [],
    "items": [
      {
        "href": "http://localhost:8080/issue/1",
        "data": [
          {
            "name": "Description",
            "value": "This is an issue"
          },
          {
            "name": "Status",
            "value": "Open"
          },
          {
            "name": "Title",
            "value": "An issue"
          }
        ],
        "links": [
          {
            "rel": "http://webapibook.net/rels#issue-processor",
            "href": "http://localhost:8080/issueprocessor/1?action=transition"
          },
          {
            "rel": "http://webapibook.net/rels#issue-processor",
            "href": "http://localhost:8080/issueprocessor/1?action=close"
          }
        ]
      },
      {
        "href": "http://localhost:8080/issue/2",
        "data": [
          {
            "name": "Description",
            "value": "This is another issue"
          },
          {
            "name": "Status",
            "value": "Closed"
          },
          {
            "name": "Title",
            "value": "Another Issue"
          }
        ],
        "links": [
          {
            "rel": "http://webapibook.net/rels#issue-processor",
            "href": "http://localhost:8080/issueprocessor/2?action=transition"
          },
          {
            "rel": "http://webapibook.net/rels#issue-processor",
            "href": "http://localhost:8080/issueprocessor/2?action=open"
          }
        ]
      }
    ],
    "queries": [
      {
        "rel": "http://webapibook.net/rels#search",
        "href": "/issue",
        "prompt": "Issue search",
        "data": [
          {
            "name": "SearchText",
            "prompt": "Text to match against Title and Description"
          }
        ]
      }
    ],
    "template": {
      "data": []
    }
  }
}
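
As a rough illustration of how a client might consume a document like this, the following sketch folds each item's name/value pairs into a dictionary and groups its links by relation. The function name and the trimmed sample are assumptions made for the example; only the property names come from the document above.

```python
def summarize_items(document):
    """Flatten each collection+json item into its href, data fields, and links."""
    summaries = []
    for item in document["collection"].get("items", []):
        # The data array is a list of name/value pairs; fold it into one dict.
        data = {pair["name"]: pair.get("value") for pair in item.get("data", [])}
        # Several links may share a rel (as issue-processor does), so group hrefs.
        links = {}
        for link in item.get("links", []):
            links.setdefault(link["rel"], []).append(link["href"])
        summaries.append({"href": item["href"], "data": data, "links": links})
    return summaries


# Trimmed copy of the sample list, kept short for illustration.
sample = {
    "collection": {
        "href": "http://localhost:8080/issue",
        "items": [
            {
                "href": "http://localhost:8080/issue/1",
                "data": [
                    {"name": "Status", "value": "Open"},
                    {"name": "Title", "value": "An issue"},
                ],
                "links": [
                    {
                        "rel": "http://webapibook.net/rels#issue-processor",
                        "href": "http://localhost:8080/issueprocessor/1?action=close",
                    }
                ],
            }
        ],
    }
}

for issue in summarize_items(sample):
    print(issue["data"]["Title"], issue["data"]["Status"], issue["href"])
```

Because the client reads fields by name from the generic data array, it needs no issue-specific parsing code to display a list, which is part of the appeal of a purpose-built list format.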

Item Resources

We have several options for representing each individual issue. We could use HAL and define a link relation issue that specifies the content. We could use XHTML and define a semantic profile that annotates the HTML with semantics from the issue tracking domain. Or we could define a new media type to represent an issue.

The notion of an issue is sufficiently generic that it could easily be reused by many services, and therefore it justifies the creation of a new media type. This is not a niche domain; it is one used by every software developer and many customer support call centers. Having an interoperable format, even if implementation variations prevent full-fidelity communication, has the potential to be extremely valuable.

A sample representation of this media type is shown in Example 6-14. For the moment, this media type will be defined as JSON. Early adopters of web technology are more likely to be comfortable with JSON, and if the media type gains traction, then an XML variant will be defined to enable wider adoption.

The full specification for this media type can be found in Appendix E.

Example 6-14. Sample issue

{
  "id": "1",
  "title": "An issue",
  "description": "This is an issue",
  "status": "Open",
  "Links": [
    {
      "rel": "self",
      "href": "http://localhost:8080/issue/1"
    },
    {
      "rel": "http://webapibook.net/rels#issue-processor",
      "href": "http://localhost:8080/issueprocessor/1?action=transition",
      "action": "transition"
    },
    {
      "rel": "http://webapibook.net/rels#issue-processor",
      "href": "http://localhost:8080/issueprocessor/1?action=close",
      "action": "close"
    }
  ]
}
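
A client can use the links in this representation to decide which operations to offer, rather than inferring them from the status value. The sketch below assumes the property names shown in the sample, including the capitalized Links; the function names are hypothetical.

```python
PROCESSOR_REL = "http://webapibook.net/rels#issue-processor"


def available_actions(issue):
    """Map each advertised action name to the URL that triggers it."""
    return {
        link["action"]: link["href"]
        for link in issue.get("Links", [])
        if link.get("rel") == PROCESSOR_REL and "action" in link
    }


def self_href(issue):
    """Return the href of the self link, or None when it is absent."""
    for link in issue.get("Links", []):
        if link.get("rel") == "self":
            return link["href"]
    return None


issue = {
    "id": "1",
    "title": "An issue",
    "status": "Open",
    "Links": [
        {"rel": "self", "href": "http://localhost:8080/issue/1"},
        {
            "rel": PROCESSOR_REL,
            "href": "http://localhost:8080/issueprocessor/1?action=close",
            "action": "close",
        },
    ],
}

actions = available_actions(issue)
if "close" in actions:
    # The presence of the link, not the status field, enables the operation.
    print("close is available at", actions["close"])
```

If the server later changes the rules for when an issue may be closed, a client written this way keeps working, because it only reacts to the links it is given.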

Discovery Resource

The discovery resource is an entry point resource that points to other resources that are available in the system. For this resource, we will be using a recently proposed media type called json-home. This media type is designed specifically to provide a representation for an entry point resource that allows dynamic discovery of resources. It is similar to the Atom Service Document but not limited to pointing to Atom feeds. The json-home document can have links to any arbitrary resource and can contain additional metadata that can be used to discover how to activate those links. Example 6-15 shows a possible json-home document for the Issue Tracker API.

Example 6-15. Sample root resource

{
  "resources": {
    "http://webapibook.net/rels#issue": {
      "href": "/issue/{id}",
      "hints": {
        "allow": [
          "GET"
        ],
        "formats": {
          "application/json": {},
          "application/vnd.issue+json": {}
        }
      }
    },
    "http://webapibook.net/rels#issues": {
      "href": "/issue",
      "hints": {
        "allow": [
          "GET"
        ],
        "formats": {
          "application/json": {},
          "application/vnd.collection+json": {}
        }
      }
    },
    "http://webapibook.net/rels#issue-processor": {
      "href": "/issueprocessor/{id}{?action}",
      "hints": {
        "allow": [
          "POST"
        ]
      }
    }
  }
}
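
The href values in this document are URI templates, so a client has to expand them before making a request. What follows is a minimal sketch that handles only the two template forms used above, {name} and {?name}; it is not a complete RFC 6570 implementation, and the expand and resolve functions are hypothetical helpers introduced for illustration.

```python
import re
from urllib.parse import quote

# Matches {name} and {?name,other}; the leading "?" marks a query expression.
_EXPRESSION = re.compile(r"\{(\??)([^{}]+)\}")


def expand(template, **values):
    """Expand the small subset of URI Template syntax used in the example."""
    def substitute(match):
        is_query = match.group(1) == "?"
        names = match.group(2).split(",")
        # Undefined variables are skipped rather than rendered as empty text.
        pairs = [
            (name, quote(str(values[name]), safe=""))
            for name in names
            if values.get(name) is not None
        ]
        if is_query:
            if not pairs:
                return ""
            return "?" + "&".join(f"{name}={value}" for name, value in pairs)
        return ",".join(value for _, value in pairs)

    return _EXPRESSION.sub(substitute, template)


def resolve(home, rel, method="GET", **values):
    """Look up a relation in a json-home document and expand its template."""
    resource = home["resources"][rel]
    allowed = resource.get("hints", {}).get("allow", [])
    if allowed and method not in allowed:
        raise ValueError(f"{method} is not advertised for {rel}")
    return expand(resource["href"], **values)


# Trimmed copy of the sample root resource.
home = {
    "resources": {
        "http://webapibook.net/rels#issue": {
            "href": "/issue/{id}",
            "hints": {"allow": ["GET"]},
        },
        "http://webapibook.net/rels#issue-processor": {
            "href": "/issueprocessor/{id}{?action}",
            "hints": {"allow": ["POST"]},
        },
    }
}

print(resolve(home, "http://webapibook.net/rels#issue", id=1))
```

The client hardcodes only the link relation; the URL structure and the allowed methods are read from the discovery document at runtime, so the server remains free to change them.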

Search Resource

For searching we will likely be able to rely on the query capability of collection+json. Where this proves insufficient, we will try to use the link relation search and the protocol defined by OpenSearch.
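
As a sketch of what relying on that query capability might look like, the function below turns a collection+json query entry, such as the one in the sample issue list, into a request URL. The function name is hypothetical, and passing the values as a plain query string is an assumption about how the server expects them.

```python
from urllib.parse import urlencode, urljoin


def build_query_url(collection_href, query, **values):
    """Build a search URL from a collection+json query entry."""
    # Only parameters that the query advertises in its data array are accepted.
    allowed = {field["name"] for field in query.get("data", [])}
    unknown = set(values) - allowed
    if unknown:
        raise ValueError(f"query does not define: {sorted(unknown)}")
    # The query href may be relative, so resolve it against the collection href.
    target = urljoin(collection_href, query["href"])
    return f"{target}?{urlencode(values)}" if values else target


search = {
    "rel": "http://webapibook.net/rels#search",
    "href": "/issue",
    "prompt": "Issue search",
    "data": [
        {"name": "SearchText", "prompt": "Text to match against Title and Description"}
    ],
}

print(build_query_url("http://localhost:8080/issue", search, SearchText="another issue"))
```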

Conclusion

Media types and link relations are the tools used to manage the coupling between the components in your distributed application. This chapter has covered the different ways to use that coupling to communicate application semantics. Being aware of existing specifications, and how and when to create new ones, provides a solid foundation on which to actually start building an API. In the next chapter, we begin to write a sample API based on the knowledge we have gained.
