Roy Fielding coined the acronym REST in his PhD dissertation. Chapter 5 of Fielding’s dissertation lays out the guiding principles for what have come to be known as REST-style or RESTful web services. Fielding has an impressive résumé. He is, among other things, a principal author of the HTTP 1.1 specification and a cofounder of the Apache Software Foundation.
REST and SOAP are quite different. SOAP is a messaging protocol in which the messages
are XML documents, whereas REST is a style of software architecture for distributed
hypermedia systems, or systems in which text, graphics, audio, and
other media are stored across a network and interconnected through
hyperlinks. The World Wide Web is the obvious example of such a system. As
the focus here is on web services, the World Wide Web is the
distributed hypermedia system of interest. In the Web, HTTP is both a
transport protocol and a messaging system because HTTP requests and
responses are messages. The payloads of HTTP messages can be typed using
the MIME (Multipurpose Internet Mail Extensions) type system. MIME has
types such as text/html, application/octet-stream, and audio/mpeg3.
HTTP also provides response status codes to inform
the requester about whether a request succeeded and, if not, why. Table 1-1
lists some common status codes.
Table 1-1. Sample HTTP status codes and their meanings
Status code | In English | Meaning |
---|---|---|
200 | OK | Request OK |
303 | See Other | Redirect |
400 | Bad Request | Request malformed |
401 | Unauthorized | Authentication error |
403 | Forbidden | Request refused |
404 | Not Found | Resource not found |
405 | Method Not Allowed | Method not supported |
415 | Unsupported Media Type | Content type not recognized |
500 | Internal Server Error | Request processing failed |
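The codes in Table 1-1 are available in Java as constants on java.net.HttpURLConnection. What follows is only a minimal sketch of how a client might translate a status code into the short meanings from the table; the class name StatusCodes and the method describe are hypothetical names chosen for illustration.

```java
import java.net.HttpURLConnection;

public class StatusCodes {
    // Maps a status code to the short meaning given in Table 1-1.
    // Codes outside the table fall through to a generic message.
    public static String describe(int code) {
        switch (code) {
            case HttpURLConnection.HTTP_OK:               return "Request OK";
            case HttpURLConnection.HTTP_SEE_OTHER:        return "Redirect";
            case HttpURLConnection.HTTP_BAD_REQUEST:      return "Request malformed";
            case HttpURLConnection.HTTP_UNAUTHORIZED:     return "Authentication error";
            case HttpURLConnection.HTTP_FORBIDDEN:        return "Request refused";
            case HttpURLConnection.HTTP_NOT_FOUND:        return "Resource not found";
            case HttpURLConnection.HTTP_BAD_METHOD:       return "Method not supported";
            case HttpURLConnection.HTTP_UNSUPPORTED_TYPE: return "Content type not recognized";
            case HttpURLConnection.HTTP_INTERNAL_ERROR:   return "Request processing failed";
            default:                                      return "Not listed in Table 1-1";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(404)); // prints "Resource not found"
    }
}
```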
REST stands for REpresentational State Transfer, which requires clarification because the central abstraction in REST, the resource, does not occur in the acronym. A resource in the RESTful sense is something that is accessible through HTTP because this thing has a name: a URI (Uniform Resource Identifier). A URI has two subtypes: the familiar URL, which specifies a location, and the URN, which is a symbolic name but not a location. URIs are uniform because they must be structured in a certain way; there is a syntax for URIs. In summary, a URI is a standardized name for a resource and, in this sense, a URI acts as a noun.
In practical terms, a resource is a web-accessible, informational item that may have hyperlinks to it. Hyperlinks use URIs to do the linking. Examples of resources are plentiful but likewise misleading in suggesting that resources must have something in common other than identifiability through URIs. The gross national product of Lithuania is a resource, as is the Modern Jazz Quartet. Ernie Banks’ baseball accomplishments count as a resource, as does the maximum flow algorithm. The concept of a resource is remarkably broad but, at the same time, impressively simple and precise.
As web-based informational items, resources are pointless unless
they have at least one representation. In the Web, representations are
MIME typed. The most common type of resource representation is probably
still text/html, but nowadays resources tend to have
multiple representations. For example, there are various interlinked HTML
pages that represent the Modern Jazz Quartet but there are also audio and
audiovisual representations of this resource.
Resources have state. Ernie Banks’ baseball accomplishments changed during his career with the dismal Chicago Cubs from 1953 through 1971 and culminated in his 1977 induction into the Baseball Hall of Fame. A useful representation must capture a resource’s state. For example, the current HTML pages on Ernie at the Baseball Reference website need to represent all of his major league accomplishments, from his rookie year in 1953 through his induction into the Hall of Fame.
A RESTful request targets a resource, but the resource itself typically is created on the service machine and remains there. A resource may be persisted in a data store such as a database system. Some mix of humans and applications may maintain the state of the resource. In the usual case of web service access to a resource, the requester receives a representation of the resource if the request succeeds. It is the representation that transfers from the service machine to the requester machine. In a REST-style web service, a client does two things in an HTTP request: it names the targeted resource by giving its URI, and it specifies a verb (HTTP method) that indicates the operation the client wishes to perform on that resource.
One of the basic cases is a read request. If a read request succeeds, a
typed representation (for instance, text/html) of the
resource is transferred from the server that hosts and maintains the resource to the
client that issues the request. The client is an arbitrary application written in
some language with support for REST-style requests. The representation returned from
the service is a good one only if
it captures the resource’s state in some appropriate way. Figure 1-5 depicts a resource
with its identifying URI together with a RESTful client and some typed
representations sent back to the client in response to client requests.
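A read request of this kind can be sketched with the java.net.http API that ships with Java 11 and later. The example below only builds the request and does not send it; the class name ReadRequestSketch and the method buildRead are hypothetical, and the URI is an illustrative one that need not resolve to a real host. The Accept header is one common way for a client to indicate which typed representation it would like to receive.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ReadRequestSketch {
    // Builds (but does not send) a GET request that names the resource by
    // its URI and asks for a representation of the given MIME type.
    public static HttpRequest buildRead(String uri, String mimeType) {
        return HttpRequest.newBuilder(URI.create(uri))
                .header("Accept", mimeType)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRead("http://bedrock/citizens/fred", "text/html");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse("(none)"));
    }
}
```

Sending the request would require an HttpClient and a reachable service; the sketch stops short of that so that it stays self-contained.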
In summary, RESTful web services involve not just resources to represent but also client-invoked operations on such resources. At the core of the RESTful approach is the insight that HTTP, despite the occurrence of Transfer in its name, acts as an API and not simply as a transport protocol. HTTP has its well-known verbs, officially known as methods. Table 1-2 lists the HTTP verbs that correspond to the CRUD (Create, Read, Update, Delete) operations so familiar throughout computing.
Table 1-2. HTTP verbs and their CRUD operations
HTTP verb | CRUD operation |
---|---|
POST | Create |
GET | Read |
PUT | Update |
DELETE | Delete |
Although HTTP header field names are not case sensitive, HTTP method names are, and the standard verbs are written in uppercase. There are additional verbs. For example, the verb HEAD is a variation on GET that requests only the HTTP headers that would be sent to fulfill a GET request.
HTTP also has standard response codes such as 404 to signal that the requested resource could not be found and 200 to signal that the request was handled successfully. In short, HTTP provides request verbs and MIME types for client requests and status codes (and MIME types) for service responses.
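The verb-to-operation pairing in Table 1-2 can be written down as a pair of Java enums. This is a sketch for illustration only: the names VerbToCrud, Verb, Crud, and crudFor are hypothetical and belong to no standard library.

```java
public class VerbToCrud {
    public enum Crud { CREATE, READ, UPDATE, DELETE }

    // Each HTTP verb from Table 1-2 carries its corresponding CRUD operation.
    public enum Verb {
        POST(Crud.CREATE),
        GET(Crud.READ),
        PUT(Crud.UPDATE),
        DELETE(Crud.DELETE);

        private final Crud operation;

        Verb(Crud operation) { this.operation = operation; }

        public Crud operation() { return operation; }
    }

    // Looks up the CRUD operation for a verb given as text, such as "GET".
    // Throws IllegalArgumentException for verbs not listed in Table 1-2.
    public static Crud crudFor(String verb) {
        return Verb.valueOf(verb).operation();
    }

    public static void main(String[] args) {
        for (Verb v : Verb.values()) {
            System.out.println(v + " -> " + v.operation());
        }
    }
}
```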
Without client-side scripting, browsers generate only GET and POST requests. If a user enters a URL into the
browser’s address bar, the browser generates a GET request. A
browser ordinarily generates a POST request for an HTML form with a submit
button. It goes against the spirit of REST to treat GET and POST interchangeably. In
Java, for example, an HttpServlet instance has callback methods such
as doGet and doPost that handle GET
and POST requests, respectively. Each callback has the same parameter
types: the HttpServletRequest type (the key/value pairs from the
request) and the HttpServletResponse type (effectively a channel to communicate back to
the requester). It is not unknown for a programmer to have the two callbacks execute the same
code (for instance, by having one invoke the other), thereby conflating
the original HTTP distinction between read and
create. A key guiding principle of the RESTful style
is to respect the original meanings of the HTTP verbs. In particular, any
GET request should be side-effect free (safe, and therefore also idempotent) because a GET is a
read rather than a create, update, or delete operation. A
GET as a read with no side effects is called a
safe GET.
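The distinction between a safe read and a create can be illustrated without the servlet API. The sketch below is plain Java with an in-memory list standing in for the resource; the class SafeGetSketch is hypothetical, and its doGet and doPost methods merely borrow the servlet callback names and do not take the servlet request and response parameters.

```java
import java.util.ArrayList;
import java.util.List;

public class SafeGetSketch {
    // In-memory stand-in for the state of a resource.
    private final List<String> citizens = new ArrayList<>();

    // Read: returns a copy of the current state and changes nothing,
    // so calling it any number of times has no side effects.
    public List<String> doGet() {
        return new ArrayList<>(citizens);
    }

    // Create: adds a new entry and therefore changes the resource's state.
    public void doPost(String name) {
        citizens.add(name);
    }

    public static void main(String[] args) {
        SafeGetSketch service = new SafeGetSketch();
        service.doPost("fred");
        service.doGet();
        service.doGet();
        System.out.println(service.doGet()); // prints "[fred]"
    }
}
```

Had doGet simply delegated to doPost, each read would have altered the resource, which is the conflation described above.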
The REST approach does not imply that either resources or the processing needed to generate adequate representations of them are simple. A REST-style web service might be every bit as subtle and complicated, in its functionality, as a SOAP-based service or a DOA application. The RESTful approach tries to simplify a service’s implementation by taking what HTTP and the MIME type system already offer: built-in CRUD operations, uniformly identifiable resources, typed representations that can capture a resource’s state, and status codes to summarize the outcome of a request. REST as a design philosophy tries to isolate application complexity at the endpoints, that is, at the client and at the service. A service may require lots of logic and computation to maintain resources and to generate adequate representations of resources, such as large and subtly formatted XML documents, and a client may require significant XML processing to extract the desired information from the XML representations transferred from the service to the client. Yet the RESTful approach keeps the complexity out of the transport level, as a resource representation is transferred to the client as the body of an HTTP response message. For the record, RESTful web services are Turing complete; that is, these services are equal in power to any computational system, including a system that consists of SOAP-based web services or DOA stubs and skeletons.
In HTTP a URI is meant to be opaque, which means that the URI:
http://bedrock/citizens/fred
has no inherent connection to the URI:
http://bedrock/citizens
although Fred happens to be a citizen of Bedrock. These are simply two different, independent identifiers. Of course, a good URI designer will come up with URIs that are suggestive about what they are meant to identify. The point is that URIs have no intrinsic hierarchical structure. URIs can and should be interpreted, but these interpretations are imposed on URIs, not inherent in them. Although URI syntax looks like the syntax used to navigate a hierarchical filesystem, this resemblance is misleading. A URI is an opaque identifier, a logically proper name that should denote exactly one resource.
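One way to picture URIs as opaque names is to use them as plain keys in a lookup table. In the sketch below, which is illustrative only, the class OpaqueUriSketch and the representation strings are hypothetical; the two URIs are the ones from the text. Each key maps to its own entry, and nothing in the lookup depends on one URI appearing to extend the other.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class OpaqueUriSketch {
    // Registers each URI as an independent name for its own resource.
    public static Map<URI, String> resources() {
        Map<URI, String> table = new HashMap<>();
        table.put(URI.create("http://bedrock/citizens/fred"),
                  "a representation of Fred");
        table.put(URI.create("http://bedrock/citizens"),
                  "a representation of the citizens of Bedrock");
        return table;
    }

    public static void main(String[] args) {
        Map<URI, String> table = resources();
        URI fred = URI.create("http://bedrock/citizens/fred");
        URI citizens = URI.create("http://bedrock/citizens");
        System.out.println(fred.equals(citizens)); // prints "false"
        System.out.println(table.get(fred));
        System.out.println(table.get(citizens));
    }
}
```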