Chapter 1. HTTP

HTTP stands for HyperText Transfer Protocol, and is the basis upon which the Web is built. Each HTTP transaction consists of a request and a response. HTTP itself is made up of many pieces: the URL at which the request was directed, the verb that was used, other headers and status codes, and of course, the body of the response, which is what we usually see when we browse the Web in a browser.

When surfing the Web, ideally we experience a smooth journey between all the various places that we’d like to visit. However, this is in stark contrast to what is happening behind the scenes as we make that journey. As we go along, clicking on links or causing the browser to make requests for us, a series of little “steps” is taking place. Each step is made up of a request/response pair; the client (usually your browser or phone if you’re surfing the Web) makes a request to the server, and the server processes the request and sends the response back.

As an example, point a browser to http://oreilly.com/ and you’ll see a page that looks something like Figure 1-1; either the information desired can be found on the page, or the hyperlinks on that page direct us to journey onward for it.

Figure 1-1. O’Reilly home page

The web page arrives in the body of the HTTP response, but it tells only half of the story. The rest is elsewhere in the HTTP traffic. Consider the following examples.

Request headers:

GET / HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.8 (KHTML, like Gecko) Chrome/23.0.1246.0 Safari/537.8
Host: oreilly.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

Response headers:

HTTP/1.1 200 OK
Date: Thu, 15 Nov 2012 09:36:05 GMT
Server: Apache
Last-Modified: Thu, 15 Nov 2012 08:35:04 GMT
Accept-Ranges: bytes
Content-Length: 80554
Content-Type: text/html; charset=utf-8
Cache-Control: max-age=14400
Expires: Thu, 15 Nov 2012 13:36:05 GMT
Vary: Accept-Encoding

As you can see, there are plenty of other useful pieces of information being exchanged over HTTP that are not usually seen when using a browser. Understanding this separation between client and server, and the steps taken by the request and response pairs, is key to understanding HTTP and working with web services. Here’s an example of what happens when we head to Google in search of kittens:

  1. We make a request to http://www.google.com/ and the response contains a Location header and a 301 status code sending us to a regional search page; for me that’s http://www.google.co.uk/.
  2. The browser follows the redirect instruction (browsers follow redirects by default, without asking the user for confirmation), makes a request to http://www.google.co.uk/, and receives the page with the search box (for fun, view the source of this page; there’s a lot going on!). We fill in the box and hit search.
  3. We make a request to https://www.google.co.uk/search?q=kittens (plus a few other parameters) and get a response showing our search results.

In the story shown here, all the requests were made from the browser in response to a user’s actions, although some occur behind the scenes, such as following redirects or requesting additional assets. The assets for a page, such as images, stylesheets, and so on, are all fetched using separate requests that are handled by a server. Any content that is loaded asynchronously (by JavaScript, for example) also creates more requests. When we work with APIs, we get closer to the requests and make them in a more deliberate manner, but the mechanisms are the same as those we use to make very basic web pages. If you’re already making websites, then you already know all you need to make web services!
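To make step 1 of the story a little more concrete, here is a minimal sketch of what a client has to do with a redirect response: read the status code, then look for a Location header. The function names and the sample response are invented for illustration, and the list of redirect codes is a simplification; a real browser does considerably more.

```php
<?php

/**
 * Split a raw block of response headers into a status code and a header map.
 * Header names are lowercased because HTTP header names are case-insensitive.
 */
function parseResponseHeaders(string $rawHeaders): array
{
    $lines = preg_split('/\r\n|\n/', trim($rawHeaders));

    // The first line is the status line, e.g. "HTTP/1.1 301 Moved Permanently".
    $statusParts = explode(' ', array_shift($lines), 3);
    $statusCode = (int) ($statusParts[1] ?? 0);

    $headers = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    return ['status' => $statusCode, 'headers' => $headers];
}

/**
 * Return the URL to request next if the response is a redirect, or null.
 */
function redirectTarget(array $response): ?string
{
    $isRedirect = in_array($response['status'], [301, 302, 303, 307, 308], true);

    return $isRedirect ? ($response['headers']['location'] ?? null) : null;
}

// A made-up response resembling step 1 of the story (illustrative only).
$raw = "HTTP/1.1 301 Moved Permanently\r\n"
     . "Location: http://www.google.co.uk/\r\n"
     . "Content-Type: text/html; charset=UTF-8\r\n";

$response = parseResponseHeaders($raw);
echo $response['status'], " -> ", redirectTarget($response), "\n";
```

For a response that is not a redirect, such as the 200 OK shown earlier in the chapter, redirectTarget() returns null and the client simply uses the body.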

Clients and Servers

Earlier in this chapter we talked about a request and response between a client and a server. When we make websites with PHP, the PHP part is always the server. When using APIs, we build the server in PHP, but we can consume APIs from PHP as well. This is the point where things can get confusing. We can create either a client or a server in PHP, and requests and responses can be either incoming or outgoing—or both!

When we build a server, we follow patterns similar to the way that we build web pages. A request arrives, and we use PHP to figure out what was requested and craft the correct response. For example, if we built an API for customers so they could get updates on their orders programmatically, we would be building a server.

Using PHP to consume APIs means we are building a client. Our PHP application makes requests to external services over HTTP, and then uses the responses for its own purposes. An example of a client would be a page that fetches your most recent tweets and displays them.

It isn’t unusual for an application to be both a client and a server, as shown in Figure 1-2. An application that accepts a request, and then calls out to other services to gather the information it needs to produce the response, is acting as both a client and a server.

Warning

When working on applications like this, take care with how you name variables involving the word “request” to avoid confusion!

Figure 1-2. Web application acting as a server to the user, but also as a client to access other APIs
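One way to follow the naming advice is to say in every variable name which direction the message is travelling. The sketch below is only an illustration: the function, the array shapes, the api.example.com URL, and the fake upstream service are all assumptions made so that the example runs without any network access.

```php
<?php

/**
 * Handle a request that arrived at our application (we are the server),
 * calling another service along the way (we are the client).
 *
 * $fetchUpstream is a hypothetical callable standing in for a real HTTP
 * call; it takes an outgoing request array and returns a response array.
 */
function handleIncomingRequest(array $incomingRequest, callable $fetchUpstream): array
{
    // Build the request that we send out, keeping it clearly separate
    // from the one we received.
    $outgoingRequest = [
        'method' => 'GET',
        'url'    => 'http://api.example.com/orders/' . rawurlencode($incomingRequest['order_id']),
    ];

    $upstreamResponse = $fetchUpstream($outgoingRequest);

    // The response we return to our own caller.
    return [
        'status' => 200,
        'body'   => json_encode(['order' => $upstreamResponse['body']]),
    ];
}

// A fake upstream service so the sketch runs without any network access.
$fakeUpstream = function (array $outgoingRequest): array {
    return ['status' => 200, 'body' => 'shipped (' . $outgoingRequest['url'] . ')'];
};

$outgoingResponse = handleIncomingRequest(['order_id' => '42'], $fakeUpstream);
echo $outgoingResponse['body'], "\n";
```

With names like $incomingRequest and $outgoingRequest, it is hard to hand the wrong one to the wrong function by accident.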

Making HTTP Requests

There are a few different ways to communicate over HTTP. In this section, three of them will be covered: Curl, tools in your browser, and PHP itself. The tool you choose depends entirely on your experience and on what it is that you’re trying to achieve. We’ll also look at tools for inspecting and debugging HTTP in Chapter 9.

The examples here use a site that is set up to log requests made to it, which is perfect for exploring how different API requests are seen by a server. To use it, visit the site and create a new “request bin.” You will be given the URL to send requests to, and then be redirected to a page showing the history of requests made to the bin. Another excellent way to try making different kinds of requests is to use the reserved endpoints (http://example.com, http://example.net, and http://example.org) established by the Internet Assigned Numbers Authority.

Curl

Curl is a command-line tool available on all platforms. It allows us to make any web request imaginable in any form, repeat those requests, and observe in detail exactly what information is exchanged between client and server. In fact, Curl produced the example output at the beginning of this chapter. It is a brilliant, quick tool for inspecting what’s going on with a web request, particularly when dealing with those outside the usual scope of a browser.

In its most basic form, a Curl request can be made like this (replace the URL with your own):

curl http://requestb.in/example

We can control every aspect of the request to send; some of the most commonly used features are outlined here and used throughout this book to illustrate and test the various APIs shown.

If you’ve built websites before, you’ll already know the difference between GET and POST requests from creating web forms. Changing between GET, POST, and other HTTP verbs using Curl is done with the -X switch, so a POST request can be specifically made by using the following:

curl -X POST http://requestb.in/example

To get more information from Curl than just the response body, there are a couple of useful switches. Try the -v switch since this will show everything: request headers, response headers, and response body in full! It splits the response up, though, sending the header information to STDERR and the body to STDOUT.

When the response is fairly large, it can be hard to find a particular piece of information while using Curl. To help with this, it is possible to combine Curl with other tools such as less or grep; however, Curl shows a progress output bar in normal operation, which is confusing to these other tools. To silence the progress bar, use the -s switch (but beware that it also silences Curl’s errors). It can be helpful to use -s in combination with -v to create output that you can send to a pager such as less in order to examine it in detail, using a command like this:

curl -s -v http://requestb.in/example 2>&1 | less

The extra 2>&1 is there to send the STDERR output to STDOUT so that you’ll see both headers and body; by default, only STDOUT would be visible to less.

Working with the Web in general, and APIs in particular, means working with data. Curl lets us do that in a few different ways. The simplest way is to send data along with a request in key/value pairs—exactly as when a form is submitted on the Web—which uses the -d switch. The switch is used as many times as there are fields to include:

curl -X POST http://requestb.in/example -d name="Lorna" -d email="lorna@example.com" -d message="this HTTP stuff is rather excellent"

APIs accept their data in different formats; sometimes the data cannot be POSTed as a form, but must be created in JSON or XML format, for example. In such instances, the entire body of a request can be assembled in a file and passed to Curl. Inspect the previous request, and you will see that the body of it is sent as:

name=Lorna&email=lorna@example.com&message=this HTTP stuff is rather excellent

Instead of sending the data as key/value pairs on the command line, it can be placed into a file called data.txt (for example). This file can then be supplied each time the request is made. This technique is especially useful for avoiding very long command lines when working with lots of fields, and when sending non-form data, such as JSON or XML. To use the contents of a file as the body of a request, we give the filename prepended with an @ symbol as a single -d switch to Curl:

curl -X POST http://requestb.in/example -d @data.txt
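If the data already lives in a PHP array, PHP can generate the contents of such a file for us. This is a small sketch rather than a required step; the temporary-directory path is an assumption chosen only so that the example is safe to run anywhere.

```php
<?php

$fields = [
    'name'    => 'Lorna',
    'email'   => 'lorna@example.com',
    'message' => 'this HTTP stuff is rather excellent',
];

// Form-encoded body, as a web form would send it. Note that
// http_build_query() percent-encodes special characters for us.
$formBody = http_build_query($fields);

// The same data as JSON, for an API that expects a JSON body.
$jsonBody = json_encode($fields);

// Write one of them to a file that can be passed to Curl with -d @data.txt.
// The temporary directory is used here only so the sketch is safe to run.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'data.txt';
file_put_contents($path, $jsonBody);

echo $formBody, "\n";
echo $jsonBody, "\n";
```

Unlike the plain -d switch, http_build_query() encodes the spaces and the @ sign, so the form body comes out as name=Lorna&email=lorna%40example.com&message=this+HTTP+stuff+is+rather+excellent. When sending the JSON version, remember that Curl labels -d data as form-encoded unless a different Content-Type header is supplied.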

Working with the extended features of HTTP requires the ability to work with various headers. Curl allows sending of any desired header (this is why, from a security standpoint, headers can never be trusted!) by using the -H switch, followed by the full header to send. The command to set the Accept header to ask for an HTML response becomes:

curl -H "Accept: text/html" http://requestb.in/example
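Because a client can put anything at all in a header, the receiving side should only ever act on values that it recognizes. Here is a minimal sketch of that idea for the Accept header; the function name and the list of supported formats are assumptions, and the q-values are deliberately ignored to keep it short.

```php
<?php

/**
 * Pick a response format from a client-supplied Accept header.
 * Only formats on our own list are ever returned, whatever the client sends.
 * This sketch ignores q-values and simply takes the first supported type.
 */
function chooseFormat(string $acceptHeader, array $supported, string $default): string
{
    foreach (explode(',', $acceptHeader) as $part) {
        // Drop parameters such as ";q=0.9" and normalise the media type.
        $mediaType = strtolower(trim(explode(';', $part)[0]));
        if (in_array($mediaType, $supported, true)) {
            return $mediaType;
        }
    }

    return $default;
}

$supported = ['application/json', 'text/html'];

// A well-behaved client asking for HTML.
echo chooseFormat('text/html,application/xhtml+xml;q=0.9', $supported, 'application/json'), "\n";

// A hostile value never reaches the rest of the application.
echo chooseFormat('<script>alert(1)</script>', $supported, 'application/json'), "\n";
```

In a real PHP application, the header value would typically come from $_SERVER['HTTP_ACCEPT'].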

Before moving on from Curl to some other tools, let’s take a look at one more feature: how to handle cookies. Cookies will be covered in more detail in a later chapter, but for now it is just important to know that cookies are stored by the client and sent with requests, and that new cookies may be received with each response. Browsers send cookies with requests as default behavior, but in Curl we need to do this manually by asking Curl to store the cookies in a response and then use them on the next request. The file that stores the cookies is called the “cookie jar”; clearly, even HTTP geeks have a sense of humor.

To receive and store cookies from one request:

curl -c cookiejar.txt http://requestb.in/example

At this point, cookiejar.txt can be amended in any way you see fit (again, never trust information that came from outside the application!), and then sent to the server with the next request you make. To do this, use the -b switch and specify the file to find the cookies in:

curl -b cookiejar.txt http://requestb.in/example

To capture cookies and resend them with each request, use both -b and -c switches, referring to the same cookiejar file. This way, all incoming cookies are captured and sent to a file, and then sent back to the server on any subsequent request, behaving just as they do in a browser.
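The cookie jar is a plain-text file, so it is easy to look inside. The sketch below assumes the tab-separated Netscape cookie file layout that Curl writes, and the sample jar contents are made up for illustration; treat it as a way to explore the file rather than a complete parser.

```php
<?php

/**
 * Parse the contents of a cookie jar file into a list of cookies.
 * Assumes the tab-separated Netscape cookie file layout that Curl writes:
 * domain, subdomain flag, path, secure flag, expiry, name, value.
 */
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\r\n|\n/', $contents) as $line) {
        // Curl marks HttpOnly cookies with a "#HttpOnly_" prefix; these are
        // real cookies, unlike other lines starting with "#", which are comments.
        $httpOnly = strpos($line, '#HttpOnly_') === 0;
        if ($httpOnly) {
            $line = substr($line, strlen('#HttpOnly_'));
        }
        if (trim($line) === '' || $line[0] === '#') {
            continue;
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line we understand
        }
        $cookies[] = [
            'domain'   => $fields[0],
            'path'     => $fields[2],
            'secure'   => $fields[3] === 'TRUE',
            'expires'  => (int) $fields[4],
            'name'     => $fields[5],
            'value'    => $fields[6],
            'httpOnly' => $httpOnly,
        ];
    }

    return $cookies;
}

// Made-up jar contents for illustration; a real file comes from curl -c.
$jar = "# Netscape HTTP Cookie File\n"
     . "requestb.in\tFALSE\t/\tFALSE\t0\tsession\tabc123\n";

foreach (parseCookieJar($jar) as $cookie) {
    echo $cookie['name'], '=', $cookie['value'], "\n";
}
```

To read a real jar, pass file_get_contents('cookiejar.txt') to the function.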

Browser Tools

All the newest versions of the modern browsers (Chrome, Firefox, Opera, Safari, Internet Explorer) have built-in tools or available plug-ins for helping to inspect the HTTP that’s being transferred, and for simple services you may find that your browser’s tools are an approachable way to work with an API. These tools vary between browsers and are constantly updating, but here are a few favorites to give you an idea.

In Firefox, this functionality is provided by the Developer Toolbar and various plug-ins. Many web developers are familiar with FireBug, which does have some helpful tools, but there is another tool that is built specifically to show you all the headers for all the requests made by your browser: LiveHTTPHeaders. Using this, we can observe full details of each request, as seen in Figure 1-3.

Figure 1-3. LiveHTTPHeaders showing HTTP details

All browsers offer some way to inspect and change the cookies being used for requests to a particular site. In Chrome, for example, this functionality is offered by an extension called “Edit This Cookie” and other similar extensions. This shows existing cookies and lets you edit and delete them, and also allows you to add new cookies. Take a look at the tools in your favorite browser and see the cookies sent by the sites you visit the most.

Sometimes, additional headers need to be added to a request, such as when sending authentication headers, or specific headers to indicate to the service that we want some extra debugging. Often, Curl is the right tool for this job, but it’s also possible to add the headers into your browser. Different browsers have different tools, but for Chrome try an extension called ModHeader, seen in Figure 1-4.

Figure 1-4. The ModHeader plug-in in Chrome

PHP

Unsurprisingly, there is more than one way to handle HTTP requests using PHP, and each of the frameworks will also offer their own additions. This section focuses on plain PHP and looks at three different ways to work with APIs: using the built-in Curl extension for PHP, using the pecl_http extension, and making HTTP calls using PHP’s stream handling.

Earlier in this chapter, we discussed a command-line tool called Curl (see Curl). PHP has its own wrappers for Curl, so we can use the same tool from within PHP. A simple GET request looks like this:

<?php

$url = "http://oreilly.com";
$ch = curl_init($url);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);

The previous example is the simplest form, setting the URL, making a request to its location (by default this is a GET request), and capturing the output. Notice the use of curl_setopt(); this function is used to set many different options on Curl handles and it has excellent and comprehensive documentation on http://php.net. In this example, it is used to set the CURLOPT_RETURNTRANSFER option to true, which causes Curl to return the results of the HTTP request rather than output them. In most cases, this option should be used to capture the response rather than letting PHP echo it as it happens.
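Requests over a network can fail, so in practice it is worth checking what curl_exec() returned before using it. The following sketch wraps the GET request in a function with some basic error handling; the function name, the array it returns, and the timeout values are all assumptions rather than anything the extension requires.

```php
<?php

/**
 * Make a GET request with the Curl extension and return the status code
 * and body, throwing if the request could not be completed at all.
 */
function fetchUrl(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // seconds; an arbitrary choice
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);       // seconds; an arbitrary choice

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure (DNS, connection, timeout...).
        $message = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("Request to $url failed: $message");
    }

    // The HTTP status code of the response, e.g. 200 or 404.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return ['status' => $status, 'body' => $body];
}
```

Calling fetchUrl("http://oreilly.com") inside a try/catch block then gives either the status code and body, or an exception describing what went wrong. Note that an error status such as 404 is still a completed request as far as Curl is concerned, so the calling code needs to check the status itself.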

We can use this extension to make all kinds of HTTP requests, including sending custom headers, sending body data, and using different verbs to make our request. Take a look at this example, which sends some JSON data and a Content-Type header with the POST request:

<?php

$url = "http://requestb.in/example";
$data = array("name" => "Lorna", "email" => "lorna@example.com");

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));

curl_setopt($ch, CURLOPT_HTTPHEADER,
    array('Content-Type: application/json')
);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);

Again, curl_setopt() is used to control the various aspects of the request we send. Here, a POST request is made by setting the CURLOPT_POST option to 1, and passing the data we want to send, encoded as a JSON string, to the CURLOPT_POSTFIELDS option. We also set a Content-Type header, which indicates to the server what format the body data is in; the various headers are covered in more detail in Chapter 3.
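To see why the Content-Type header matters, it helps to look at the exchange from the receiving end. This is a minimal sketch of a server deciding how to decode a body based on that header; the function name is made up, and only the two formats used in this chapter are handled.

```php
<?php

/**
 * Decode a request body according to its Content-Type header.
 * Only the two formats used in this chapter are handled.
 */
function parseBody(string $contentType, string $rawBody): array
{
    // Ignore parameters such as "; charset=utf-8".
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));

    switch ($mediaType) {
        case 'application/json':
            $data = json_decode($rawBody, true);
            if (!is_array($data)) {
                throw new InvalidArgumentException('Body is not a JSON object or array');
            }
            return $data;

        case 'application/x-www-form-urlencoded':
            parse_str($rawBody, $data);
            return $data;

        default:
            throw new InvalidArgumentException("Unsupported Content-Type: $mediaType");
    }
}

$json = parseBody('application/json', '{"name":"Lorna","email":"lorna@example.com"}');
$form = parseBody('application/x-www-form-urlencoded', 'name=Lorna&email=lorna%40example.com');

var_dump($json === $form); // both bodies decode to the same array
```

In a real PHP endpoint, the raw body can be read with file_get_contents('php://input') and the header is usually available as $_SERVER['CONTENT_TYPE'].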

The PHP Curl extension isn’t the easiest interface to use, although it does have the advantage of being reliably available. A great alternative if you control your own platforms is to add the pecl_http extension from PECL. This offers a much more intuitive way of working and has both function and object-oriented interfaces. For example, here’s the previous example, this time using pecl_http:

<?php

$url = "http://requestb.in/example";

$data = array("name" => "Lorna", "email" => "lorna@example.com");

$request = new HTTPRequest($url, HTTP_METH_POST);
// Send the data as a JSON string so the body matches the Content-Type header
$request->setRawPostData(json_encode($data));
$request->setHeaders(array("Content-Type" => "application/json"));

$request->send();
$result = $request->getResponseBody();

This extension works more elegantly by creating an HTTPRequest object, and then working with the properties on that object, before calling its send() method. Once the request has been sent, the body of the response is fetched by calling the getResponseBody() method.

Finally, let’s look at one more way of making HTTP requests from PHP: using PHP’s stream-handling abilities with the file functions. In its simplest form, this means that, if allow_url_fopen is enabled (see the PHP manual), it is possible to make a GET request using file_get_contents():

<?php

$result = file_get_contents("http://oreilly.com");

We can take advantage of the fact that PHP can handle a variety of different protocols (HTTP, FTP, SSL, and more) and files using streams. The simple GET requests are easy, but what about something more complicated? Here is an example that makes a similar POST request with headers, this time sending the data as form fields, illustrating how to use various aspects of the streams functionality:

<?php

$url = "http://requestb.in/example";
$data = array("name" => "Lorna", "email" => "lorna@example.com");

$context = stream_context_create(array(
    'http' => array(
        'method' => 'POST',
        'header' => array('Accept: application/json',
            'Content-Type: application/x-www-form-urlencoded'),
        'content' => http_build_query($data)
    )
));

$result = file_get_contents($url, false, $context);

Options are set as part of the context that we create to dictate how the request should work. Then, when PHP opens the stream, it uses the information supplied to determine how to handle the stream correctly—including sending the given data and setting the correct headers.
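The same approach works for a JSON body: only the Content-Type header and the content option change. The sketch below builds such a context and then inspects it without sending anything; the function name is an assumption, and ignore_errors is included as one possible choice so that the body of an error response would still be returned.

```php
<?php

/**
 * Build a stream context for a JSON POST request.
 */
function jsonPostContext(array $data)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => [
                'Accept: application/json',
                'Content-Type: application/json',
            ],
            'content' => json_encode($data),
            // Return the body even for 4xx/5xx responses instead of failing.
            'ignore_errors' => true,
        ],
    ]);
}

$context = jsonPostContext(["name" => "Lorna", "email" => "lorna@example.com"]);

// Inspect what would be sent, without making a request.
$options = stream_context_get_options($context);
echo $options['http']['method'], ' ', $options['http']['content'], "\n";

// To send it: $result = file_get_contents($url, false, $context);
```

Being able to inspect the options like this is handy when a request isn't behaving as expected, since it shows exactly what PHP has been asked to send.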

As you can see, there are a few different options for dealing with HTTP, both from PHP and the command line, and you’ll see all of them used throughout this book. These approaches are all aimed at “vanilla” PHP, but if you’re working with a framework, it will likely offer some functionality along the same lines; all the frameworks will be wrapping one of these methods so it will be useful to have a good grasp of what is happening underneath the wrappings. After trying out the various examples, it’s common to pick one that you will work with more than the others; they can all do the job, so the one you pick is a result of both personal preference and which tools are available (or can be made available) on your platform.
