Chapter 4. Cookies
The HTTP protocol is stateless. This means that every request made must include all the information needed in order for the web server to serve the correct response. At least, in theory! In practice, that isn’t how we experience the Web as users. As we browse around a shopping site, the website “remembers” which products we already viewed and which we placed in our basket—we experience our journeys on the Web as connected experiences.
So how does this work? Additional information is being saved and sent with our web requests through the use of cookies. Cookies are just key/value pairs; simple variables that can be stored on the client and sent back to us with future requests. A user’s choice of theme or accessibility settings could be stored, or a cookie could be dropped to record something as simple as whether the user has visited the site before, or dismissed a particular alert message that was shown.
Cookie Mechanics
This isn’t the moment where I tell you how to bake cookies, although the instructions do read a little bit like a recipe. What happens when we work with cookies goes something like this (see Figure 4-1):
1. A request arrives from the client, without cookies.
2. Send the response, including cookie(s).
3. The next request arrives. Since cookies were already sent, they will be sent back to us in these later requests.
4. Send the next response, also with cookies (either changed or unchanged).
5. Steps 3–4 are repeated indefinitely.
The main thing to remember is that, for a first visit from a new client (or someone who clears their cookies), there will be no cookies, so it is not possible to rely on them being present. This is easy to miss in testing unless you consciously make the effort to also test the case in which a user arrives without cookies; by default, your browser will keep sending the cookies.
Another thing to note is that cookies are only sent back with subsequent requests by convention; not all clients will do this automatically. Once a cookie is received by a client, even if it isn’t sent again in any later responses, most clients will send that cookie with each and every subsequent request.
The most important thing to remember about cookies is that you cannot trust the data. When a cookie is sent to a client, it will be stored in plain text on that computer or device. Users can edit cookies as they please, or add and remove cookies, very easily. This makes incoming cookie data about as trustworthy as data that arrives on the URL with a GET request.
To put that a little more plainly: do not trust cookie data.
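One way to act on this advice is to treat every cookie value as untrusted input and accept it only if it matches a known-good list. The sketch below assumes a hypothetical "theme" cookie and an invented set of allowed values; neither comes from a real application. It takes the cookie array as a parameter so that it can be tried from the command line, whereas in a web request you would pass in PHP's $_COOKIE array (introduced later in this chapter).

```php
<?php
// Hypothetical helper: the cookie name "theme" and the allowed values
// are assumptions made for this example.
function themeFromCookies(array $cookies, string $default = "light"): string
{
    $allowed = ["light", "dark", "high-contrast"];
    $value = $cookies["theme"] ?? null;

    // Accept the value only if it is a string that exactly matches
    // one of the known themes; anything else falls back to the default.
    if (is_string($value) && in_array($value, $allowed, true)) {
        return $value;
    }
    return $default;
}

// In a web request this would be themeFromCookies($_COOKIE).
echo themeFromCookies(["theme" => "dark"]), "\n";      // dark
echo themeFromCookies(["theme" => "<script>"]), "\n";  // light
echo themeFromCookies([]), "\n";                       // light
```

The same function also covers the first-visit case, because a missing cookie is handled in the same way as an invalid one.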
How do users edit their data? Well, there are a couple of options. First of all, let’s look at using cookies with Curl. We can capture cookies into a “cookie jar” by using the -c switch. Take a look at what a well-known site like amazon.com sets for a new visitor:
curl -c cookies.txt http://www.amazon.com/
The cookie jar file that was saved will look something like this:
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.com	TRUE	/	FALSE	1355305311	skin	noskin
.amazon.com	TRUE	/	FALSE	2082787201	session-id-time	2082787201l
.amazon.com	TRUE	/	FALSE	2082787201	session-id	000-0000000-0000000
The format here contains the following elements:
- Domain the cookie is valid for
- Whether it is valid for all machines on this domain (usually TRUE)
- Path within the domain that this cookie is valid for
- Whether this cookie is only to be sent over a secure connection
- When this cookie will expire
- Name of the cookie
- Value of the cookie
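To make the field order concrete, here is a minimal sketch that splits one line of a cookie jar file into the seven elements listed above. The function name and the array keys are invented for this example, and it assumes that the fields are separated by tab characters, as they are in the files Curl writes.

```php
<?php
// Hypothetical helper; the function name and array keys are assumptions.
// Returns null for comments, blank lines, and lines without 7 fields.
function parseCookieJarLine(string $line): ?array
{
    $line = rtrim($line, "\r\n");

    // Skip blank lines and comment lines. Note: curl marks HttpOnly
    // cookies with a "#HttpOnly_" prefix, so this simple sketch
    // skips those entries as well.
    if ($line === "" || $line[0] === "#") {
        return null;
    }

    $fields = explode("\t", $line);
    if (count($fields) !== 7) {
        return null;
    }

    [$domain, $allHosts, $path, $secure, $expires, $name, $value] = $fields;

    return [
        "domain"    => $domain,
        "all_hosts" => $allHosts === "TRUE",
        "path"      => $path,
        "secure"    => $secure === "TRUE",
        "expires"   => (int) $expires,
        "name"      => $name,
        "value"     => $value,
    ];
}

$cookie = parseCookieJarLine(".amazon.com\tTRUE\t/\tFALSE\t1355305311\tskin\tnoskin");
echo $cookie["name"], "=", $cookie["value"], "\n"; // skin=noskin
```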
Note the phrase “Edit at your own risk,” which translates to developers as “Edit, and interesting things may happen.” Whether working with a browser or Curl, it is possible to change these values wherever the cookies are stored, and they will be sent back to the server with a later request. With Curl, change the -c switch to a -b switch to send the cookies back with a request (use them both together to also capture incoming ones back into the file).
In the browser, your options will vary depending on which browser you use, but all of the modern browsers have developer tools either built in or available via a plug-in that enables you to see and to change the cookies that are being sent, as was mentioned in Browser Tools.
Working with Cookies in PHP
Cookies are key/value pairs, as I’ve mentioned, that are sent to the browser along with some other information, such as which paths the cookie is valid for and when it expires. Since PHP is designed to solve “the Web problem,” it has some great features for working with cookies. To set a cookie, use a function helpfully called setcookie():
<?php
setcookie("visited", true);
We can use this approach to show a welcome message to a visitor when they first come to the site, because without any previous cookies, they won’t have the “visited” cookie set. Once they have received one response from this server, their “visited” cookie will be seen on future requests. In PHP, cookies that arrived with a request can be found in the $_COOKIE superglobal variable. It is an array containing the keys and values of the cookies that were sent with the request.
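The welcome-message idea can be sketched as follows. The function name and the greeting text are invented for this example, and the cookie array is passed in as a parameter so that the logic can be tried outside a web request; on a real page you would pass $_COOKIE.

```php
<?php
// Hypothetical helper: picks a greeting based on whether the
// "visited" cookie arrived with the request.
function greetingFor(array $cookies): string
{
    return isset($cookies["visited"])
        ? "Welcome back!"
        : "Welcome! It looks like this is your first visit.";
}

// On a real page, call setcookie("visited", "1") before sending any
// output, then print greetingFor($_COOKIE).
echo greetingFor([]), "\n";                 // first visit: no cookies yet
echo greetingFor(["visited" => "1"]), "\n"; // later visit
```

Testing with an empty array is a cheap way to exercise the no-cookies case, which is easy to miss while your own browser keeps sending the cookie back.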
When working with APIs, the same facilities are available to us. When PHP is a server, the techniques of using setcookie and checking for values in $_COOKIE are all that are needed, exactly like when we are working with a standard web application. When consuming external services in PHP, it is possible to send cookie headers with our requests in the usual way.
There’s some nice support for sending cookies in PHP’s curl extension, which has a specific flag for setting cookies rather than just adding headers. With PHP’s curl extension, it is possible to do something like this:
<?php
$url = "http://requestb.in/example";

$ch = curl_init($url);
// Send a "visited" cookie with the request
curl_setopt($ch, CURLOPT_COOKIE, "visited=true");
// Return the response body from curl_exec() instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
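When more than one cookie needs to be sent, the value given to CURLOPT_COOKIE is a single string of name=value pairs separated by a semicolon and a space. The helper below is a hypothetical convenience for building that string from an array. Its name is invented, and URL-encoding the values is an assumption that suits servers which decode cookie values, as PHP does when it fills $_COOKIE.

```php
<?php
// Hypothetical helper: turns ["name" => "value", ...] into the
// "name=value; name2=value2" format expected by CURLOPT_COOKIE.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        // Encode values so that spaces and semicolons cannot break
        // the header (an assumption; not every server decodes them).
        $pairs[] = $name . "=" . rawurlencode((string) $value);
    }
    return implode("; ", $pairs);
}

echo buildCookieHeader(["visited" => "true", "theme" => "high contrast"]), "\n";
// visited=true; theme=high%20contrast

// Usage with an existing curl handle $ch:
// curl_setopt($ch, CURLOPT_COOKIE, buildCookieHeader(["visited" => "true"]));
```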
A selection of other options can be set on cookies, as seen in the cookie jar file in Cookie Mechanics. The expiry date is probably the most-used setting. It tells the client how long the cookie is valid for; after this time, the cookie will expire and not be sent with any later requests. This relies on the client and server clocks being vaguely in sync, which is often not the case. Having exactly matching clocks is rare, and in some cases clients can have their clocks set incorrectly by a number of years, so beware.
The expiry can be set in the past to delete a cookie that is no longer needed. If an expiry has not been set for a cookie, it becomes a “session cookie,” which means that it is valid until the user closes the browser. This is why you should always close your browser in an Internet cafe, even after logging out of your accounts.
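The three cases described here (a cookie with an expiry, a deleted cookie, and a session cookie) might look like this with setcookie(). This is a sketch only: the function names, cookie names, and the 30-day lifetime are all invented for illustration. The setcookie() calls are wrapped in functions because they only make sense during a web request, before any output has been sent.

```php
<?php
// Hypothetical helper: a Unix timestamp $days from now. $now can be
// supplied to make the result predictable when trying it out.
function expiryInDays(int $days, ?int $now = null): int
{
    return ($now ?? time()) + $days * 86400;
}

// Keep the (assumed) "theme" cookie for 30 days on every path.
function rememberTheme(string $theme): bool
{
    return setcookie("theme", $theme, expiryInDays(30), "/");
}

// An empty value with an expiry in the past asks the client to delete
// the cookie. The path should match the one used when it was set.
function forgetTheme(): bool
{
    return setcookie("theme", "", time() - 3600, "/");
}

// No expiry given: a session cookie that lasts until the browser closes.
function dismissBanner(): bool
{
    return setcookie("banner_dismissed", "1");
}

echo expiryInDays(30, 1000000000), "\n"; // 1002592000
```

Because the expiry is an absolute timestamp calculated from the server's clock, a client whose clock is far out of sync may drop the cookie earlier or keep it longer than intended.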
Don’t confuse the “session cookie” with the cookies PHP uses to track user sessions within a web application. You can use traditional PHP sessions in a web service, but it is unusual to do so—usually API requests are more self-contained and stateless than their web equivalents.