Twitter API: Up and Running

Chapter 4. Meet the Twitter API

@kmakice: The ingredients for your Twitter application are found in the methods of the API. This chapter is your shopping cart.

Now that you know what Twitter is all about (Chapter 1) and have the basic skills to play in the sandbox (Chapter 3), it’s time to introduce you to the building blocks for your future application. This chapter describes the specific request methods available through the Twitter API.

The section of the Twitter website that talks about the API groups the methods based on their server paths. This may be a bit confusing, for a few reasons. Some of the terminology is old and doesn’t fit with the way we talk about the service today. It is also very techie language, with words like “destroy” instead of “remove” or even “delete.” To help you get started, in this chapter I’ll drop the tech talk and reorganize the methods into groups reflecting how you might actually use them.

In each section, I’ll present one of the 40 existing API methods and explain what is needed to get data from Twitter using that method. Before we start talking about parameters and data formats, though, you need to understand how to connect to the API.

Note

Chapters 6 through 8 provide a description of a suite of web applications used to illustrate how everything goes together. However, sometimes the best way to understand how something works is to play with the input and output. To help with this, the sample code for this book includes a /test directory containing web forms that interface with the Twitter API. You can use these test pages to see the XML that is returned by Twitter. The sample code can be downloaded from http://www.oreilly.com/catalog/9780596154615/.

The Twitter API is a moving target. Things change, and as a result some of the behavior you read about here will be different from what you encounter while coding.

Twitter eats its own dog food by maintaining an account (@twitterapi) that you can follow to get information about changes to the Twitter API. The timeline began in May 2007 and contains links to discussions and updates. Doug Williams (@dougw) was hired in March 2009 specifically to help Twitter developers, and much of that contact will take place through the @twitterapi account. Another good source for this information is the Change Log maintained on the Twitter API wiki.

Accessing the API

An application programming interface, or API, is what allows an application with data to share it with the rest of the world. An API is like a no-frills website, accessed through URL requests but returning structured data instead of web pages displayed in a browser. The data returned is structured to make it easy to parse and get to the information inside. APIs also tend to separate all of the functionality of the site into single, specific actions, such as “get a list of tweets” or “change my profile picture.” By combining several kinds of requests, you can use an API to power your own custom applications.

The design of the Twitter API attempts to adhere to the principles of RESTful systems. Roy Fielding described REpresentational State Transfer (REST) in his 2000 doctoral dissertation. The approach increases the ease of development as well as the scalability and flexibility of applications by making sure the data is layered, stateless, and well defined. Switching from XML to JSON, for example, is a simple matter of changing the extension on the URL used to make the request; it isn’t necessary to reengineer the application or switch development platforms.

In this section, I’ll present you with some basic instructions on how to access the Twitter API, select from HTTP methods, authenticate your API requests, and manage imposed limits.

Note

All requests in this book use HTTPS. This is the preferred way to access the Twitter API.

HTTP Requests

The Twitter API permits three kinds of HTTP requests: GET, POST, and DELETE. The default request is submitted with a GET, which passes parameters as an encoded URL query string. For API methods that change things—for example, updating or deleting status information, direct messages, or associations in the follow network—a POST is needed.

Note

Where indicated in the API methods, the id parameter is sent as part of the URL (substitute this parameter with either a user ID or a username). No form data is needed for any of these methods.
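As a minimal sketch of how the id parameter is folded into the URL, the following Python helper builds a request URL from a method path and an optional ID or username. The name method_url and its arguments are hypothetical, chosen only for illustration; they are not part of any Twitter library.

```python
from urllib.parse import quote

API_ROOT = "https://twitter.com"  # base URL used throughout this chapter


def method_url(path, ident=None, fmt="xml"):
    """Build an API URL, placing the optional ID inside the path itself."""
    if ident is not None:
        # The ID (user ID, username, or record number) becomes a path
        # segment rather than a query string or form field.
        path = f"{path}/{quote(str(ident), safe='')}"
    # The extension selects the response format (xml, json, and so on).
    return f"{API_ROOT}/{path}.{fmt}"


print(method_url("users/show", "kmakice"))
print(method_url("statuses/show", 937878916))
print(method_url("statuses/public_timeline", fmt="json"))
```

Because the format is just the extension, the same helper covers a switch from XML to JSON by changing one argument.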

GET

The GET method accepts a URL and uses it to retrieve something from another server, after any necessary processing is done. If the URL is an index.php web page, for example, the GET method will capture the HTML generated by the PHP, not the PHP code itself.

Header information passed to the GET method can change its behavior. In particular, GET looks at the If-Modified-Since field and captures the output only if doing so fulfills that header condition. The purpose of this constraint is to reduce redundant network activity by avoiding unnecessary data transfers.
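To make the If-Modified-Since idea concrete, here is a small Python sketch that prepares a conditional GET request without sending it. The function name conditional_get is an assumption made for this example, and the timeline URL is simply one of the GET methods listed in this chapter.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def conditional_get(url, last_checked):
    """Prepare (but do not send) a GET that asks only for newer data."""
    request = Request(url)
    # HTTP dates are expressed in GMT, e.g. "Tue, 20 Jan 2009 11:30:00 GMT".
    request.add_header("If-Modified-Since",
                       format_datetime(last_checked, usegmt=True))
    return request


req = conditional_get(
    "https://twitter.com/statuses/public_timeline.xml",
    datetime(2009, 1, 20, 11, 30, tzinfo=timezone.utc),
)
# urllib stores header names capitalized as "If-modified-since".
print(req.get_method(), req.get_header("If-modified-since"))
```

If nothing has changed since the given time, a server that honors the header can answer with a 304 status code and an empty body instead of repeating the data.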

The following Twitter API methods are accessed with a GET:

https://twitter.com/account/rate_limit_status.xml
https://twitter.com/account/verify_credentials.xml
https://twitter.com/direct_messages.xml
https://twitter.com/favorites.xml
https://twitter.com/followers/ids.xml
https://twitter.com/friends/ids.xml
https://twitter.com/friendships/exists.xml
https://twitter.com/help/test.xml
https://twitter.com/statuses/followers.xml
https://twitter.com/statuses/friends.xml
https://twitter.com/statuses/friends_timeline.xml
https://twitter.com/statuses/public_timeline.xml
https://twitter.com/statuses/replies.xml
https://twitter.com/statuses/show/id.xml
https://twitter.com/statuses/user_timeline.xml
https://twitter.com/users/show/id.xml

Note

Most of the methods in the API can be requested using GET. This means your parameters can be passed as a URL query string, or a series of name/value pairs following a ?.

This is great for testing, as all you need to do is type the request URL into the location bar in a regular browser. If the method requires authentication, a dialog box will pop up to get that information. This is more convenient than running early source code or using cURL, as described later in this chapter.
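The same URL you would type into the browser can be assembled in code. The sketch below uses Python's standard urlencode function; the helper name get_url is hypothetical, and count and page are two of the optional parameters described later in this chapter.

```python
from urllib.parse import urlencode


def get_url(base, **params):
    """Append name/value pairs to a URL as an encoded query string."""
    query = urlencode(params)
    return f"{base}?{query}" if query else base


print(get_url("https://twitter.com/statuses/user_timeline.xml", count=5, page=2))
print(get_url("https://twitter.com/help/test.xml"))
```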

POST

The POST method sends a request to a URL just as GET does, but it delivers its parameters in a different way. Whereas there is an upper limit on the size of a GET query string, a POST request carries the submitted data in the request body, allowing more information to be transferred. It treats that bundle of data like an attachment to an email, something separate from and subordinate to the requested URL rather than part of it. Because the data travels separately from the URL, POST data does not usually appear in server access logs, which typically record only the requested URL.

Warning

POST data should not be treated as implicitly secure. It does help guard against simple attacks such as image-based cross-site request forgeries (see http://en.wikipedia.org/wiki/CSRF for more details), but it is only “security by obscurity.”

POST is required for API methods that actually make changes to Twitter’s servers, rather than just retrieving data. This is typical of all APIs and web forms in general, and is not unique to Twitter. In the Twitter API, the following methods require POST request handling:

https://twitter.com/account/end_session
https://twitter.com/account/update_delivery_device.xml
https://twitter.com/account/update_location.xml
https://twitter.com/account/update_profile.xml
https://twitter.com/account/update_profile_colors.xml
https://twitter.com/account/update_profile_background_image.xml
https://twitter.com/account/update_profile_image.xml
https://twitter.com/blocks/create/id.xml
https://twitter.com/blocks/destroy/id.xml
https://twitter.com/direct_messages/destroy/id.xml
https://twitter.com/direct_messages/new.xml
https://twitter.com/favorites/create/id.xml
https://twitter.com/favorites/destroy/id.xml
https://twitter.com/friendships/create/id.xml
https://twitter.com/friendships/destroy/id.xml
https://twitter.com/notifications/follow/id.xml
https://twitter.com/notifications/leave/id.xml
https://twitter.com/statuses/destroy/id.xml
https://twitter.com/statuses/update.xml

Note

POST requests to the API do not count against the rate limit.
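A rough Python sketch of how a POST differs from a GET on the wire: the parameters are encoded the same way but travel in the request body instead of the URL. The helper post_request is a hypothetical name, and the request is only constructed here, not sent (a real call would also need authentication).

```python
from urllib.parse import urlencode
from urllib.request import Request


def post_request(url, **fields):
    """Prepare a POST whose form fields travel in the body, not the URL."""
    body = urlencode(fields).encode("utf-8")
    # Supplying data makes urllib treat the request as a POST.
    return Request(url, data=body)


req = post_request("https://twitter.com/statuses/update.xml",
                   status="Hello, world")
print(req.get_method())  # the HTTP method urllib would use
print(req.full_url)      # the URL carries no query string
print(req.data)          # the encoded form data
```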

DELETE

The Twitter API also accommodates a third HTTP method, DELETE. The purpose of this type of HTTP call is to instruct the remote server to remove the requested URL resource. The client has no way to guarantee that the removal has actually been carried out, however. POST requests work just as well with the API, and in this book we’ll use POST instead of DELETE.

There are only a handful of API methods that will recognize a DELETE request:

https://twitter.com/blocks/destroy/id.xml
https://twitter.com/favorites/destroy/id.xml
https://twitter.com/friendships/destroy/id.xml
https://twitter.com/direct_messages/destroy/id.xml
https://twitter.com/statuses/destroy/id.xml
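For comparison, this short Python sketch prepares the same destroy request both ways. Nothing is sent; the status ID is the sample tweet ID used elsewhere in this chapter, and the variable names are illustrative only.

```python
from urllib.request import Request

url = "https://twitter.com/statuses/destroy/937878916.xml"

# The same resource addressed with each of the two accepted methods.
as_delete = Request(url, method="DELETE")
as_post = Request(url, method="POST")

print(as_delete.get_method(), as_post.get_method())
```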

HTTP Status Codes

One of the bits of information returned to the client in an HTTP response is the status code, a three-digit number used to communicate the type of success or failure encountered. The Twitter API assigns special meanings to many of these codes, which describe specific outcomes of method requests.

The following are some status codes your application may encounter, and what they likely mean in the context of the API:

200—OK: Success! The method request did what you expected it to do.
304—Not Modified: Nothing wrong, but nothing to report.
400—Bad Request: This can be caused by one of two things: either the request was formatted incorrectly (missing required parameters, unknown method, etc.), or the rate limit has been exceeded. Check the returned text for an explanation.
401—Not Authorized: The account (Twitter username or registered email address) or password you used to authenticate to the API isn’t working. Check its accuracy and try again.
403—Forbidden: Twitter understood what you want to do, but won’t let you do it. Check the returned text for an explanation.
404—Not Found: Probably caused by a typo or incorrect path to the API method you are requesting. You might also get this error when trying to request a nonexistent user.
500—Internal Server Error: The Twitter folks may be working under the hood. What you requested is probably OK, but the servers aren’t handling it correctly. Seek counsel from engineers on the Twitter Development Talk Google Group.
502—Bad Gateway: Intentional Fail Whale; Twitter is probably rolling out an upgrade.
503—Service Unavailable: Unintentional Fail Whale (see The Rise of the Fail Whale); there are too many requests for the servers to handle right now.

Twitter will try to return any error messages in the same format being requested for the data, such as this XML version:

<?xml version="1.0" encoding="UTF-8"?>
<hash>
  <error>Authentication required to request your own timeline.</error>
  <request>/statuses/user_timeline.xml</request>
</hash>

The default format is text. Twitter will always try to return some kind of explanation, if it can.
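An XML error like the one above can be picked apart with a few lines of Python. This is only a sketch: the function name parse_error is hypothetical, and it assumes the error arrives in the hash/error/request shape shown in this section.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<hash>
  <error>Authentication required to request your own timeline.</error>
  <request>/statuses/user_timeline.xml</request>
</hash>"""


def parse_error(xml_text):
    """Return the error details as a dict, or None if no error is present."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    message = root.findtext("error")
    if root.tag != "hash" or message is None:
        return None
    return {"error": message, "request": root.findtext("request")}


print(parse_error(SAMPLE))
```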

Status codes are an easy way to direct the application logic. Parsing the returned messages will provide specifics and can be helpful for passing along interpretations of errors to the end user, but status codes are easier to access from the HTTP response. They can help direct error handling or confirm success.
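One way to let status codes direct application logic is a simple lookup, sketched here in Python. The action labels and the function name next_action are assumptions made for this example; the groupings follow the list of codes given earlier in this section.

```python
# Map each status code from the list above to a coarse next step.
ACTIONS = {
    200: "use-data",
    304: "nothing-new",
    400: "check-request-or-rate-limit",
    401: "ask-for-credentials",
    403: "show-error-text",
    404: "check-path-or-user",
    500: "retry-later",
    502: "retry-later",
    503: "retry-later",
}


def next_action(status_code):
    """Choose what the application should do for a given HTTP status."""
    return ACTIONS.get(status_code, "unknown")


for code in (200, 400, 503, 418):
    print(code, next_action(code))
```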

Note

This book uses only the XML format, with one exception: the Keyword Search API method doesn’t yet support XML, so I use Atom instead. The Twitter API does support other formats, as discussed next.

Format

Twitter currently accommodates four kinds of formatted data:

XML: Extensible Markup Language uses semantic tags to wrap data in a structured format. It is extensible because the user can define the structure and kinds of tags; they aren’t simply prescribed, as with HTML. Use of XML to structure data is an accepted way to separate the data layer from the presentation layer and make applications more versatile.
JSON: JavaScript Object Notation is a language-independent text format used primarily to power Ajax applications. With JSON, simple text can be used to represent many different types of data and the relationships between those data types. As with XML, the data is encapsulated in a structured format; however, JSON is considered to be simpler than XML. For more information on how to use JSON, visit http://www.json.org.
RSS: Really Simple Syndication is a specific form of XML that reflects some standardized tag structures that can be read in a predictable manner. RSS feeds are widespread on blogs, news websites, and services like Twitter.
Atom: Atom Syndication Format is an alternative to RSS that was created in part to remove the need for legacy support of older protocols. Atom uses a different date and time format than RSS and is more accommodating of modular use and international support.

We’ll use XML in this book, but to switch to JSON, simply change the URL extension from .xml to .json in the HTTP request. That’s what RESTful design principles do for you!

RSS and Atom

Only a few Twitter API methods make use of the RSS and Atom formats. The following are used on the official Twitter website, allowing people to subscribe to information streams:

https://twitter.com/direct_messages.rss
https://twitter.com/favorites.rss
https://twitter.com/statuses/friends_timeline.rss
https://twitter.com/statuses/public_timeline.rss
https://twitter.com/statuses/replies.rss
https://twitter.com/statuses/user_timeline.rss

Authentication

Most API requests require a valid username and password. Authentication is necessary for two reasons. First, some of the information available through the API is specific to the authenticated user, so the user context determines what data is returned. Second, authentication is the most reliable way to facilitate limiting the rate of access to the API. Imposing rate limits was necessary to ensure the success of Twitter, in terms of both cultivating members and encouraging third-party application development.

Note

Asking for authentication information from users is difficult to avoid for some of the functions and data available through the Twitter API. For further discussion of some of the issues involved with asking end users to provide their usernames and passwords, see Gone Phishing: A Word About Passwords.

Although Twitter does have plans to improve the scheme to use OAuth,^[62] as of this writing authentication is done through HTTP Basic Authentication, referred to as Basic Auth. Twitter asks for either an account username or the email address used to create the account, along with a password, before doing any of the heavy lifting in terms of data retrieval or modification. Any future changes are likely to become optional improvements, with the current system remaining fully supported.

Warning

Be aware that Basic Auth only lightly obfuscates the username and password (they are Base64-encoded, not encrypted), so over plain HTTP they are effectively readable to anyone monitoring the network. cURL supports HTTPS requests, and Twitter recommends using encryption to interact with the API.
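The following Python sketch shows why Basic Auth credentials should be treated as readable. It builds the Authorization header value the way HTTP Basic Authentication defines it, then reverses it. The helper name basic_auth_header and the sample password are made up for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authentication header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = basic_auth_header("kmakice", "not-my-real-password")
print(header)

# Anyone who can see the header can recover the credentials.
recovered = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(recovered)
```

Only the encrypted channel provided by HTTPS keeps the header itself out of view.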

If a method is requested without a valid username and password, the data response will be XML containing the following error statement:

<hash>
  <error>Could not authenticate you.</error>
  <request>/account/verify_credentials.xml</request>
</hash>

If you are not interested in the specifics of the error message (which can easily change), examine the HTTP status code as a quicker and more reliable indication of success or failure (see HTTP Status Codes).

There are a few API methods that return data without authentication. Here’s the complete list:

View the Public Timeline: https://twitter.com/statuses/public_timeline.xml
View an Individual Timeline (public accounts only): https://twitter.com/statuses/user_timeline/14067832.xml
Show a Tweet: https://twitter.com/statuses/show/937878916.xml
Show Member Profile: https://twitter.com/users/show/id.xml
Keyword Search: https://search.twitter.com/search.atom
Test: https://twitter.com/help/test.xml

All of these unauthenticated methods require a GET request, and some (such as Member Profile) will provide more information if the user context is known when the information is retrieved.

Note

At press time, Twitter’s OAuth was in beta, being tested by the developer community. One of the testers, Abraham Williams, quickly published some sample code at http://github.com/poseurtech/twitteroauth. See the next section for more on OAuth.

A Peek at OAuth

In February 2009, Twitter released its first implementation of OAuth as a closed beta to developers on the Google discussion group. A few hours after this release, Inuda, a web application design firm, quickly showed a proof of concept with Twitter’s code.^[63] Within a week, successful tests and sample code existed for PHP, Python, and Ruby. A growing list of OAuth resources is available on the Twitter API wiki.

Among those efforts was a sample script from Abraham Williams (@poseurtech).^[64] Williams’ solution follows a straightforward process to authenticate with Twitter’s OAuth. OAuth functions by managing multiple pairs of tokens: the tokens for the specific user request, and the ones used to allow the application to later access parts of that user’s Twitter account. There is also an initial pair used to register the application.

Each application will first need to be registered with Twitter. Developers in the beta test were given a new tab in the Settings section of the Twitter website that allowed them to make this request. Twitter returns the key/secret tokens for the application to build a TwitterOAuth object from the registered URI. With the key and secret strings, the application requests tokens for the user with a new OAuth method. These tokens become part of a request link that the end user can click on to grant the application access to her Twitter account.

In Basic Auth, Twitter isn’t involved with this handoff of account access. Users share their screen names and passwords with the third-party application, which then uses them to access the API on their behalf. Not only is this an all-or-nothing level of access, but it is the same access that would be shared with other third-party applications. The result is a network of systems reliant on the same authentication information.

With OAuth, Twitter becomes the middleman in the negotiation between the application and the user. The application uses its identifying tokens—acquired through the initial registration—to request access to the user’s account in the form of a request link. This link is presented to the user on the third-party application, sort of like how a parking receipt is given to a conference attendee to take to the information desk to be validated. When the user clicks on this link, he is taken to the Twitter website, along with the tokens that identify which application will need the account access. At this point, the application is no longer involved; it is a dialogue between the user and Twitter. If the former approves access, he is taken back to the third-party application site with the token pair needed to make future API requests.

This only gets us halfway there. The application exchanges the approved request tokens for a pair of access tokens, which it saves locally and uses whenever more API interaction is required. The request tokens themselves don’t grant access; it is the access tokens that authorize future requests for data without the user being present. Assuming the user hasn’t revoked access in the interim, the application can present the access tokens to get to the user’s data, sort of like giving a special pass to a bouncer to get backstage at a concert.

The big benefit of OAuth is that it avoids password-sharing behavior. Instead of asking the user to provide her actual username and password, the OAuth process results in a new form of the user authentication that can only be used by this particular application. For third-party applications, there is more accountability, since the tokens are unique. For users, there is more control, since you can revoke or deny access rights to one application while granting it to another.

Caution

OAuth is still a work in progress, not just for Twitter but also as a method and movement. By mid-2009, Twitter will be releasing its OAuth to production applications, but that doesn’t mean every application will be inclined or able to use it.

Parameters

The information returned by Twitter can be refined during the HTTP request. Most API methods accept one or more parameters, most of which are optional but some of which are required. Not all parameters are available to every method.

The Twitter API expects all parameter values to be UTF-8 text that has been URL-encoded, which means you can’t send some special characters as plain text without confusing the machines that have to deal with that information. This is particularly important in differentiating between parameter string connectors, such as & and =, and the actual values contained within each parameter. Encoding gives the API a means to distinguish between the two. Fortunately, most programming languages (including PHP) have functions that make encoding parameter values simple.
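As an illustration in Python (used here only as a convenient example language), the standard library's quote_plus and urlencode functions handle this encoding. The sample status text is invented to include both of the troublesome connector characters.

```python
from urllib.parse import quote_plus, urlencode

status = "R&D = fun"

# Encoding a single value: & and = inside the text are escaped.
print(quote_plus(status))

# Encoding whole name/value pairs: the connectors between pairs stay
# literal, while the same characters inside values are escaped.
print(urlencode({"status": status, "page": 2}))
```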

Note

Angle brackets (< and >), double quotes ("), and the ampersand (&) are converted to entities as a security precaution against attacks from web applications. The resulting encoded characters count toward the 140-character limit for Twitter messages.
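To see how entity conversion eats into the character budget, consider this Python sketch. The function to_entities is a hypothetical stand-in that converts only the four characters named in the note; it is an approximation for counting purposes, not a copy of Twitter's own conversion code.

```python
ENTITIES = (("&", "&amp;"), ("<", "&lt;"), (">", "&gt;"), ('"', "&quot;"))


def to_entities(text):
    """Convert the four special characters to their entity forms."""
    # Replace & first so ampersands introduced by the other entities
    # are not converted a second time.
    for char, entity in ENTITIES:
        text = text.replace(char, entity)
    return text


message = '<3 & "hi"'
converted = to_entities(message)
print(len(message), message)
print(len(converted), converted)
```

A message that looks short on screen can therefore use noticeably more of the 140 characters than expected.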

I have gathered together all of the possible parameters, both required and optional, into one place to give you a convenient reference. Not every API method will accept all of these, but the parameters work the same way for the methods that do recognize them.

Parameters that may be required

For several of the Twitter API methods, it is not enough to just send the URL; the request also requires that some parameter value be included to let Twitter work its magic. The following are descriptions of parameters that may be required for some API methods:

id/user_a and user_b/user: A Twitter user can be referenced using either an integer ID or the username associated with that account. For status or direct messages, the ID must reflect an existing record number. The id (or similar) parameter is passed as part of the URL request.
status/text: The text for a status update or direct message is limited to 140 characters after URL encoding.
location: The location of the user is not standardized in any way. Any encoded text can be published to the location field in the user’s profile.
device: The device must be one of the two valid options supported by Twitter, namely sms or im. To turn off device notifications, the value should be none.
Note
Although this parameter allows for future expansion, in practice it can only be used to turn cell phone messages on or off—Twitter officially discontinued IM support in 2008.
q: (Search only.) Although the Keyword Search method will return header information if you don’t include a query string, any API requests without this parameter are almost meaningless. The q string must be URL-encoded.

Other parameters that may be useful

The power and efficiency of the API are increased when you send additional instructions to Twitter to shape what information is returned. In addition to any parameters that a given method may require, there are usually additional options that can be set to filter data before it is returned to you. The following list describes parameters that may be optional for some API methods:

id, in_reply_to_status_id

Even when it isn’t required, the value of id will still be either an integer ID or the username associated with the account when referring to a person (for user methods), or an existing record number when referencing a message (for status methods). The id parameter is passed as part of the URL itself, not as a separate query string or POST field value.

The in_reply_to_status_id parameter references the specific status update by another user with which a tweet is associated.

email

A user’s email address can be used if you do not know the user ID or username. This parameter is also used to edit the member profile and change which email address is associated with a given account.

user_id

This parameter overrides id (which can contain either a user ID or screen name) with the ID of the specified member account. It was added to prevent ambiguities when the screen name is a number.

screen_name

This parameter overrides id (which can contain either a user ID or screen name) with the screen name of the specified member account. It was added to prevent ambiguities when the screen name is a number.

since

This parameter can be used to limit results to the most recent activity, performing the same function as the If-Modified-Since header in an HTTP request. This filter ignores information older than the specified time (which must be within the last 24 hours). The value must be encoded to be in the form Tue%2C+20+Jan+2009+11%3A30%3A00+GMT.

since_id

This parameter functions similarly to since, except that it filters based on the ID of a specific status update or direct message instead of a date. Twitter returns only the records with IDs greater than the specified value (i.e., records that postdate the message with the specified ID).

count

For timeline requests, count limits the results to the n most recent status updates, where n is the integer value specified in the request. The maximum allowed count value is 200.

rpp

(Search only.) This parameter (results per page) specifies the number of status messages to return on each page, given a specified search term. The maximum allowed rpp value is 100.

page

This integer value paginates the Twitter results for status updates, direct messages, and members of a follow network. For status updates, each page contains up to 20 items. For followers and people you follow, each page holds up to 100 authors. The Keyword Search method also uses this parameter, but it allows the application to dictate (with rpp) how many of the 1,500 matching tweets are displayed at a time (and therefore how many pages are needed to browse the full corpus).

Note

The page parameter always begins at 1, which is the default, not 0.

follow

This is a Boolean value that indicates whether you want to be notified on your cell phone or some other device when the user indicated by the id parameter posts a status update. This parameter is used to enable notifications at the same time that you begin following a new person.

name

The full name of a member is often more readable and meaningful than the user account handle. This option allows you to change the full name listed on a member account. The maximum length is 40 characters.

url

Each member can associate her account with a single link to a website. This parameter allows you to change the URL listed for a member account. The maximum length is 40 characters.

location

The location of the user is not standardized in any way: any encoded text not longer than 30 characters can be published to the location field in the profile. This parameter allows you to change the location listed for a given member account.

description

This parameter specifies the text (maximum 160 characters) describing a member or organization using Twitter. This description shows up on the member’s Twitter profile web page.

image

The background image to display or tile behind your Twitter member profile web page can be controlled with this parameter. The image must be a GIF, JPG, or PNG and cannot exceed 2,048 pixels or 800 KB. For the profile picture associated with all of your tweets, the maximum values are 500 pixels and 700 KB.

profile_background_color, profile_text_color, profile_link_color, profile_sidebar_fill_color, profile_sidebar_border_color

These parameters control the web page color scheme for a user’s Twitter member profile. Each option must be specified using a valid hexadecimal code (as in “#f09” or “#ff0099”). The colors are set through the update_profile_colors method (see Update Profile Colors) and dictate how most of the text, links, borders, and shading are displayed.

show_user

(Search only.) When set to true, this parameter tags the beginning of each tweet that is returned in the search results with a username and a colon (e.g., “kmakice:Writing”).

geocode

(Search only.) This parameter filters the search results by location, using the self-disclosed location information in the author profiles. The geocode parameter has three parts: latitude, longitude, and radius of interest. The resulting comma-delimited string must be URL-encoded, as in 39.123456%2C-86.345678%2C10km. The radius must be specified in units of either mi (miles) or km (kilometers).

Note

Twitter members use the location field in different ways. Most of the time, the information is accurate—thanks in part to the propagation of smart phones, like the iPhone, among Twitter users—but it can also be out-of-date or nonsensical (e.g., “Space”).

lang

(Search only.) This parameter filters the search results by language, using an accepted ISO 639-1 code such as en, es, or fr.

callback

(Search and JSON only.) A callback allows a program to pass a reference to a dynamic function on the application side. Because we are focusing on XML in this book, we won’t use callbacks.
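Several of the optional parameters above expect values in a specific form. The Python sketch below gathers small helpers for four of them: the encoded date for since, the encoded string for geocode, a check for the profile color codes, and the page count implied by rpp. All function names are hypothetical, and treating the leading # in a color as optional is an assumption of this example.

```python
import math
import re
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import quote_plus


def encode_since(moment):
    """Format a UTC datetime as an HTTP date, then URL-encode it."""
    return quote_plus(format_datetime(moment, usegmt=True))


def encode_geocode(latitude, longitude, radius, unit="km"):
    """Join latitude, longitude, and radius with commas, then URL-encode."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    return quote_plus(f"{latitude},{longitude},{radius}{unit}")


def is_hex_color(value):
    """Accept three- or six-digit hexadecimal colors such as #f09."""
    pattern = r"#?(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})"
    return re.fullmatch(pattern, value) is not None


def pages_needed(total_results, per_page):
    """Number of pages required to browse total_results at per_page each."""
    return math.ceil(total_results / per_page)


print(encode_since(datetime(2009, 1, 20, 11, 30, tzinfo=timezone.utc)))
print(encode_geocode(39.123456, -86.345678, 10))
print(is_hex_color("#ff0099"), is_hex_color("#ff00"))
print(pages_needed(1500, 100))
```

The first two helpers reproduce the encoded sample values shown for since and geocode.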

Rate Limiting

In late 2007, use of Twitter’s API reached a sufficient level that some throttling of requests had to be instituted. Clients are now permitted only so many requests every 60 minutes, measured from the time the first request is made. The current level is 100 requests each hour, but the limit fell as low as 30 during the worst of the server strains in the summer of 2008.

Two kinds of rate limits are being tracked: authenticated and unauthenticated accesses. When you authenticate with a valid Twitter username and password, the API starts tallying when you use the API and charges requests against the rate limit for that account. When you are able to get data without authentication, requests are tracked for the IP address you’re using.

Note

In theory, this might allow you to double your rate limit to 200 GET requests per hour, provided no authentication is needed for the data you are interested in getting. If you really have need for that many accesses, however, I suggest requesting whitelisting from Twitter, as discussed later in this section.

In general, any method using POST is exempt from rate limiting. This includes any request where the server data is changed, such as requests related to adding or deleting status updates, direct messages, or associations in the follow network.

When a request is made that exceeds an account or IP address’s rate limit, a data response is sent that indicates the error:

<hash>
  <request>/statuses/user_timeline.xml</request>
  <error>Rate limit exceeded. Clients may not make more than 100 requests
    per hour.</error>
</hash>

Until an hour has passed since the first of those 100 requests, no more data will be accessible. A status code of 400 is also returned, providing another indicator that a problem has occurred without it being necessary to parse the XML, JSON, or plain text containing the specific error message.
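An application can avoid hitting this wall by counting its own requests. The Python sketch below is one possible client-side tally, assuming the behavior described above: a fixed budget per 60-minute window that opens with the first request. The class name HourlyBudget is hypothetical, and the counter is only an estimate of what Twitter's servers are tracking.

```python
class HourlyBudget:
    """Track requests against a fixed budget per time window."""

    def __init__(self, limit=100, window=3600):
        self.limit = limit          # requests allowed per window
        self.window = window        # window length in seconds
        self.window_start = None    # time of the first request in the window
        self.used = 0

    def allow(self, now):
        """Return True and count the request if the budget permits it."""
        expired = (self.window_start is not None
                   and now - self.window_start >= self.window)
        if self.window_start is None or expired:
            # The clock starts (or restarts) with this request.
            self.window_start = now
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


budget = HourlyBudget(limit=2)
# Times are seconds; a real application might pass time.time().
print([budget.allow(t) for t in (0, 10, 20, 3600)])
```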

Note

The search API is currently handled a little differently than the rest of the Twitter API, due to its history as a third-party application (Summize). Future versions of the Keyword Search method may be similarly limited, but for the time being, Twitter simply monitors for abuse and acts on a case-by-case basis. Any rate limits for the search API are “a bit fuzzier” than the main Twitter API and are based entirely on IP address.

Checking the rate limit status

To help developers manage their access to the API, Twitter created a special method to return information about the rate limit status of the authenticating account:

https://twitter.com/account/rate_limit_status.xml

Requesting this method does not count against the rate limit and can provide useful information about the current maximum, the number of hits remaining in the current hour, and when the clock will be reset (both in absolute clock time and elapsed seconds). The Check Rate Limit Status method is discussed in API Administration.

Whitelisting

For most uses of the API, developers can work within the rate limit. However, in some cases, an application will require more than 100 requests per hour in order to provide its functionality. Twitter sometimes makes allowances for this by adding user accounts to a whitelist, where limits are raised or eliminated.

If you are developing an application that requires a lot of requests to the API, you can submit a web form (https://twitter.com/help/request_whitelisting) and ask Twitter to consider you for addition to the whitelist of high-volume screen names. If you are approved, your upper limit rises from 100 to 20,000 API requests per hour.

Keeping Development Light

In the past, Twitter engineers have taken a beating in the blogosphere about their ability (or rather, inability) to keep the service running under stress. Although much of the traffic to Twitter comes from the API, outages have rarely been attributed to API traffic. One of the reasons for that may be the developer community’s willingness to play nice with the servers.

Note

At a gathering of regional API providers, Twitter’s Alex Payne hinted that Ph.D. students grabbing data is the biggest problem APIs face.^[65]

In addition to working within the restrictions the company puts on accessing its data, Twitter advises adopting a few other strategies:

Load the minimum: Get more data only when the action is triggered by user interaction.
Maintain a local archive: If you will need to request the same kind of information again, look to your local copy first to avoid repeatedly asking for the same content.
Paginate: Make use of the page parameter rather than count. If you use count, combine it with the since_id parameter to retrieve only new information.
Identify yourself: Set the User-Agent header in the HTTP request to help engineers troubleshoot your applications.
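
Several of these strategies can be combined in a single request. The Python sketch below is a hedged illustration, not a prescribed implementation: it consults a local archive to pick a since_id and sets a User-Agent header. The archive structure, the function name, and the User-Agent string are all assumptions, and the friends timeline URL is one of the methods described later in this chapter. Authentication is left out to keep the example short.

```python
import urllib.parse
import urllib.request

USER_AGENT = "example-app/0.1"  # hypothetical identifier for your application
FRIENDS_TIMELINE = "https://twitter.com/statuses/friends_timeline.xml"


def build_refresh_request(archive, base_url=FRIENDS_TIMELINE):
    """Build a GET request that asks only for tweets newer than the local archive.

    `archive` maps status IDs (integers) to whatever the application stored.
    """
    url = base_url
    if archive:
        # The newest ID already held locally becomes the cutoff.
        url += "?" + urllib.parse.urlencode({"since_id": max(archive)})
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


request = build_refresh_request({1001: "first tweet", 1005: "later tweet"})
print(request.full_url)
```

Nothing is sent until the request object is passed to urllib.request.urlopen, so the URL and headers can be inspected first.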

Finally, give credit where credit is due. A simple “Powered by Twitter” link to the service ties your application back to the larger community. Look at the Terms of Service section in the Twitter API wiki for more information.

Note

The Twitter API wiki organizes the methods differently from the way I do in this chapter. If you want to see a simpler list of methods in structural order, including a summary of parameters you can use, check out Appendix A.

Play Along at Home

The best way to learn the Twitter API methods is to try them out as you read through the material in this chapter. One way to test how the API methods work is through the computer command line, using cURL commands:

curl: Invoke the cURL connection.
-u username:password: Authenticate with your Twitter username or ID and password. The username can be substituted with an email address, as only one account is associated with each address.
-d status=your+message+here: For methods that require a POST, this option is needed to get a valid response from Twitter. The text used in conjunction with the POST option sends parameter data to the API to be processed with the request.
https://twitter.com/category/methodname.format: Provide the method URI requested from the Twitter API.

Warning

If cURL returns an error when you use an HTTPS request (specifically, “certificate verify failed”), you can disable the verification of the secured certificate by using the -k option.

This is not an ideal solution, though; I include it here only for convenience. Turning off verification of secure certificates can defeat the purpose of HTTPS encryption. It is better to adjust the certificate bundle to include what you need. Visit http://curl.haxx.se/docs/sslcerts.html for more information on how to do this.

These options can be strung together into a single command that publishes a new status message, as in:

curl -u username:password -d 'status=test' https://twitter.com/statuses/update.xml

If cURL is not installed on your system, you can download it from http://curl.haxx.se/download.html for almost any OS. For more information on using cURL, see cURL.
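
If you would rather script the same request, the cURL example can be approximated with Python's standard library. This is a sketch under stated assumptions: it uses HTTP Basic authentication as described in this chapter, the function name is hypothetical, and the username and password are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_update_request(username, password, status):
    """Mirror `curl -u username:password -d status=...` as a urllib request object."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"status": status}).encode("ascii")
    # Supplying a body makes urllib issue a POST, just as -d does for cURL.
    return urllib.request.Request(
        "https://twitter.com/statuses/update.xml",
        data=body,
        headers={"Authorization": "Basic " + token},
    )


request = build_update_request("username", "password", "test")
print(request.get_method(), request.data)
# To actually send it: urllib.request.urlopen(request)
```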

If you are hesitant about using command-line cURL to test your Twitter API method requests, there are other options to help you see the HTTP status codes and content passing between your machine and the rest of the world.

Charles (http://www.charlesproxy.com) is a debugging proxy ideal for investigating HTTPS traffic and XML interactions that travel to and from your machine. You can download the software for a free 30-day trial and then pay $50 for the full license. The simplest way to look at the Twitter API responses, though, is probably through a regular browser. Simply type the method request into the browser’s location field. If authentication is needed, the browser will ask for it.

Note

Most of the methods in the API can be requested using GET, but not all of them. The methods that make changes to Twitter data, such as posting a new status update or deleting a message, won’t work with GET. They require POST and won’t be usable through a browser.

The API Methods

You may find it easier to navigate the API if we focus on which part of Twitter each method affects. The 40 API methods currently maintained by Twitter can be organized into the following operational groups based on what they’re used for:

Publishing: Changing the content published to Twitter
The Information Stream: Retrieving and managing the content published to Twitter
The Follow Network: Managing the people whom you follow and who follow you
Communication: Exchanging direct messages with other members
Member Account: Dealing with your Twitter account
Administration: Negotiating access to the Twitter API
Search: Looking for keywords in the tweet archives

The sections that follow explore each group, describing how to format requests with each of the API methods and showing some examples of XML output.

Note

Remember that successful requests return an HTTP status code of 200. You can use this to check how you did if you don’t want or need to parse the data response from Twitter.

Publishing

Twitter is nothing without its 140-character posts, or tweets. The API methods in this category manage the creation and removal of tweets. Quite simply, you can use these methods to publish content to or remove it from the Twitter information stream.

Note

When successful, the publishing methods return status objects. See Status Objects for more details about the data that is returned.

Post a Tweet

This method adds a tweet to the information stream for the authenticated user. In Twitter terminology, this is an update of the current member’s status:

https://twitter.com/statuses/update.xml

To make this URL request function, authentication is required (so that the new status message can be assigned to the correct account). Since it involves a change to the service database and not simply a data grab, the POST method is required to encapsulate the parameter data in this request.

The Post a Tweet method requires one parameter, status, filled with encoded text of no longer than 140 characters. Omitting the message results in no status being published and subsequently no content being returned. When successful, Twitter returns XML containing information about the new status update (see Status Objects).

To identify which tool published the tweet—as in “from the web” or “from twitterrific”—there is an optional parameter, source, that can contain the short identifying string registered with Twitter upon request. There is also an optional parameter, in_reply_to_status_id, that can attach this tweet to another specific status update by another user. This will associate the message with the author specified in the in_reply_to_user_id attribute in the status data object. If the reply status ID is not valid, the parameter will be ignored.
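
The parameters described above can be assembled and checked before anything is sent. The following Python sketch is illustrative only; the function name is hypothetical, and the length check simply applies the 140-character limit on the client side rather than reproducing any validation Twitter performs.

```python
import urllib.parse

MAX_STATUS_LENGTH = 140


def build_status_params(status, source=None, in_reply_to_status_id=None):
    """Return the URL-encoded POST body for a new tweet."""
    if not status:
        raise ValueError("status is required")
    if len(status) > MAX_STATUS_LENGTH:
        raise ValueError("status is longer than 140 characters")
    params = {"status": status}
    # Optional parameters are added only when the caller supplies them.
    if source is not None:
        params["source"] = source
    if in_reply_to_status_id is not None:
        params["in_reply_to_status_id"] = in_reply_to_status_id
    return urllib.parse.urlencode(params)


body = build_status_params("@kmakice thanks!", in_reply_to_status_id=12345)
print(body)
```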

Warning

Many older Twitter applications assume that a new message sent as a reply is a response to the most recent status update by the targeted author. For active accounts, this can cause replies to be attached to more recent updates than are intended, causing some issues in the integrity of threaded conversation.

Twitter has added elements to support better threading for replies, but the association to the replied-to author’s last tweet is still programmed into many third-party Twitter tools.

Delete a Tweet

After a status update has been published, it can be deleted. However, this can only be done if the authenticated user is also the author of the update to be removed:

https://twitter.com/statuses/destroy/id.xml

The id parameter is required to identify which existing status update is to be deleted—information that is included by referencing the status ID in the request URL (replacing id in the preceding link). If the request is successful, Twitter returns the status object information. This is the same response posting a new tweet will produce (see the previous section). If the status ID is not provided or is invalid, the XML returns an error: “No status found with that ID” (see Hash Objects).

Deleting a tweet is a practice that should be discouraged. Part of the value of Twitter is that it provides a long-term record of the little things we do and say, including the mistakes. Although there will almost certainly be times when you will need to remove a tweet you’ve posted, it is important to realize that your timeline is not an inbox. Leave your footprints for posterity.

Warning

There are some known issues with deleted status updates still appearing in Twitter searches. This problem should be resolved with the next upgrade to the Twitter API, expected in 2009.

The Information Stream

The collection of published tweets creates a flow of information in Twitter. There is a public timeline of updates, fed by all accounts that are not configured to be private streams and are therefore available for anyone to view. Individual members also have their own streams, which they craft by deciding which members to follow.

This section describes the methods used to access different kinds of streams, display details about specific status updates, and manage the bookmarks—called favorites—that mark content of interest.

Note

When successful, the Information Stream methods return status objects. See Status Objects for more details.

Show a Tweet

This method was created to enable us to view the details for a single status update:

https://twitter.com/statuses/show/id.xml

This type of request returns XML structured in the same way as that returned by the publishing methods described earlier, including a description of the update author as well as the message itself (see Status Objects). If the status ID is not provided or is invalid, the XML returns an error.

Note

Remember, you can easily change the format to JSON by editing the extension used in the URL to request the method, from .xml to .json.

Authentication is not needed, provided the tweet is public. To view protected status updates made by private authors who have authorized you to do so, you must provide a valid username and password.
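
Because the status ID and the format are both part of the URL path, a small helper can assemble either variant. This Python sketch is an assumption-laden convenience, not part of the API: the helper name is hypothetical, and it covers only the xml and json extensions.

```python
BASE_URL = "https://twitter.com"


def method_url(path, fmt="xml", resource_id=None):
    """Assemble a request URL such as /statuses/show/id.xml or its .json twin."""
    if fmt not in ("xml", "json"):
        raise ValueError("this sketch only covers the xml and json formats")
    parts = [BASE_URL, path.strip("/")]
    if resource_id is not None:
        # The ID replaces the "id" placeholder in the documented URL.
        parts.append(str(resource_id))
    return "/".join(parts) + "." + fmt


print(method_url("statuses/show", resource_id=1234))
print(method_url("statuses/show", fmt="json", resource_id=1234))
```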

View the Public Timeline

This method returns the 20 most recent status updates from public accounts in Twitter:

https://twitter.com/statuses/public_timeline.xml

Authentication is not required, and retrieval from the public timeline does not count against the API rate limit, even if you do authenticate.

A successful request returns information about the recent status updates, following the same format and structure as that returned by the publishing methods, but with multiple status data objects contained within a <statuses type="array"></statuses> XML wrapper (see Status Objects).
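
To show how an application might walk that wrapper, here is a short Python sketch. The statuses wrapper comes from the description above; the child element names (status, id, and text) are assumptions based on the status object format covered in Status Objects, and the sample data is invented for the example.

```python
import xml.etree.ElementTree as ET


def extract_tweets(xml_text):
    """Return a list of (id, text) pairs from a <statuses> array."""
    root = ET.fromstring(xml_text)
    if root.tag != "statuses":
        raise ValueError("expected a <statuses> wrapper")
    # Element names below are assumed, not confirmed by this section.
    return [(s.findtext("id"), s.findtext("text")) for s in root.findall("status")]


sample = """<statuses type="array">
  <status><id>2</id><text>second</text></status>
  <status><id>1</id><text>first</text></status>
</statuses>"""

print(extract_tweets(sample))
```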

Note

The public timeline is cached once per minute, so there is no reason to request public tweets more often than that. This means that, with over a million updates posted to Twitter every day, most tweets will slip through the cracks. Twitter has grown too big to capture everything through the main API.

Twitter does offer two other options: a data mining feed that returns 600 tweets per request, and the “firehose” that gives researchers everything coming across the timeline. See Other Data Options for more information.

The public timeline is not a complete picture of all Twitter traffic. An estimated 10% of all user accounts are private accounts;^[66] those users’ status updates are available only to approved people and are not included in the public timeline. Additionally, Twitter requires that an account be minimally configured to include a custom user icon for it to be part of the public timeline.

View a Friends Timeline

Similar to the one for the public timeline, there is a method in the API to retrieve recent tweets from the perspective of a specific user. Calling the View a Friends Timeline method returns the 20 most recent status updates posted by the authenticated user and the authors that user follows:

https://twitter.com/statuses/friends_timeline.xml

This data is the same stuff you’ll see on the home page after logging into the Twitter website. A successful request returns XML structurally identical to that of the data returned by the View the Public Timeline method.

There are several optional parameters that can be used to filter the data that is returned. Three of them—since, since_id, and count—change the number of status updates returned by truncating older tweets. The since parameter gives the API a certain point in time to use as the cutoff, whereas the since_id parameter identifies a specific status ID (presumably the last one your application successfully captured). The count parameter specifies the number of recent tweets to return. When creating applications that monitor the activity of an account, these are great parameters to use to avoid redundancy.

The page parameter allows the application to navigate further back than just the 20 tweets available on the first page of the friends timeline. Current restrictions cap the archive at 200 tweets, which translates to 10 pages of timeline content.
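
The arithmetic behind those page numbers is easy to script. The Python sketch below generates the request URL for each page of the friends timeline; the function and constant names are hypothetical, and the limits are simply the figures quoted above, which may change.

```python
import urllib.parse

FRIENDS_TIMELINE = "https://twitter.com/statuses/friends_timeline.xml"
TWEETS_PER_PAGE = 20   # size of one page, per the text above
ARCHIVE_LIMIT = 200    # oldest tweet reachable, per the text above


def archive_page_urls(since_id=None):
    """List the URLs for every page the archive limit allows."""
    last_page = ARCHIVE_LIMIT // TWEETS_PER_PAGE
    urls = []
    for page in range(1, last_page + 1):
        params = {"page": page}
        if since_id is not None:
            params["since_id"] = since_id
        urls.append(FRIENDS_TIMELINE + "?" + urllib.parse.urlencode(params))
    return urls


urls = archive_page_urls()
print(len(urls))
print(urls[-1])
```

In practice an application would stop requesting pages as soon as one comes back empty, rather than always asking for all of them.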

View an Individual Timeline

A third view of the Twitter timeline is the user archive. This stream contains just the tweets published by a single author. The View an Individual Timeline method returns the 20 most recent status updates posted by the authenticated user:

https://twitter.com/statuses/user_timeline.xml

It’s also possible to request another user’s timeline by adding another level to the URL path and identifying that user, as in:

https://twitter.com/statuses/user_timeline/id.xml

The id parameter is replaced with the user’s ID or username. This is the same content one would see by visiting a member’s Twitter profile page.

Note

If the requested user has a public account, authentication isn’t necessary; simply reference the user ID or username in the request.

Without authentication or inclusion of a public user ID, this method returns a variation of the standard error message: “Authentication required to request your own timeline” (see Hash Objects).

Successful requests respond with an array of status data objects (see Status Objects), as is the case with other timeline methods. The results can be filtered in the same way as those for the friends timeline, using since, since_id, count, and page.

View Replies

Replies are status updates that reference another Twitter member. The convention, culled from the old days of Internet Relay Chat (IRC), is to precede the username with the @ symbol. Although Twitter wasn’t originally meant for conversation, many people post status updates as replies to direct conversation to particular users.

These replies can be viewed as a separate timeline by using a special API method:

https://twitter.com/statuses/replies.xml

This method returns the 20 most recent @ replies addressing the authenticated user. These will include status updates posted by people the user is not following if the account configuration is set to include replies from all users.

Note

Replies are recognized in Twitter only if the reply indicator appears at the beginning of a tweet. Any @username references within the body of the status update will not be included in the replies timeline. However, any reference to a particular user can be found using the search API (see Content searches).

Twitter currently allows retrieval of up to 40 pages, or 800 replies, by using the optional page parameter. This method also facilitates retrieval of the freshest replies by recognizing the since and since_id parameters.

Successful requests return the standard status data object array, as with the timeline methods discussed previously.

View Favorites

In Twitter, you can create a bookmark (called a “favorite”) to mark a status update you want to remember. There is a separate view of the timeline that returns the 20 most recent “favorited” statuses for the authenticated user:

https://twitter.com/favorites.xml

As with the regular user timeline, you can specify an id in the request URL to get a list of favorites for a user other than the one whose credentials were used to authenticate to the API:

https://twitter.com/favorites/kmakice.xml

If that user has a public account, the information can be retrieved without authenticating. Pagination through the page parameter is the only option available to let the application navigate back to see the full list of all favorited tweets.

Create a Favorite

To create a new bookmark, you must specify the ID of the status update to be favorited in the request URL:

https://twitter.com/favorites/create/id.xml

This command returns a single status data object for the favorite message when successful, and a “Not found” error if the status message doesn’t exist or is inaccessible to the authenticated user. The new favorite will be associated with the authenticated user.

Delete a Favorite

Deleting a bookmark doesn’t delete the status message (how could it, if you’ve marked other people’s tweets as favorites?), but it does remove the flag you have previously set to remind you how much you liked the content. Times change, and you may need to distance yourself from the memory of a particular update. By using this method, you can un-favorite a specified status message:

https://twitter.com/favorites/destroy/id.xml

The id of an existing message must be included in the request URL to indicate the message you now want to forget. If you try to un-favorite a message that doesn’t exist or isn’t currently a favorite of the authenticated user, a “Not found” error message will be returned (see Hash Objects). Success brings one last reminder in the form of a single status data object that describes the message and its author.

The Follow Network

Twitter isn’t just about posting updates when you go to lunch. The magic happens when you grow your network of authors to the point where you start benefiting from the collective wisdom of your tweeps. Those you follow and those who follow you together make up your follow network. This section looks at methods that show you who is in your network, how to expand or contract it, and how to protect it from spammers.

Note

When successful, the follow network methods return information about particular Twitter members. See User Objects for more details about the data Twitter makes available.

The notable exception is Confirm a Follow, which returns a response object indicating whether or not one person is following another.

Show Member Profile

People are the main unit of currency in a follow network. Each member contributes to the collective wisdom, by talking about what is important to them at a given moment and also by identifying other people of interest. When you encounter a new person on Twitter, looking at her profile is one way you can decide whether you want to follow her.

The Show Member Profile method gives you all the basic profile information you have already seen attached to the status data objects, plus a lot of new information about the specified user. Most importantly, this is the method you need to use to find out statistics such as the number of updates a user has posted and how many people that member is following:

https://twitter.com/users/show/id.xml

Even to access your own profile information, the id needs to reflect your username or user ID and be included in the request URL.

Note

The /show part of the URL isn’t needed. These two URLs return the same results:

https://twitter.com/users/show/id.xml
https://twitter.com/users/id.xml

Alternatively, you can use the email parameter to get profile data, as in:

https://twitter.com/users/show.xml?email=kmakice@gmail.com

Doing so allows you to look up a Twitter member’s account information by referencing the email address used to register that account (for public accounts, authentication is not required). Private accounts are protected from this search, unless the authenticating user has been granted access to those accounts. Twitter also added two other parameters, user_id and screen_name, that can identify which member account you want returned. Either can be used as a query string variable (e.g., https://twitter.com/users/show.xml?user_id=415) to help disambiguate between user IDs and screen names that are composed entirely of numbers. The id parameter assumes that a number is a user ID and returns that account, which made it impossible to view the profiles of some members with all-numeric screen names. The two newer parameters were added to give you more control, and they take precedence over id.
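
The difference between the two explicit parameters is easiest to see side by side. This Python sketch builds both forms of the request for the value 415; the function name is hypothetical, and the sketch only constructs URLs without contacting Twitter.

```python
import urllib.parse

SHOW_URL = "https://twitter.com/users/show.xml"


def show_user_url(user_id=None, screen_name=None):
    """Build a Show Member Profile URL that states which kind of identifier is meant."""
    if (user_id is None) == (screen_name is None):
        raise ValueError("pass exactly one of user_id or screen_name")
    if user_id is not None:
        params = {"user_id": user_id}
    else:
        params = {"screen_name": screen_name}
    return SHOW_URL + "?" + urllib.parse.urlencode(params)


print(show_user_url(user_id=415))        # the member whose user ID is 415
print(show_user_url(screen_name="415"))  # the member whose screen name is "415"
```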

Note

Accounts are sometimes disabled. When that happens, the Twitter API will return a status code of 404, rather than showing any profile information for that account.

The detail in the XML gives you a lot of information about the design of the specified member’s profile page, including the style scheme used on the page and whether it has a background tile. Although you may not be too interested in what the member’s profile page looks like, you also get details on the user’s time zone setting, the number of status messages that user has bookmarked (favorites), the total number of updates he has posted, and the number of other people he is following. The profile data also includes information about the member’s latest update (see User Objects).

If the request is sent with authentication, a couple of extra fields are available in the XML:

  <following>false</following>
  <notifications>false</notifications>

This information has to do with the relationship between the two Twitter users (the authenticated user and the user identified by the id parameter). In this example, the authenticated user is not following the requested user. This information will appear for unauthenticated requests as well, defaulting to a following value of true and a notifications value of false.

Note

For protected accounts, these fields and the embedded short-form status object are not included. This was a bug fix to improve security for twitterers with private accounts. It was revealed when someone used the short-form status object to identify the “billionth tweet,” which was published by a member with a protected account.^[67]

View Members Being Followed

The more important half of the follow network is the list of people you follow. Although it is great to have a throng of devoted fans hanging on every word you tweet, your experience with Twitter will be affected much more by the content you see than by how many people read your own tweets.

The API provides a method to show who is contributing to your personal information stream:

https://twitter.com/statuses/friends.xml

The following lists are available for any public account. Simply including the id parameter in the request URL (username or user ID) without authenticating will allow you to look at how another member’s information stream is composed:

https://twitter.com/statuses/friends/kmakice.xml

Accessing this information for a private account, however, requires not only authentication but also the permission of the account holder, which you gain by virtue of being allowed to follow that user’s status updates.

The View Members Being Followed method returns a list of up to 100 Twitter members that the authenticating or identified user is following, with the members who have updated most recently appearing first. If the user is following more than 100 members, you can access the full list by using the page parameter to navigate to less active authors. Each successive page will include the next 100 users, until the list of followed authors is exhausted.
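
The paging loop described above can be sketched in Python as follows. To keep the example runnable without network access, the function takes a fetch_page callable as a stand-in for the real request; that callable, the function name, and the fake data are all assumptions made for illustration.

```python
PAGE_SIZE = 100  # members per page, per the text above


def collect_all_pages(fetch_page, page_size=PAGE_SIZE):
    """Call fetch_page(1), fetch_page(2), ... until a short or empty page arrives."""
    members = []
    page = 1
    while True:
        batch = fetch_page(page)
        members.extend(batch)
        if len(batch) < page_size:
            # A page that is not full means the list is exhausted.
            return members
        page += 1


# Stand-in for a real request: 250 fake members served 100 at a time.
fake_members = list(range(250))


def fake_fetch(page):
    start = (page - 1) * PAGE_SIZE
    return fake_members[start:start + PAGE_SIZE]


print(len(collect_all_pages(fake_fetch)))
```

Each page costs one request against the rate limit, so a member following thousands of accounts can use up a noticeable share of the hourly allowance.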

This method also allows use of the since parameter, which is quite useful for keeping tabs on just the latest changes to a following list. You can specify a URL-encoded date and time (no more than 24 hours old) and have Twitter return only the latest additions. A successful request returns an array of user data objects (see User Objects).

Note

One shortcoming of the API is that you can’t easily track changes in the network over time. Daily monitoring of an account using the since option will let you see how a member’s following list grows over time, but not how it shrinks.

Each user’s Twitter profile information can be found in the structured data, including a flag that indicates whether the account is private (<protected>false</protected>) and how many followers that person boasts.

View Followers

The other half of your follow network consists of the people who find the minutiae of your life so interesting that they decide to hang on every word you tweet. Followers are people who include your content in their information streams, and there is a nice method in the API to request the list of followers:

https://twitter.com/statuses/followers.xml

A successful request again gets the array of user profile data, but the ordering is different from that of the following list: Twitter currently lists followers according to when they signed up for Twitter, with the newest members appearing at the top of the list.

Warning

The sort order for such lists is subject to change. Pay attention to the Twitter API Change Log for information about adjustments or new parameters, especially if the order is factored into your code.

If you have a large number of followers, the first page of results will show many of your new followers, but not all of them. Any time an established Twitter user follows you, that information will be buried deep in the pagination; to find it, you’ll have to use the page parameter.

The list of people who follow you is a much more guarded secret than the list of those you follow. For starters, you must be authenticated in order to view anyone else’s list. If you are not, or if the person whose information you are trying to get has a protected account and is not someone you follow, an authorization error will result:

<hash>
  <error>Not authorized</error>
  <request>/statuses/followers/cmakice.xml</request>
</hash>

About half of the members of the Twitterverse will have more followers than people being followed. The latter is in one’s control, but the former is not. For that reason, exploring someone else’s following list—where everyone has been vetted in some minimal fashion—is probably more useful than looking at their followers.

Get All Followers

One of the big obstacles facing developers interested in social graphs of Twitter activity is how to identify all the members of a follow network. Until recently, the only way to get a list of everyone following a given member was to loop through multiple requests of the View Followers method. Now, there is an easier way to do this:

https://twitter.com/followers/ids.xml

This method returns a simple XML list of the user IDs of all of the authenticating user’s followers. No other information is included, such as those members’ latest tweets or even their usernames. This method makes it much easier to keep tabs on social relationships, but the trade-off is you don’t get any extraneous information.

Note

User IDs are the preferred way to keep track of Twitter members. Members can change their usernames via the Settings on the main website, but their user IDs will never change.

The Get All Followers method requires authentication. To get a list of followers for someone other than the authenticating user, add that member’s username or ID to the request:

https://twitter.com/followers/ids/id.xml

If you don’t authenticate with the API request, you’ll get a message like, “Could not authenticate you.”

Get All Friends

The same kind of method is available for the other half of the follow network, too. You can get a simple XML list of the user IDs of all the people you choose to follow by making a single API request:

https://twitter.com/friends/ids.xml

This method does not require authentication, unless the person whose list you want to access has a protected account.

Note

A good way to make use of these two social graph methods is to conduct separate lookups for user profile information for only those members your application doesn’t already know about.
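
That approach comes down to simple set arithmetic on user IDs. The Python sketch below is one possible way to do it, with hypothetical function names and made-up IDs; it assumes the application has already parsed the ID lists out of the responses.

```python
def plan_profile_lookups(known_ids, follower_ids, friend_ids):
    """Return the user IDs that still need a separate profile lookup."""
    current = set(follower_ids) | set(friend_ids)
    return sorted(current - set(known_ids))


def network_changes(previous_ids, current_ids):
    """Compare two snapshots of an ID list and return (added, removed)."""
    previous, current = set(previous_ids), set(current_ids)
    return sorted(current - previous), sorted(previous - current)


print(plan_profile_lookups([1, 2], [2, 3], [3, 4]))
print(network_changes([1, 2, 3], [2, 3, 4]))
```

Comparing snapshots this way also reveals who has left a list, not just who has joined it.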

Follow a Member

To build your follow network, you need a method for adding a friend. This method allows you to do that, as well as to optionally add this person to your notifications list to receive their tweets on your cell phone.

Note

Back in the olden days of Twitter, those you followed were called “friends.” That’s why the API methods use the terminology “create friendships,” even though on the website and elsewhere this action is now referred to as “following.” The next version of the Twitter API will correct this, but thanks to backward compatibility, friends are likely to be a permanent part of Twitter programming.

To follow another user, you need to create a relationship between the authenticated user and some other user, identified with the id parameter:

https://twitter.com/friendships/create/id.xml

A successful request returns the short-form profile information for the new friend. The request will only be successful, though, if the relationship doesn’t already exist. If you try to follow someone already on your list, Twitter returns an error: “Could not follow user: id is already on your list” (see Hash Objects).

There is an optional parameter—follow—that lets you have a two-for-one by automatically adding your new friend to your notifications list. If this parameter is present and set to a value of true, this member will have notifications enabled so you can immediately start receiving her tweets on your cell phone. This is the same action performed by the Turn On Notification method described in Member Account.
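
Here is a hedged Python sketch of how the follow parameter might be attached to the request. The function name is hypothetical, and authentication is omitted for brevity even though the method requires it. The sketch uses POST because the request changes data on the server.

```python
import urllib.parse
import urllib.request


def build_follow_request(member, notifications=False):
    """Build a Follow a Member request, optionally enabling notifications."""
    member_part = urllib.parse.quote(str(member), safe="")
    url = "https://twitter.com/friendships/create/%s.xml" % member_part
    # The follow parameter is sent only when notifications are wanted.
    params = {"follow": "true"} if notifications else {}
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


request = build_follow_request("kmakice", notifications=True)
print(request.get_method(), request.full_url, request.data)
```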

Note

There is often confusion about the difference between the Follow a Member method and the one dealing with devices (Turn On Notification). The Follow a Member method establishes the network relationship but does not by default enable notifications. Members for whom notifications are enabled (via either the follow parameter to this method or the Turn On Notification method) form a subgroup of the people you follow whose content is sent directly to your registered device (usually a cell phone).

Notifications are useful for having status updates from close friends or family members sent directly to your phone; you can stay in close contact with a select few people and follow many more users casually through the Web or third-party desktop tools.

Following another account has a few side effects. First, that person’s content is included in your information stream. This happens immediately if the account is public, but only after approval if the account is protected.

Note

If you follow a protected account, the response will be nearly identical to the one you get when following a public account. The only hint you will get that you have to wait for approval is the presence of <protected>true</protected> in the XML output.

Users with private accounts who have not approved you will still appear in your following list, even if you cannot yet see their content.

Second, most accounts are configured to send a notification when a new follower is added. This is an effective way to expand your follow network, because odds are good that someone interested in what you are tweeting will be interesting to you. Conversely, it is possible that the act of following someone will result in that person following you.

Unfollow a Member

Does your new friend tweet too much about Barney the Dinosaur? No problem. Simply unfollow that person to remove his content from your personal information stream:

https://twitter.com/friendships/destroy/id.xml

The format and response are the same as for the follow request, except you are removing rather than adding a tweep. If you are not already following that member, then Twitter sends an error: “You are not friends with the specified user” (see Hash Objects).

Unfollowing is one of the basic rights of a Twitter member and one of the things that makes the experience so rich. The sharing of information on Twitter is technically decoupled, so you can choose to follow someone who doesn’t follow you. Although reciprocation happens frequently, it isn’t a requirement in Twitter as it is in Facebook and other social networks.

Note

You do not get a notification when someone unfollows you. Your first clue that this has happened may be if that user then follows you again, resulting in an alert notification. For a third-party solution to this problem, read about Qwitter in Tools for the Follow Network.

No two information streams are alike, and the same goes for users. What might seem to you a paltry rate of tweeting could be overwhelming to someone else. Use the Unfollow a Member method to adjust your flow of status updates, and try not to take it personally when someone chooses not to follow you.

Confirm a Follow

Wading through all of the pages of your follow network lists, or even requesting profile information for another member to find out whether you’re following that person, is an awkward way to investigate the connections in your follow network. Fortunately, Twitter provides a simple method that can be used to check whether one user is following another:

https://twitter.com/friendships/exists.xml

This method requires two variables—user_a and user_b—that contain the user IDs or screen names for the two Twitter members whose relationship you want to confirm. For example:

https://twitter.com/friendships/exists.xml?user_a=amakice&user_b=kmakice

If the check is successful, Twitter responds with simple output (see Response Objects).

If the first user is not following the second—because either the relationship or one of the users does not exist—the value of the returned field will be false. If the users are not included in the request, Twitter will return an error: “Two user ids or screen_names must be supplied” (see Hash Objects).

Since the follower relationship is not coupled (i.e., you can follow someone who doesn’t follow you), the order of these two IDs is very important. In an asymmetrical relationship, you will get opposite results depending on which member is listed first as user_a.
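Because the order matters, checking for a mutual relationship takes two requests. The sketch below builds both URLs and reads the Boolean out of a response. The helper names are hypothetical, and the element name in the sample response is an assumption; the parser only relies on the root element containing the text true or false.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def exists_url(user_a, user_b):
    """Build the Confirm a Follow URL: is user_a following user_b?"""
    query = urlencode({"user_a": user_a, "user_b": user_b})
    return f"https://twitter.com/friendships/exists.xml?{query}"


def parse_exists(xml_text):
    """Read the Boolean from the response.

    Assumes a single root element whose text is "true" or "false".
    """
    return ET.fromstring(xml_text).text.strip() == "true"


if __name__ == "__main__":
    # The two directions are separate questions with separate answers.
    print(exists_url("amakice", "kmakice"))
    print(exists_url("kmakice", "amakice"))
    print(parse_exists("<friends>true</friends>"))
```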

Block a Member

The great thing about Twitter is that you can choose to unfollow anyone you find too annoying, too noisy, or too obsessed with the sweet, sweet taste of Edwardo’s Pizza in Chicago. Sigh. Those aren’t necessarily reasons to block someone, however. A block is a more serious way to distance yourself from another user.

When you block a user, you aren’t blocking her from seeing your updates. A public account is still a public account. Your status updates, the list of people you follow, and your profile information all remain readily accessible to that user. However, blocking does make it impossible for that person to follow you and keeps your Twitter icon off her profile page.

Note

Twitter reportedly monitors blocking, so if a user receives enough blocks, that user’s account will be flagged for investigation.

Blocks should not be used lightly. They are appropriate for the worst account spammers and those who may be personally harassing you. For anything else, you’re only an unfollow away from peace of mind.

The API method to create a block requires authentication and inclusion of the dissed user’s id in the request URL:

https://twitter.com/blocks/create/id.xml

If this action is successful, the short-form profile data object for the blocked user is returned.

If you block another user, that user will not be notified. Twitter is intentionally subtle in how it conveys the action back to the victim of a block: if that person tries to follow you, he will get an ambiguous error on your profile page. However, the API is much more straightforward about what has happened, providing this message: “Could not follow user: You have been blocked from following this account at the request of the user” (see Hash Objects).

For private accounts, of course, the de facto state is that everyone is effectively blocked. You can only include a private member’s status updates in your information stream if the private account holder gives you the OK.

Remove a Block

All good things must come to an end. This, too, shall pass. What goes around comes around. Pick your colloquialism, but what I’m trying to say is: blocks aren’t permanent.

You can remove a previous block on another user by using a different method in the API:

https://twitter.com/blocks/destroy/id.xml

This authenticated POST request will find the user account specified in the URL (id) and remove the block that prevents that person from following your status update stream. When successful, it returns the short-form profile for the specified user.

Imposing a block removes any follower relationships between the authenticated user and the victim of the block, so removing a block on someone you used to follow will not reinstate that relationship—the user will remain out of your stream until you follow her (or she tries to follow you) again. The user will not be informed that the block has been removed; she will only realize this if she’s told or if she tries to follow you again and succeeds.

If you choose to follow someone whom you currently have blocked, you do not have to remove the block explicitly first; the follow request removes it automatically.
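The interplay between following and blocking is easier to see in a toy model. The class below is not how Twitter implements anything; it is an in-memory sketch, with made-up names, of the observable rules described in the last two sections.

```python
class FollowNetwork:
    """Toy model of the follow/block rules (not Twitter's implementation)."""

    def __init__(self):
        self.follows = set()  # (follower, followed) pairs
        self.blocks = set()   # (blocker, blocked) pairs

    def follow(self, who, target):
        if (target, who) in self.blocks:
            raise PermissionError("You have been blocked from following this account")
        # Following someone you have blocked lifts your block on them.
        self.blocks.discard((who, target))
        self.follows.add((who, target))

    def block(self, who, target):
        self.blocks.add((who, target))
        # A block removes the follower relationship in both directions.
        self.follows.discard((who, target))
        self.follows.discard((target, who))

    def unblock(self, who, target):
        # Lifting a block does not reinstate earlier follows.
        self.blocks.discard((who, target))


if __name__ == "__main__":
    net = FollowNetwork()
    net.follow("amakice", "kmakice")
    net.block("kmakice", "amakice")
    net.unblock("kmakice", "amakice")
    print(net.follows)  # the old follow is still gone
```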

Communication

Twitter is not a chat client. Although the culture and the technology recognize replies, the contextual nature of each individual information stream makes it somewhat rude to carry on extended 1:1 conversations with another user whose messages may or may not be included in your other followers’ information streams. For such personal communications, the direct message is your friend.

In addition to the public status update channel, Twitter provides a back channel messaging system that allows you to communicate directly with another member. This can be useful for extended exchanges, for personal questions not meant for public consumption, and to facilitate third-party applications without mucking up the information stream.

Note

To send messages directly to and receive messages directly from another user, you must be following each other. So, although status updates are decoupled (even for private accounts), direct messages require a mutual handshake.

This section covers the creation of new messages, the listing of sent and received messages, and the deletion of existing messages.

Note

When successful, the communication methods return information about the Twitter messages. See Message Objects for more details about the data Twitter makes available.

List Received Messages

A separate tab in the Twitter interface handles the flow of direct messages, which will occur at a much slower rate than status updates. You can set your account to send you direct message notifications via text or email, and clients such as Twitterrific use the API to bring the messages into a single information stream.

There is an API method to list all of the messages the authenticated user has received, which are returned in groups of 20:

https://twitter.com/direct_messages.xml

This method recognizes some navigation and filtering parameters to control which page you retrieve (page) and to look for only the most recent messages (since, since_id).

The returned list is an array of direct message data objects, each with three distinct parts. The information about the message has some overlap with the information for a status update, in that it provides a record ID, the message text, and the creation date. In addition to describing the author, though, it must also report who the recipient is. Embedded in the direct message object is the familiar short-form profile information about both the sender and the receiver of the direct message (see Message Objects).

Since direct messages are not as common as status updates, it is not unusual for an account to have no record of direct messages being received. In that case, the API returns an empty object in the XML (see Response Objects).

List Sent Messages

Twitter also lets you keep track of the direct messages you have sent. A second messaging method lets you access your outgoing message archive:

https://twitter.com/direct_messages/sent.xml

Similar to the method for listing received messages, this method returns the 20 most recent direct messages sent by the authenticated user. The same optional parameters (since, since_id, and page) are available to adjust which part of the full list of messages is pulled out of the API.
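Both listing methods can share one small helper. The sketch below builds the request URL with the optional page and since_id parameters and pulls a few fields out of the returned array. The sample XML is a trimmed-down assumption about the response layout (see Message Objects for the real thing), and the function names are mine.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed, abbreviated response layout, for illustration only.
SAMPLE = """<direct-messages type="array">
  <direct_message>
    <id>101</id>
    <text>Lunch at noon?</text>
    <sender><screen_name>amakice</screen_name></sender>
    <recipient><screen_name>kmakice</screen_name></recipient>
  </direct_message>
</direct-messages>"""


def messages_url(sent=False, page=None, since_id=None):
    """Build the URL for List Received Messages or List Sent Messages."""
    path = "direct_messages/sent.xml" if sent else "direct_messages.xml"
    params = {}
    if page is not None:
        params["page"] = page
    if since_id is not None:
        params["since_id"] = since_id
    query = f"?{urlencode(params)}" if params else ""
    return f"https://twitter.com/{path}{query}"


def parse_messages(xml_text):
    """Return one dict per direct message; an empty array gives []."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": int(dm.findtext("id")),
            "text": dm.findtext("text"),
            "sender": dm.findtext("sender/screen_name"),
            "recipient": dm.findtext("recipient/screen_name"),
        }
        for dm in root.findall("direct_message")
    ]


if __name__ == "__main__":
    print(messages_url(sent=True, page=2))
    print(parse_messages(SAMPLE))
```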

Create a Message

You can create a new direct message using a special method provided by Twitter:

https://twitter.com/direct_messages/new.xml

This method requires two parameters: the user parameter identifies who will receive your private bit of wisdom, and the text parameter is your private bit of wisdom. The message must be URL-encoded and is limited to the signature 140 characters. Like all requests to change the Twitter database, this method requires a POST.

If successful, Twitter returns a single direct message data object containing the same information described in the communication methods discussed earlier. Mistakes in the format of the request will result in an “Invalid request” error. Because the sender and the receiver of the message must be following each other, Twitter will return an error if that is not the case: “Can’t send direct messages to users who aren’t your friend” (see Hash Objects).

You can also send direct messages using the Post a Tweet method:

https://twitter.com/statuses/update.xml

You must start the status text with d username to direct the message to the person you want. This is the same function the text-based commands perform in Twitter clients.
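Here is a rough sketch of both routes to a direct message: the POST body for the Create a Message method, with the length check and URL encoding applied, and the status text for the d username shortcut. The helper names are hypothetical, and checking the length on the client side is my own precaution rather than anything the API requires.

```python
from urllib.parse import urlencode

MAX_LENGTH = 140


def new_message_body(user, text):
    """Build the URL-encoded POST body for direct_messages/new."""
    if len(text) > MAX_LENGTH:
        raise ValueError(f"message is {len(text)} characters; the limit is {MAX_LENGTH}")
    return urlencode({"user": user, "text": text})


def as_status_command(user, text):
    """Build status text that Post a Tweet treats as a direct message."""
    return f"d {user} {text}"


if __name__ == "__main__":
    print(new_message_body("kmakice", "Lunch at noon?"))
    print(as_status_command("kmakice", "Lunch at noon?"))
```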

Delete a Message

Not everyone likes to keep a permanent archive of all of the little messages they send and receive. There is no search mechanism for these messages, so it is sometimes helpful to prune out the ones you don’t want to keep (assuming you want to keep any at all).

Deleting an existing direct message is as easy as using another API method:

https://twitter.com/direct_messages/destroy/id.xml

This method follows more or less the same procedures as that of a status update removal: you identify the ID of the message to be removed and authenticate using the account credentials of either the sender or the receiver. If successful, you will get the message information as a data object.

Until recently, only the receiver of a message had the power to delete it. Now you can also remove messages that you yourself sent from the face of the Twitosphere.

Warning

If you do decide to delete a message, understand that it will disappear from both the sender’s sent message list and the receiver’s received message list. There is only one copy of each direct message, and either person involved in the exchange can remove it.

You can only destroy what has first been created. If you send a request to delete a received message using an invalid ID, Twitter will let you know it can’t do it: “No direct message with that ID found” (see Hash Objects).

Show a Message

Surprisingly, there is no method that allows you to retrieve a single direct message. One might expect something like this to work:

https://twitter.com/statuses/show/id.xml
https://twitter.com/direct_messages/show/id.xml

Attempting such a request, however, results in the following response from the API:

<html><body>You are being <a href="https://twitter.com/direct_messages">
  redirected</a>.</body></html>

That means if you want to get information on a particular message, you have to either page through the full list until you find it, or delete it and get the single direct-message data object one last time.
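The paging workaround can be sketched in a few lines. The fetch_page argument is a hypothetical callable that you would supply: given a page number, it returns that page of messages as a list of dicts with an id key (for example, by requesting direct_messages.xml with the page parameter and parsing the result). The max_pages cap is my own safety valve.

```python
def find_message(message_id, fetch_page, max_pages=50):
    """Page through the message list until message_id turns up.

    fetch_page(page) is a caller-supplied function returning a list of
    message dicts for that page, or an empty list past the last page.
    """
    for page in range(1, max_pages + 1):
        messages = fetch_page(page)
        if not messages:
            return None  # ran off the end of the archive
        for message in messages:
            if message["id"] == message_id:
                return message
    return None


if __name__ == "__main__":
    fake_pages = {1: [{"id": 3}, {"id": 2}], 2: [{"id": 1}]}
    print(find_message(1, lambda page: fake_pages.get(page, [])))
```

Keep in mind that every page fetched this way is another request against the account's rate limit.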

Note

The lack of functionality in this part of the API may be one reason why more applications haven’t developed tools to manage direct messages.

Member Account

One area where third-party development has helped change the API is in making changes to a member’s profile. Twitter added the Update Member Location method, for example, after a need for it materialized with the iPhone: developers wanted to be able to leverage the geocode information to pinpoint a user’s location at any given moment, and a method was needed to facilitate this update. The creation of that method changed the value of that open text field, in turn, since it now reflected where certain users were at a particular moment in time, rather than where they chose to call home.

This section covers all the functions available through the Twitter API to help manage a Twitter member’s account.

Update Member Location

For a long time, one of the few profile changes you could make through the API was to set your location. Prior to this method being added to the arsenal, the only way to change this text was to log into the Twitter website and manually edit your account settings there. That was before the iPhone, BrightKite, and other location-aware systems started integrating with Twitter.

The Update Member Location method allows you to change the location field in the authenticated user’s profile through the API:

https://twitter.com/account/update_location.xml

This method has one required parameter, location, which contains the new text you want to place on your user profile, replacing whatever is currently in the location field.

Note

At the time of this writing, the location is simply text. Twitter makes no attempt to normalize the content or turn everything into geocodes or latitude and longitude values. The applications that use this method likely pass valid geocode data, but there is nothing to validate or verify it in the API.

Omitting the location parameter will result in the current location being reset to an empty string, rather than generating an error. The short-form user profile data object returned by the API indicates success.

Warning

This method has been deprecated in favor of the Update Member Profile method, which contains an optional parameter for location. To keep your application from breaking in the future, you should avoid using update_location.

Update Member Profile

Many other bits of information are displayed on the member profile pages on the Twitter website. This method allows you to change some of that information through the API:

https://twitter.com/account/update_profile.xml

All of the parameters for this method are related to fields found under the “Account” tab of the Settings page on the Twitter website. Only specified parameters will result in updates to the profile. A user object is returned with the new information to indicate success (see User Objects).

Note

You are supposed to include at least one parameter. However, even if you fail to do so, you will still get a user object, making the request act like an authenticated users/show method.

This method allows for several parameters, including location (to update the user’s location, which used to be handled through the deprecated method discussed in the preceding section). You can also adjust the full name of the account holder (name), the URL pointing to a company or personal web page (url), the email address associated with the account (email), and the short text statement about the account holder (description). Length limits of between 40 and 160 characters are imposed, depending on the parameter.

Warning

In addition to being valid, the email address must also be unique. You cannot assign an email address to an account if it is already being used by another account.
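Since only the parameters you send are updated, a helper that drops everything you did not specify keeps the request honest. This is a minimal sketch; profile_body is a made-up name, and the list of fields is taken from the parameters described above.

```python
from urllib.parse import urlencode

PROFILE_FIELDS = ("name", "email", "url", "location", "description")


def profile_body(**changes):
    """Build the POST body for account/update_profile.

    Only fields that are supplied (and not None) are included, so the
    rest of the profile is left untouched.
    """
    unknown = set(changes) - set(PROFILE_FIELDS)
    if unknown:
        raise ValueError(f"not a profile parameter: {sorted(unknown)}")
    supplied = {key: value for key, value in changes.items() if value is not None}
    return urlencode(supplied)


if __name__ == "__main__":
    print(profile_body(location="Bloomington, IN"))
```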

Update Profile Colors

Although the number of tweets posted from the Twitter website has decreased to under 50% of the full archive, most active users visit the main website on a regular (perhaps even daily) basis to check out the profiles of new followers. Not surprisingly, an eye for design is on the rise. The trend began in 2008 with a Photoshop template that was virally distributed to extend the profile information by using a background image. Later that year, Twitter added themes to allow members to differentiate their web pages’ appearances, even if they didn’t want to go as far as designing a custom look. Every day, the visuals matter a little more.

The Update Profile Colors method lets you make changes to the color scheme used on your member page. The five available parameters are the same ones you would find under the “Design” tab on the Settings page of the main website. The “change design colors” link at the bottom of that page reveals fields to control the color of the profile background (profile_background_color), text (profile_text_color), links (profile_link_color), sidebar shading (profile_sidebar_fill_color), and border (profile_sidebar_border_color). This method accepts parameters to change the values of each of those fields:

https://twitter.com/account/update_profile_colors.xml

All values have to be sent as hexadecimal, either the short (fff) or long (ffffff) form. Because this method deals with changing things on the Twitter server, it requires a POST and is not charged against your rate limit. The method returns a user object with the new information (see User Objects).

Warning

Use the hexadecimal values only, not the hash (#) used on the website. If you include the hash, you will get an “Invalid hex color” error.

Changing the background color will not necessarily lead to that color being seen. If the account already has a background image in place, the web page will use that instead; the background image has to be removed before a change to the background color will be visible.

Note

Twitter themes take precedence when you are editing the appearance of the profile page. Even if the API shows the account with white text, the web page may still show the theme’s text color in the web form. This behavior goes away when the background image is removed.
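The hexadecimal rules are simple enough to enforce before the request ever leaves your application. The sketch below strips a leading hash, accepts only the three- or six-digit forms, and builds the POST body. The function names are hypothetical, and stripping the hash (rather than rejecting it) is a design choice of mine.

```python
import re
from urllib.parse import urlencode

HEX_COLOR = re.compile(r"[0-9a-fA-F]{3}|[0-9a-fA-F]{6}")
COLOR_PARAMS = (
    "profile_background_color",
    "profile_text_color",
    "profile_link_color",
    "profile_sidebar_fill_color",
    "profile_sidebar_border_color",
)


def clean_color(value):
    """Return a bare hex color, dropping the leading # used on the website."""
    if value.startswith("#"):
        value = value[1:]
    if not HEX_COLOR.fullmatch(value):
        raise ValueError(f"invalid hex color: {value!r}")
    return value


def colors_body(**colors):
    """Build the POST body for account/update_profile_colors."""
    unknown = set(colors) - set(COLOR_PARAMS)
    if unknown:
        raise ValueError(f"not a color parameter: {sorted(unknown)}")
    return urlencode({name: clean_color(value) for name, value in colors.items()})


if __name__ == "__main__":
    print(colors_body(profile_text_color="#333", profile_link_color="0084B4"))
```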

Update Background Image

One of the early e-business ventures dealing with advertising on Twitter was Twittads. This company allows individuals to set a price for how much their Twitter profiles are worth and sell ads to display as their background images for a given period of time. This method can be used to automate that process, switching the advertising for you to make sure it gets done.

This method accepts a required image parameter to set the authenticating member’s theme with a new background graphic:

https://twitter.com/account/update_profile_background_image.xml

The image parameter expects the raw multipart data as its value, not simply a URL linking to an existing file—even though that is what you get from Twitter in the user object XML that is returned (see User Objects). The graphic can be in GIF, JPG, or PNG format. The background image can have a maximum of 2,048 pixels and the file must be smaller than 800 KB. The file size is a firm limit, but larger pixel sizes will be scaled down to fit.

Note

Some initial problems with GIF formats were reported in Twitter Developers Talk discussions. You may get better results by using PNG and specifying the mime type.

The background image on a member’s profile page can be displayed as a fixed image attached to the upper-left corner of the browser window, or it can be tiled, which means the image is repeated again and again to fill the entire window. You currently have to go back to the website to check a box for tiling, and unless you replace it with another file through the API, visiting the site is also the only way to remove a background image. At the time of this writing, Twitter doesn’t provide a way through the API to either remove a background picture or have the site tile it.

Note

It may be a good programming strategy to have your application keep a local copy of the default image used by Twitter and check the existing profile configuration before making changes. That way, you can easily use the API to restore the profile to its previous state.

In late 2008, Twitter Patterns—a site by designer Natalie Jost of olivemanna—launched with a gallery of high-quality downloadable background images for users to upload to Twitter. Although as of this writing no tools are available to create custom themes, it isn’t much of a leap to believe that these newest API methods that facilitate changing the design of Twitter profile pages will lead to support for such tools in 2009.

Note

This book’s sample applications don’t cover interacting with the image methods (that coverage may come in the next edition), but here is a quick command-line request using cURL that might give you a hint about how to make the update_profile_background_image and update_profile_image methods work:

curl -k -F 'image=@filename.jpg;type=image/jpeg' \
  -u yourusername:yourpassword -H 'Expect:' \
  https://twitter.com/account/update_profile_image.xml

The important part here is the -F section, which allows the raw data to be sucked out of a graphics file (@filename.jpg) and POSTed to the Twitter method. For more information about using cURL from the command line, see Play Along at Home.
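If you would rather not shell out to cURL, the following is a rough Python sketch of the kind of request body the -F option assembles. It follows the general multipart/form-data layout and has not been verified against Twitter; multipart_image is a hypothetical helper, and you would still need to send the result as an authenticated POST.

```python
import uuid


def multipart_image(filename, data, mime_type, field="image"):
    """Wrap raw image bytes in a one-part multipart/form-data body.

    Returns (headers, body). The field name defaults to "image", the
    parameter both image methods expect.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, head + data + tail


if __name__ == "__main__":
    headers, body = multipart_image("avatar.png", b"\x89PNG", "image/png")
    print(headers["Content-Type"])
    print(len(body), "bytes")
```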

Update Profile Image

Twitter also provides a method to change your profile image (the picture shown with your tweets). As is the case with the Update Background Image method, this method requires an image parameter:

https://twitter.com/account/update_profile_image.xml

The profile picture is usually quite a bit smaller than the background image for the member profile page, and thus it has smaller maximum values: the GIF, JPG, or PNG file can have a maximum of 500 pixels and must be smaller than 700 KB. Larger pixel dimensions will be scaled down, but bigger files are rejected. Twitter typically stores three versions of every profile picture: the full-sized picture uploaded by the user; the big version used on the profile page (with a _bigger suffix); and the regular version (with a _normal suffix) for display with each tweet.

Most errors are trapped and explained with a vague message: “There was a problem with your picture. Probably too big.” (See Hash Objects.) Look for a status code of 200 to let you know whether you were successful. You can also check the website; the image changes are immediate, if successful.
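Given how vague that error message is, it can pay to check a file yourself before uploading it. This sketch applies the format and file-size rules for both image methods. The names are made up, treating 1 KB as 1,024 bytes is an assumption, and so is accepting the .jpeg spelling alongside .jpg.

```python
import os

# File-size ceilings in KB for each image method.
LIMITS_KB = {"background": 800, "profile": 700}
# .jpeg is accepted here as an assumed alias for .jpg.
ALLOWED_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def check_image(filename, size_bytes, kind):
    """Raise ValueError if the file would likely be rejected.

    kind is "background" or "profile". Assumes 1 KB = 1,024 bytes.
    """
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image format: {extension or 'none'}")
    limit = LIMITS_KB[kind] * 1024
    if size_bytes >= limit:
        raise ValueError(f"file must be smaller than {LIMITS_KB[kind]} KB")


if __name__ == "__main__":
    check_image("avatar.png", 120 * 1024, "profile")
    print("avatar.png looks fine")
```

Pixel dimensions are left alone here, since oversized images are scaled down rather than rejected.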

Note

Even before another Twitter member reads your status update, chances are he’s looking at your Twitter avatar. This visual reminder of who you are is helpful when your readers are scanning through dozens of tweets at a time looking for something you wrote. The ability to create new programs to manage changes to this important part of your profile is a huge advancement for developers and for the people who will use their new tools. In my opinion, this particular addition to the Twitter API has the greatest potential to have an impact on Twitter culture and behavior.

Update the Delivery Device

Another configuration setting handled through the API is where you want status updates from the people you follow to be sent:

https://twitter.com/account/update_delivery_device.xml

This method requires one parameter, device, which accepts only three possible values. Once it is set, the authenticated user will be configured to receive status updates as text messages on a cell phone (sms), via a chat client (im), or not at all (none).

Note

In 2008, Twitter officially discontinued support of IM, which had been disabled for several months prior to the decision. It is not expected back in the foreseeable future, rendering the im value meaningless.

Routing your Twitter information stream through your mobile phone is an acquired taste. For some people, this is the only interface they use. For others, phone notifications provide a way to filter down the larger following list to just a handful of closely watched friends (as described in the next section).

The API response for a successful edit of the delivery device is the short-form profile data object for the authenticating user (see User Objects). If you omit the required parameters or pass along a value that isn’t one of the approved devices (say, device=cow), an error will be returned: “You must specify a parameter named ‘device’ with a value of one of: sms, im, none” (see Hash Objects).
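Catching a bad device value before it reaches the API saves a round trip. A minimal sketch, with a hypothetical helper name:

```python
from urllib.parse import urlencode

VALID_DEVICES = ("sms", "im", "none")


def device_body(device):
    """Build the POST body for account/update_delivery_device."""
    if device not in VALID_DEVICES:
        raise ValueError(f"device must be one of: {', '.join(VALID_DEVICES)}")
    return urlencode({"device": device})


if __name__ == "__main__":
    print(device_body("sms"))
```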

Turn On/Off Notification

If you decide you do want your information streamed to your cell phone, Twitter lets you customize which members of your following list are included:

https://twitter.com/notifications/follow/id.xml

This method will enable the notification setting for the individual user identified in id and will start sending that user’s status updates to the device set with the Update the Delivery Device method.

Note

Obviously, if the device setting is not sms, these individual notification settings are ignored.

Success is once again signified with XML containing the specified user’s short-form profile. If you attempt to add notifications for a user you do not follow, or one for whom you have already enabled them, an error will be reported: “There was a problem following the specified user” (see Hash Objects).

The evil twin to this method is Turn Off Notification, which disables notifications for the user identified in the request URL:

https://twitter.com/notifications/leave/id.xml

Device notification toggles can cover receipt of direct messages, too. There are two criteria for receiving direct messages from a specific user on your phone: you must have included that user in your notifications list by turning on notification, and direct messages must be selected in the Devices tab in the Settings section of the Twitter website. This latter control is buried in a pull-down menu that appears only after you successfully register your cell phone number with your Twitter account.

Note

Enabling and disabling notifications for a user is not the same as following or unfollowing that user. Rather, by setting up notifications you can essentially designate a subgroup of the larger list of people you follow (unless you include everyone) whose content you specifically want delivered to your phone as SMS messages.

API Administration

Yes, sending updates and messages to your many followers is fun, but sometimes you need to deal with “the meta,” or the info about the info. This section covers all the administrative and technical functions available through the Twitter API to help manage the application you are trying to debug.

Test

Sometimes coding is a nightmare. Things don’t work, and you don’t know why. When that happens, it helps to step back and check whether all your equipment is properly connected. Twitter helps you do this with a simple API method to ping the system:

https://twitter.com/help/test.xml

This method does nothing except return a simple response: an appropriate HTTP status code along with a Boolean (see Response Objects).

If you get a status code of 200, the problem is likely buried somewhere in your code. If you get an HTTP error, stop deconstructing your programming skills and figure out what’s keeping you from the API.

Verify Credentials

One of the more helpful methods to incorporate into your application is the credentials check:

https://twitter.com/account/verify_credentials.xml

This method returns a 200 status code if the username and password you plan to use with other methods are valid. The XML response is a user object for the authenticating user (see User Objects).

Note

This is a prime example of why checking the HTTP status code is a good idea. At the end of 2008, Twitter changed the output from a simple response object (<authorized>true</authorized>) to a full user profile. If your code was looking for that text at the time, the program probably started acting like the account didn’t verify.

If your authentication doesn’t work, for whatever reason, Twitter will let you know with an error message: “Could not authenticate you” (see Hash Objects).

Although you could figure out whether your username and password work simply by trying any method that requires authentication, this method is the preferred way of verifying credentials. It has less overhead than most of the other methods and is easier to use in the code logic.
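In code, the safest check keys off the status code and ignores the body entirely. The sketch below builds the header for the username-and-password style of authentication used by the cURL example in this chapter (HTTP Basic authentication, which is an assumption about what your client sends) and judges the result by status alone. Both function names are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (assumed auth scheme)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def credentials_ok(status_code):
    """Trust the HTTP status, not the body, which has changed before."""
    return status_code == 200


if __name__ == "__main__":
    print(basic_auth_header("yourusername", "yourpassword"))
    print(credentials_ok(401))
```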

Check Rate Limit Status

One of the other obstacles your application may encounter—especially if you expect to perform many data requests in rapid succession—is the error that says you have been cut off:

Rate limit exceeded. Clients may not make more than
    100 requests per hour

When that happens, you are at the mercy of the clock. The API won’t give you the things you want until the hour is up, leaving you and your users hanging.

Twitter created a method specifically to check on the status of your rate limit, providing a great tool to help you anticipate the problem and manage how your application will deal with the bad news:

https://twitter.com/account/rate_limit_status.xml

This method returns a short data object filled with numbers and dates (see Hash Objects) that will tell you how long you have to wait until you can start making API requests again. A common coding strategy is to check the rate limit status once at the beginning of the application run, and then keep a counter going to let PHP know when to stop bugging the API for data. This approach enables you to gracefully pause or halt the application before bumping into the error from Twitter.

This method doesn’t require authentication, but even if you do authenticate the request it does not count against the rate limit for the account you are checking. If you don’t provide user credentials, it returns the tally for the requesting IP address.

Note

Remember, yours is not likely to be the only application making use of a given user account. Clients such as Twitterrific, TweetDeck, and Twhirl function by hitting the API with each individual account.

It is good practice to follow a credentials check (Verify Credentials) with an initial rate limit check. If the account is already spent for the hour, you can gracefully exit the program and ask the user to try again later. Otherwise, you can count down the remaining hits to the API and let the user know, once the limit is hit, that the processing will take a while.
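The check-once-then-count strategy might look something like this. The sample XML is an assumption about the shape of the rate limit hash (see Hash Objects for the actual fields), and RequestBudget is a made-up class that keeps the running tally so your application knows when to stop.

```python
import xml.etree.ElementTree as ET

# Assumed response layout, for illustration only.
SAMPLE = """<hash>
  <remaining-hits type="integer">3</remaining-hits>
  <hourly-limit type="integer">100</hourly-limit>
  <reset-time-in-seconds type="integer">1240000000</reset-time-in-seconds>
</hash>"""


def parse_rate_limit(xml_text):
    """Pull the remaining hits and reset time out of the hash."""
    root = ET.fromstring(xml_text)
    return {
        "remaining": int(root.findtext("remaining-hits")),
        "reset_at": int(root.findtext("reset-time-in-seconds")),
    }


class RequestBudget:
    """Local countdown seeded from one rate limit check."""

    def __init__(self, remaining, reset_at):
        self.remaining = remaining
        self.reset_at = reset_at

    def spend(self):
        """Use up one request; return False when none are left."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True

    def seconds_until_reset(self, now):
        """How long to wait, given the current Unix time."""
        return max(0, self.reset_at - now)


if __name__ == "__main__":
    budget = RequestBudget(**parse_rate_limit(SAMPLE))
    while budget.spend():
        print("making a request;", budget.remaining, "left")
    print("out of requests for this hour")
```

Because other clients may be spending the same account's allowance, a long-running job should refresh the budget from the API now and then rather than trusting the local count forever.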

Note

Heavy request applications can use a direct message to send a new URL to the user as the last action when a long-running job has finished processing.

End a Member Session

For applications that manage user sessions—which are typical of publishing clients—Twitter created a method to clean up the access and make sure the authenticated user is logged out:

https://twitter.com/account/end_session.xml

Success is reported with the following XML message: “Logged out” (see Hash Objects).

Search

One of the more successful third-party development projects to make use of the Twitter API was Summize. This company, which began collecting data in spring 2008, created a search engine specifically for exploring the millions of tweets passing through Twitter each week. Summize also created its own API, which in turn spawned a few interesting new applications that tried to make sense of the Twitter content.

Summize was so good at what it did that Twitter acquired the company a few months after its launch. Now, Summize’s search API is being integrated into the original Twitter API as part of an overhaul of the system. At present, the Twitter search API is still effectively separate from the original Twitter API, as it was created under different rules. There are a number of nuanced differences that should go away with the next version of the Twitter API.

Note

The methods described in this book are expected to continue to function for at least six months after the release of the next version of the API, sometime in mid-2009.

The Twitter search API does not support XML. For the purposes of this book, we will use Atom as a substitute format.

This section deals with the Twitter Keyword Search method and the many optional parameters used to shape the results. We’ll also take a look at the Monitor Trends method.

Keyword Search

Ignoring for the moment the means by which you make the request, searching with the API is very similar to conducting any search through a website. You need to know what keywords you want to find and how you want the results returned to you.

Note

Twitter doesn’t limit use of the search API. It does monitor for abuse, however, and reacts to unusual activity on a case-by-case basis. If you encounter errors you can’t explain, contact Twitter to investigate.

The request method for tweet searches uses a slightly different URL, with the required q parameter added to the end as an encoded string:

http://search.twitter.com/search.atom?q=query

That q parameter is very powerful. Depending on what you put into it, it can be a great filter for the tweet corpus. Each query string has to be encoded for travel in the GET request.

Content searches

The query string can contain multiple keywords and symbols that are matched against tweet content. The most general of these is a standard keyword search. For example, if I want to find all of the many, many fans of the Chicago Bears during a big game, I can search for “Go Bears”:

http://search.twitter.com/search.atom?q=Go+Bears

The query string can also look for the special formatting patterns Twitter uses to interpret tweets as replies. Any authors referencing the user “aboy” will likely use the convention “@aboy” in the body of the tweet content, and all other messages can be filtered out with this query:

http://search.twitter.com/search.atom?q=%40aboy

Note

Even though the servers have no problem doing so, URL encoding of special characters can be difficult for humans to read. Here are a few symbols used in Twitter searches, with their encoded forms:

@ is encoded as %40
# is encoded as %23
? is encoded as %3F
: is encoded as %3A
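You rarely need to encode these symbols by hand. As a minimal sketch, Python's standard `urllib.parse.quote_plus` function produces the encoded forms listed above; the `search_url` helper is a hypothetical name used only for this illustration.

```python
from urllib.parse import quote_plus

SEARCH_URL = "http://search.twitter.com/search.atom"


def search_url(query):
    """Return a Keyword Search URL with the query encoded for a GET request."""
    return SEARCH_URL + "?q=" + quote_plus(query)


for raw in ("Go Bears", "@aboy"):
    print(search_url(raw))
```

Note that `quote_plus` is stricter than the hand-written examples in this chapter: it also encodes parentheses and double quotes, which servers decode back to the same characters.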

There is no special handling of a hash (#) in the way that @ signifies a reply, but Twitter does recognize the tagging convention in its advanced searches. The hashtag (#) is a convention imported from IRC and other Twitter ancestors to group content with other messages on the same topic. Its whole reason for being is to facilitate searching, tying similar content together. When the Bears again return to the Super Bowl, I can create a stream of tweets by searching for #superbowl:

http://search.twitter.com/search.atom?q=%23superbowl

It is certain to be a very long list...next year.

Note

#Hashtags tend to make #reading #tweets difficult and consume precious characters. Fortunately for my eyes, not everyone uses them. There is no #meta #channel, however, that would allow #you to #tag your tweets. #necessaryevil #hate #rant

Another common structure of interest to searchers is whether or not the tweet contains a question. Twitter will scan tweets for the presence of a ?, returning only status updates that contain that character. For example, this search looks for instances of Twitter members questioning existence:

http://search.twitter.com/search.atom?q=existence+%3F

One of the more innovative searches is for emoticons, such as :) and :(, that are often included in status updates to reflect the author’s mood. Including emoticons in the search request causes the API to return an interesting collection of statements that share an emotional state:

http://search.twitter.com/search.atom?q=sometimes+%3A)
http://search.twitter.com/search.atom?q=sometimes+%3A(

Meta filters

Content searches like these look only at what is in the tweet, not at the meta descriptions of the status updates. The advanced search in Twitter interprets several special formats of the query string to examine the network, location, composition, and creation date.

Every Twitter author updates her status from some location in the world. This reported location may be very accurate—smart phones can update geographic information as the user travels—or it may just be wherever the user initially set for her account location. A location is assigned to every tweet and can be leveraged in web search with the near: and within: filters to pinpoint tweets originating from a particular area:

http://search.twitter.com/search?q=slouching+near%3Abethlehem
http://search.twitter.com/search?q=pizza+near%3ABloomington%2C+Indiana+within%3A1mi

The Search API no longer recognizes these location filters, requiring the geocode parameter instead.

Twitter searches also easily identify tweets containing links, which make up a sizeable chunk of the corpus. Sometimes the published links are intentional, placed there purposely to direct readers to some interesting resource on the Web. In many cases, they are added automatically as part of integration with some other system, such as BrightKite. Using filter:links, you can filter a keyword search to return only the tweets containing links:

http://search.twitter.com/search.atom?q=fluffy+kitten+filter%3Alinks

The time and date a tweet was published are also search fodder. You’ve already seen how to use the since: filter as a parameter (since) in the original Twitter API methods. This filter identifies a specific point in time and asks the search API to return only matching tweets published after that date. Although there is no matching Twitter API parameter, searches can also look on the other side of that date using the until: filter. That will cause only the relevant tweets published before the specified date to be returned:

http://search.twitter.com/search.atom?q=Fail+Whale+since%3A2008-09-01
http://search.twitter.com/search.atom?q=olympics+until%3A2008-08-08

Note

The date filters accept the yyyy-mm-dd format.
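If your application stores dates as objects rather than strings, the filters can be generated instead of typed. The following is a small sketch in Python; `dated_query` is a hypothetical helper, and it relies on the fact that `date.isoformat()` yields the yyyy-mm-dd form.

```python
from datetime import date
from urllib.parse import quote_plus


def dated_query(keywords, since=None, until=None):
    """Append since:/until: filters (yyyy-mm-dd) to a keyword query."""
    parts = [keywords]
    if since is not None:
        parts.append("since:" + since.isoformat())
    if until is not None:
        parts.append("until:" + until.isoformat())
    # Encode the whole query string for use as the q parameter.
    return quote_plus(" ".join(parts))


print(dated_query("Fail Whale", since=date(2008, 9, 1)))
print(dated_query("olympics", until=date(2008, 8, 8)))
```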

Twitter makes an attempt to thread some tweets together using the @ reply convention. By virtue of this network of replies, the average twitterer can be connected to many individual tweets. You can get a list of those tweets by using from: and to: to identify a username:

http://search.twitter.com/search.atom?q=from%3Ahere
http://search.twitter.com/search.atom?q=to%3Aeternity

Note

Replies retrieved by a to:username search will not include any such user references occurring later in the tweet. Twitter recognizes the @username convention as signaling a reply only if it appears at the start of the message. For example, “@tilla rampages across the plain” is a reply, but “You are so rampaging across the plain, @tilla” is not.

This behavior will likely change as more people make use of the in_reply_to_status_id field Twitter added to the API in 2008.
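The start-of-message rule is easy to mirror in your own code when you need to separate replies from passing mentions. Here is a rough Python sketch; the pattern assumes usernames consist of letters, digits, and underscores, and `reply_target` is a hypothetical helper rather than anything Twitter provides.

```python
import re

# A reply begins with @username. The username character class
# (letters, digits, underscore) is an assumption of this sketch.
REPLY_PATTERN = re.compile(r"^@(\w+)")


def reply_target(text):
    """Return the username a tweet replies to, or None if it is not a reply."""
    match = REPLY_PATTERN.match(text)
    return match.group(1) if match else None


print(reply_target("@tilla rampages across the plain"))
print(reply_target("You are so rampaging across the plain, @tilla"))
```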

Operators

Twitter searches accept more traditional operators, too. The default for a multikeyword search is to look for tweets containing instances of all the keywords listed, but the keywords don’t have to appear next to each other or in the order given. This probably isn’t a problem if the keywords are unusual, but for more common words the proximity may be important. Fortunately, it’s also possible to specify an exact phrase to search for. These two searches return different results because of the exclusion and inclusion of wrapping quotes:

http://search.twitter.com/search.atom?q=Fail+Whale
http://search.twitter.com/search.atom?q="Fail+Whale"

To be more inclusive and look for tweets containing either “Fail” or “Whale,” include the Boolean operator OR between the keywords as part of the query string. This will expand on the first search and probably include tweets about failing a test and tweets about Greenpeace:

http://search.twitter.com/search.atom?q=Fail+OR+Whale

Sometimes a search word is too ambiguous; the tweets returned may match the same word, but used in different contexts. If for some reason you wanted to search for status updates involving bears, but not the great professional football team from Chicago, you could do so by specifying which additional terms to ignore in combination with “bears.” You would do this by using the - sign as a prefix to the offending words:

http://search.twitter.com/search.atom?q=Bears+-Chicago
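These operators are plain text inside the query, so they can be composed before the string is encoded. The helpers below are hypothetical names for a Python sketch of that idea, not part of any Twitter library.

```python
from urllib.parse import quote_plus


def exact_phrase(phrase):
    """Wrap a phrase in quotes so the words must appear together, in order."""
    return '"' + phrase + '"'


def any_word(*words):
    """Match tweets containing at least one of the words."""
    return " OR ".join(words)


def excluding(query, *unwanted):
    """Drop tweets that contain any of the unwanted words."""
    return " ".join([query] + ["-" + word for word in unwanted])


for query in (exact_phrase("Fail Whale"),
              any_word("Fail", "Whale"),
              excluding("Bears", "Chicago")):
    print(quote_plus(query))
```

The encoder turns the wrapping quotes into %22, which the server decodes back into the same quote characters.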

Details on all of these content searches, meta filters, and operators can be found in the advanced search interface on the Twitter Search site.

Incidentally, if the q parameter is empty, the search API will still return content—albeit just the header information (see Search Objects).

Optional parameters

In addition to the query string, the Keyword Search method accepts a number of optional parameters to help refine the search or adjust the way the results are returned. Some of these will be familiar to you from previous methods or from our examination of query string variations, and others will be new.

The standard Twitter parameters for navigation (page) and for pruning tweets to include only the most recent ones (since_id) operate in the same way as described earlier for other methods. The page size, however, is set by the rpp parameter, which indicates how many tweets should be returned per page. The maximum number allowed is 100, and the default is 15. You can change the size of this parameter depending on whether you want to display matches (smaller pages) or mine the corpus and save the results into a database (bigger pages).

Of growing use and importance is the geocode parameter. As iPhones propagate and location-centric tools grow in popularity, searches of Twitter content by location will increase. This optional parameter accepts specified values of latitude, longitude, and the radius of interest (in mi or km), and returns only the tweets written by authors whose locations match:

http://search.twitter.com/search.atom?geocode=31.12345%2C-88.98765%2C1km

This can no longer be done with the near: filter described earlier for web search. The Search API requires the geocode parameter to indicate location. It is often easier to post these search options as separate parameters rather than dealing with encoded multiparameter query strings.

The lang parameter accepts standard ISO 639-1 codes to indicate which languages should be included in the search results. This can be helpful, for example, to conduct English (en) or Spanish (es) language searches separately.

The content itself can be adjusted, too. When set to true, the show_user parameter adds username: to the start of every tweet. This is primarily useful when you display the results to an end user and need to show who published each tweet.
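Pulling the optional parameters together, a request builder might look something like the Python sketch below. The function name and its keyword arguments are my own for illustration; only the parameter names sent to Twitter (q, rpp, page, since_id, geocode, lang, show_user) come from the method itself.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"
MAX_RPP = 100  # largest page size the method allows


def build_search_url(q, rpp=None, page=None, since_id=None,
                     geocode=None, lang=None, show_user=False):
    """Assemble a Keyword Search URL, skipping parameters left unset."""
    if rpp is not None and not 1 <= rpp <= MAX_RPP:
        raise ValueError("rpp must be between 1 and %d" % MAX_RPP)
    params = {"q": q}
    optional = {"rpp": rpp, "page": page, "since_id": since_id,
                "geocode": geocode, "lang": lang}
    for name, value in optional.items():
        if value is not None:
            params[name] = value
    if show_user:
        params["show_user"] = "true"
    # urlencode handles the escaping of commas, colons, and spaces.
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url("pizza", rpp=100,
                       geocode="31.12345,-88.98765,1km", lang="en"))
```

Passing each option as its own parameter keeps the q string short and leaves the encoding to the library.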

Note

There is also a callback option intended for JSON development with JavaScript programming. The callback is a development feature that allows better integration with client-side applications. Since this book doesn’t work with JSON, we won’t go into details; just know that the search API does accommodate JavaScript developers.

When the search API returns results in the Atom format, the list of entries is wrapped in a feed object that provides information about the conditions of the search, including when it was done and links to duplicate it. Details about each matched tweet are encapsulated in an entry element (see Search Objects).
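To give a feel for working with that structure, here is a minimal Python sketch that reads a feed with the standard library. The sample document is hand-made for the illustration and is far smaller than a real response; only the feed and entry elements and the Atom namespace are taken as given, and the child elements shown are the generic Atom ones.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom namespace

# Hand-made sample, not actual Twitter output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Fail Whale - Twitter Search</title>
  <updated>2009-01-01T12:00:00Z</updated>
  <entry>
    <title>Saw the Fail Whale again today</title>
    <published>2009-01-01T11:59:00Z</published>
    <author><name>example_user</name></author>
  </entry>
</feed>"""


def parse_feed(xml_text):
    """Split an Atom search response into header info and a list of entries."""
    root = ET.fromstring(xml_text)
    header = {
        "title": root.findtext(ATOM + "title"),
        "updated": root.findtext(ATOM + "updated"),
    }
    entries = [
        {
            "text": entry.findtext(ATOM + "title"),
            "published": entry.findtext(ATOM + "published"),
            "author": entry.findtext(ATOM + "author/" + ATOM + "name"),
        }
        for entry in root.findall(ATOM + "entry")
    ]
    return header, entries


header, entries = parse_feed(SAMPLE_FEED)
print(header["title"], len(entries))
```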

Monitor Trends

Finally, the Twitter API includes a method that returns the top 10 descriptive keywords that are currently being used on Twitter. The response includes the time of the request, the name of each trending topic, and the URL to the Twitter search results page for that topic. Currently, the only supported format for this method is JSON. The callback parameter is supported, however.

The request URL for this method is:

http://search.twitter.com/trends.json
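Although the rest of this book sticks to XML, reading the trends response takes very little code. The Python sketch below works on a hand-made sample; the key names (`as_of`, `trends`, `name`, `url`) are assumptions chosen to match the fields described above, so check them against a live response before relying on them.

```python
import json

# Hand-made sample; the key names are assumptions for this sketch.
SAMPLE_TRENDS = """{
  "as_of": "Thu, 01 Jan 2009 12:00:00 +0000",
  "trends": [
    {"name": "Example Topic",
     "url": "http://search.twitter.com/search?q=Example+Topic"},
    {"name": "Fail Whale",
     "url": "http://search.twitter.com/search?q=Fail+Whale"}
  ]
}"""


def trending_topics(json_text):
    """Return the request time and a list of (name, url) pairs."""
    data = json.loads(json_text)
    topics = [(trend["name"], trend["url"]) for trend in data["trends"]]
    return data["as_of"], topics


as_of, topics = trending_topics(SAMPLE_TRENDS)
for name, url in topics:
    print(name, url)
```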

Note

Twitter continues to evolve. New methods will be added to the API in the future, some of them as replacements for functions discussed in this chapter, and the existing methods will keep changing as well. It is a good idea to check the Twitter API wiki routinely for new information. In the next chapter, we’ll take a closer look at the information that comes from making these requests to the API.

Other Data Options

There are a few other ways to get data out of Twitter, when the data flow coming out of the regular API is not fast enough.

The first option is to use the data mining feed, established specifically to help academics collect a large chunk of data more quickly than was possible with the View the Public Timeline method (see The Information Stream). The data mining feed caches 600 tweets every minute, expanding on what the public feed already provides.

There is no cost involved, but access to the feed is available only upon request. Researchers have to first provide a description of their projects and specify the IP addresses (you can submit more than one) where the API requests will originate. This information is sent in an email to api@twitter.com.

The shortcoming of the data mining feed is that it just provides more data, not a complete list of everything being published. For that, Twitter will begin offering the “firehose,” a stream of all public status updates on Twitter. Starting in 2009, this HTTP-push option will be made available to a select group of trusted projects, to test how it affects the stability of the service. Partners will be added on a case-by-case basis. Eventually, the firehose should support streams generated by a group of users; for example, you could connect to get near real-time tweets published by local members living in your hometown.

Note

Neither of these options (the data mining feed or the firehose) includes the private tweets published by authors with protected accounts. Authentication and permission from the author (in the form of acceptance of your follow request) are the only way to get this protected data out of the API.

One more way to get high-volume tweet data is through Gnip, a paid service that powers data streams for sites like MyBlogLog and Plaxo on the consumer end. Gnip is a pinging service, which means it contacts your application whenever the data you are interested in shows up in the stream. Twitter provides its public stream to Gnip, which means you can pay Gnip to get access to that data.

Gone Phishing: A Word About Passwords

One weakness of the Twitter API is its reliance on individual usernames and passwords to be able to access the gooey caramel center of Twitter data. In the winter of 2008–2009, a couple of incidents generated a lot of conversation about how authentication is used with the API.

In November 2008, Twitterank launched, jumping into a sea of existing Twitter applications. Like some other projects at the time, its purpose was to try to quantify Twitter members, this time by applying the PageRank algorithm (the Twitterank developer worked at Google). Twitterank’s reach spread quickly in the form of self-promoting tweets published by the web application as each new member looked up her ratings. It didn’t do anything more than show you a number, though (your Twitter rank); there was no explanation, attribution, or much effort put into inspiring confidence through design.

The backlash was quick and angry. People started complaining about the viral posts, not realizing they had granted permission to tweet by default. Fueled by Oliver Marks’s article about user gullibility on ZDNet (see Twitterank), a witch hunt began, 140 characters at a time. Passwords were changed. The programmer responded^[68] by devoting much of the next few days to quickly iterating the site to regain some credibility. In the end, Twitterank proved not to be a phishing site; it was just a fumbled launch to another Twitter development project.

Fast-forward about two months to the start of the new year. One day after launch, another startup application, twply, was quickly auctioned on SitePoint for about $1,200. This came about after some 800 people had willingly entered their usernames and passwords into the new site, which promised to turn @username replies into emails.^[69] Within days, a phishing scam propagated through the Twitosphere in the form of direct messages suggesting that friends follow a link to a site. This site was a replica of the Twitter login page, inviting users to submit their Twitter access information. Many speculated that this was all related, although the DNS address of the Twitter phishing site was registered to a location in China.^[70] Throw in a separate incident involving hacking of some celebrity accounts, and the Twitter community was a bit shaken.

Blogger Louis Gray posted a great reflection about the Twitterank panic, speculating on some worst-case scenarios for a successful phishing expedition on Twitter:

The downsides of somebody hacking into my Twitter account and getting my credentials are low to begin with. In theory, if my account were compromised, they could Tweet on my behalf and make me look like a fool for some time, until I managed to get to Twitter support. In the meantime, you’d be sure to hear about it, and I assume others would be vocal in my favor. Another concern would be if you or I used the same login and password combination on other services. The perpetrator could then guess your ID on other services, or even access your financial records or anything else sensitive. But again, given the other Twitter developers’ comments in regards to OAuth, I tend to believe this is something the coders are working around, and I don’t think this is a mass account grab.^[71]

I bring up all of this because it is vital that you, as a budding Twitter developer, be aware of both the cultural and technical implications of asking for someone’s Twitter credentials to use your application.

Due to the way the API was developed during its first few years, there is sometimes a functional need to have individual users provide their usernames and passwords. Authentication in the API is currently used for two purposes:

To impose rate limits on a given account (something that is also handled through IP addresses, in the absence of authentication to some of the more open methods)
To grant access to data that is otherwise not open to the public, namely changes to the content or settings of the account and viewing private member status updates
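It helps to see why this arrangement forces applications to handle raw passwords. Under HTTP Basic Auth, the scheme the API currently uses, the credentials travel in a request header that is merely Base64-encoded, not encrypted. The following Python sketch shows the standard construction; the function names are hypothetical, and the sample credentials are the well-known example from the HTTP authentication specification.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic Auth."""
    credentials = (username + ":" + password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value):
    """Recover the username and password from a Basic Auth header value."""
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic_auth(header))
```

Because the encoding is trivially reversible, any application that makes authenticated requests this way must hold the user's actual password, which is what makes the trust question so important.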

Although the twishing (Twitter phishing) meme lit a fire under the debate, the fact remains that hundreds of applications built using the API have had to deal with this issue of users providing their credentials (sometimes referred to as an “anti-pattern”). This isn’t something new, nor does it have to be crippling. However, what’s at stake does need to be understood.

Is Twishing Worth the Effort?

The list of things you can do with someone’s Twitter account information seems long. In practice, however, it is quite limited. Most of the content you can see when you log into the website is already available through the API without authenticating, and the things an imposter might change are mostly harmless. A hacker with stolen authentication could certainly annoy, and assuming the darkest intent, there are some security holes that might be attractive.

Warning

Several of the changes that can be made to your account can be done through the Twitter API, at the same time that your username and password are being compromised. That means that a strategy of changing your password immediately after testing a new web application won’t necessarily prevent damage from being done. It’s always a matter of trust.

The things one can do to a compromised Twitter account fall into a number of categories: I’ll call them Controlling, Invading, Stealthing, Screaming, Damaging, Annoying, and Deceiving. In the lists in the following sections, I’ve marked with [web-only] the functionality that is available only through the Twitter website, not through the API.

Controlling

The most obvious thing a hacker can do is deny you access to your own Twitter account. There are failsafes, however, in the form of Twitter’s great customer service and a historical willingness to help people reclaim their identities. Twitter keeps backups of your data, so even if you are blocked out for a day, there is nothing irreparable about it. Also, Twitter allows only one user account per email address, making this an unattractive target.

Malicious actions in this category include:

Changing your Twitter password [web-only]
Changing the email address associated with your account (the same address cannot be used for another account)

Warning

A caveat: the reality is that many people use the same username and password for all of their various logins. If you do this, a phisher could use your Twitter account details to access whatever funds are in your bank account. It’s always a good idea to use very different credentials for each of your more sensitive online accounts.

Invading

Some information isn’t available without your Twitter authentication. For example, if your account is compromised, people who protect their accounts and trust you enough to allow you to follow what they post will be vulnerable to the imposter coming in and capturing their tweets.

Note

This might be a good argument to eliminate the “protected” account option altogether. Technically, nothing shared is ever private.

Among the account settings, the most appealing bit of information is probably the phone number you use for your mobile updates. This could be a personal number that might not be widely distributed or even published elsewhere. A hacker could also access all of the private messages you have exchanged with others in your group. If you aren’t in the practice of deleting them, those messages might contain sensitive information about you or those with whom you converse.

Malicious actions in this category include:

Viewing direct messages you’ve sent and received
Viewing the timelines of protected account holders you follow
Viewing the mobile phone number you use for updates (if configured) [web-only]

Stealthing

Other changes a hacker could make are hidden from view. En masse, compromised accounts could be scanned for blocks on a spammer account, which could then be removed to force the users to follow that account. Not many people use favorites on Twitter, nor do these bookmarks have much value yet for third-party developers. However, if that changes, a hacker could target a few preferred tweets with marketing links and make sure they appear on everyone’s favorites list.

These actions are difficult to monitor, especially since they can all take place through API interactions, without the account holder knowing what is happening.

Actions in this category include:

Removing a block on another member
Favoriting a tweet
Removing a favorited tweet

Screaming

Of course, some attacks can be quite visible, such as the January 2009 hacks that posted prank status updates to the timelines of a couple of dozen Twitter celebrities.^[72]

Less obvious but equally visible would be gaining access to the compromised member’s profile page to upload a new background image. The hacker could also make changes to the web link or description stored in the member profile, or even change your profile picture (although that is much more likely to be noticed, and thus quickly corrected). Unless you visit your own profile web page regularly, you might not notice other changes to your page.

Malicious actions in this category include:

Changing user profile info, such as your description, web link, and full name
Posting a status update
Uploading a different avatar picture
Mucking with the theme design, color scheme, or background image of your Twitter profile web page
Changing your location

Damaging

An authenticated user can remove content. Deleting a recent public tweet may or may not be noticed, but odds are much greater that direct messages and old tweets won’t be reviewed as frequently. There isn’t much incentive for a phisher to do this—it’s the electronic equivalent of knocking down someone’s mailbox as you drive by. Presumably, Twitter could restore everything to an earlier state and fix the damage.

What might be more difficult to restore is damage to a reputation. For example, a specific member could be targeted with a massive blocking campaign using a number of stolen accounts, causing some reputation problems or damaging relationships.

Malicious actions in this category include:

Deleting a published tweet
Blocking another member
Deleting an existing direct message
Changing your follow network by following or unfollowing people

Annoying

If you targeted a specific user, an action such as turning off device notifications—or scheduling them to only show up at 2 a.m.—might be disruptive enough to cost that person sleep or make him miss a big meeting. Most of the changes in this category, however, are merely annoyances that would probably damage Twitter’s reputation more than the individual.

Malicious actions in this category include:

Toggling on or off the private status of the account [web-only]
Changing the device used for notifications
Changing whether direct messages are the only messages sent to the device [web-only]
Changing the sleep settings that temporarily disable device notifications during certain hours (e.g., at night while you’re sleeping) [web-only]
Adjusting the notification settings, which include when you get email, what kinds of replies you see, and whether you want to be “nudged” after a period of inactivity [web-only]

Deceiving

If there is one truly scary thing about a compromised Twitter account, it probably has nothing to do with posting to the public timeline or messing with account settings. The biggest weapon available to a hacker in a compromised account is trust.

The New Year’s phishing problems in early 2009 are a prime example of what can happen when someone communicates with your friends through direct messages. The probability that someone you know will click on a link because they think the recommendation is coming from you is quite high. A direct message is dripping with trust by definition, since the victim has chosen to follow you, and direct messages are comparatively rare and therefore stand out.

There’s only one malicious action in this category, but it’s a biggie:

Sending a direct message

OAuth Can Help

The API team has been working for a while on ways to improve the authentication scheme used to access Twitter data. The protocol talked about most often is OAuth, a way to secure API authorization with simple methods called by desktop and web applications.

Note

Twitter is expected to fully implement and support OAuth by the middle of 2009. The current system (HTTP Basic Auth) will go away in 2010.

OAuth is the product of a collaboration between former Twitter architect Blaine Cook (@blaine) and Chris Messina (@chrismessina), who spent the first part of 2007 trying to implement OpenID in the Twitter API. Others got involved as the group moved on to a review of industry authentication practices in systems such as Flickr, Amazon, and AOL. Finding no standard, they collaborated to write one. OAuth Core 1.0 was drafted in October 2007.

OAuth is a safer way for users to give applications access to the data and tools accessible via an API. User passwords don’t have to be spread around the Web, and developers can request only the access they need. This creates greater transparency about what each web application might do, and it also leads to less recklessness by users now in a position to be deliberate with the permissions they share.

The best analogy for this authentication scheme is the valet key:

Many luxury cars today come with a valet key. It is a special key you give the parking attendant and unlike your regular key, will not allow the car to drive more than a mile or two. Some valet keys will not open the trunk, while others will block access to your onboard cell phone address book. Regardless of what restrictions the valet key imposes, the idea is very clever. You give someone limited access to your car with a special key, while using your regular key to unlock everything.^[73]

The user has control over that key, granting access on Twitter to the other third-party sites requesting specific kinds of allowable actions. With OAuth, you don’t share your identity; you only share some of your content or details. For more information about OAuth, visit http://oauth.net.

What OAuth won’t do is protect people from themselves. It isn’t a cure-all. Phishing wasn’t invented on Twitter in 2009, after all. A phishing attack is an intentional, widespread attack that uses trust and carelessness as its weapons. I don’t want to downplay Twitter’s role in its phishing issues, but the worst thing that happened to people’s Twitter accounts was that some messages were delivered on their behalf by impersonators. Human nature did the rest.

That said, the popularity of Twitter web applications, coupled with a growing acceptance that giving away your account credentials is normal practice, makes phishing on Twitter that much easier. We are being trained to share our access details, and ultimately that is what disturbs some people most.

Warning

OAuth will not stop phishing attacks. Attempts to gain usernames and passwords through deception are problems that plague any popular system, from Facebook to PayPal to your local bank. OAuth can give access away to malicious systems; it just does so with more granularity over which parts of the account someone else can access.

Twitter will help improve the situation by taking steps to reduce the need for such authentication. In fact, that’s already happening. Third-party developers can help, too, by working their code away from situations where they need to ask for users’ passwords. Developers can also help the greater Twitter community by assuming some responsibility for the quality of the tools they release into the world.

Note

The suite of sample applications described in this book does ask for passwords. The code is meant to be “instructional, not productional.” As you use it to help you build your own web applications, keep these issues of security and trust in mind.

^[62]The private beta of Twitter’s implementation of OAuth launched in early 2009. It is expected to be tested throughout the first half of the year and incorporated into the next release of the API. The current HTTP Basic Auth will be deprecated six months after OAuth becomes fully supported.

^[63]From the February 12, 2009 blog article, “Never Share Your Twitter Password Again,” published on the Inuda blog (http://blog.inuda.com/2009/02/12/never-share-your-twitter-password-again/).

^[64]Abraham Williams’ sample PHP code can be downloaded at http://github.com/poseurtech/twitteroauth.

^[65]From a December 19, 2008 tweet (https://twitter.com/al3x/status/1068021673).

^[66]Bruno Peeters estimated in 2008 that 10–15% of all accounts were protected, based on his work tracking TwitDir membership numbers (http://twitterfacts.blogspot.com/2008/03/1-million-twitter-users.html), but growth in the Twitter membership base is diluting those estimates. My own research suggests that figure is between 3 and 9%.

^[67]On November 11, 2008, Blair Bends used this security hole in the API to track down the identity of the person who had posted status update 1,000,000,000. Nathan Reed had launched a countdown site several weeks prior to that date, anticipating the moment when Twitter served up ID number one billion. Although this almost certainly was not the actual billionth tweet, the milestone did attract some attention in the Twitter community. For more information on this milestone, read my blog account at http://www.blogschmog.net/2008/11/11/a-billion-served/.

^[68]From the November 13, 2008 blog article “Some follow up...”, by Ryo Chijiiwa, published on the Twitterank blog (http://twitterank.wordpress.com/2008/11/13/some-follow-up/).

^[69]From the January 9, 2009 blog article “The Curious Case of Twitter and Twply,” by Joshua Porter, published on Bokardo (http://bokardo.com/archives/the-curious-case-of-twply-and-twitter/).

^[70]From the January 3, 2009 blog article “Phishing Scam Spreading on Twitter,” by Chris Pirillo (http://chris.pirillo.com/phishing-scam-spreading-on-twitter/).

^[71]From the November 12, 2008 blog article “Twitterank Can Have My Password, No Questions Asked,” by Louis Gray (http://www.louisgray.com/live/2008/11/twitterank-can-have-my-password-no.html).

^[72]This particular hack was not the result of phishing. The celebrity accounts were compromised thanks to a Twitter employee’s poorly chosen password, cracked in a dictionary attack. More information can be found in the January 6, 2009 blog article “Weak Password Brings ‘Happiness’ to Twitter Hacker” by Kim Zetter, published in Wired (http://blog.wired.com/27bstroke6/2009/01/professed-twitt.html).

^[73]From “What is it for?” on http://oauth.net/about.
