Date-Range Searching

An undocumented but powerful feature of Google’s search and API is the ability to search within a particular date range.

Before delving into the actual use of date-range searching, there are a few things you should understand. The first is this: a date-range search has nothing to do with the creation date of the content and everything to do with the indexing date of the content. If I create a page on March 8, 1999, and Google doesn’t get around to indexing it until May 22, 2002, for the purposes of a date-range search, the date in question is May 22, 2002.

The second thing is that Google can index a page several times, and the date attached to the page can change when it does. Whether the daterange: timestamp actually changes depends on whether the content of the page has changed since it was last indexed. So don’t count on a date-range search staying consistent from day to day.

Third, Google doesn’t “stand behind” the results of a search done using the date-range syntaxes. So if you get a weird result, you can’t complain to them. Google would rather you use the date-range options on their advanced search page, but that page allows you to restrict your results only to the last three months, six months, or year.

The daterange: Syntax

Why would you want to search by daterange:? There are several reasons:

  • It narrows down your search results to fresher content. Google might find some obscure, out-of-the-way page and index it only once. Two years later this obscure, never-updated page is still turning up in your search results. Limiting your search to a more recent date range will result in only the most current of matches.

  • It helps you dodge current events. Say John Doe sets a world record for eating hot dogs and immediately afterward rescues a baby from a burning building. Less than a week after that happens, Google’s search results are going to be filled with John Doe. If you’re searching for information on (another) John Doe, babies, or burning buildings, you’ll scarcely be able to get rid of him.

    However, you can avoid Mr. Doe’s exploits by setting the date-range syntax to before the hot dog contest. This also works well for avoiding recent, heavily covered news events such as a crime spree or a forest fire, as well as recurring events of at least national importance such as national elections or the Olympics.

  • It allows you to compare results over time; for example, you can count occurrences of “Mac OS X” and “Windows XP” in successive date ranges.

    Of course, a count like this isn’t foolproof; indexing dates change over time. But generally it works well enough that you can spot trends.

Using the daterange: syntax is as simple as:

daterange:startdate-enddate

The catch is that both dates must be expressed as Julian dates, a continuous count of days since noon UTC on January 1, 4713 BC. Because the count turns over at noon, a calendar day that starts at midnight lands on a half: July 8, 2002 begins at Julian date 2452463.5 and May 22, 1968 at 2439998.5. Google isn’t fond of decimals in its daterange: queries, so use only integers: 2452463 or 2452464 (depending on whether you prefer to round down or up) for July 8, 2002.

Tip

There are plenty of places you can convert Julian dates online. We’ve found a couple of nice converters at the U.S. Naval Observatory Astronomical Applications Department (http://aa.usno.navy.mil/data/docs/JulianDate.html) and Mauro Orlandini’s home page (http://www.tesre.bo.cnr.it/~mauro/JD/), the latter converting either Julian to Gregorian or vice versa. More may be found via a Google search for julian date (http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=julian+date).
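If you would rather script the conversion than use an online converter, the arithmetic is small. The following is a minimal Python sketch, not anything Google provides: the function names are our own, and it assumes the date is a Gregorian calendar date taken at midnight UTC, which is how the .5 values above arise.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (date.toordinal())
# and the Julian date at the midnight UTC that starts the same day.
# Sanity check: date(2000, 1, 1).toordinal() == 730120, and that midnight
# is Julian date 2451544.5.
MIDNIGHT_OFFSET = 1721424.5


def julian_date(d: date) -> float:
    """Julian date at 00:00 UTC on calendar day d (always ends in .5)."""
    return d.toordinal() + MIDNIGHT_OFFSET


def julian_int(d: date, round_up: bool = False) -> int:
    """Whole-number Julian date for daterange:, rounded down by default."""
    whole = int(julian_date(d))  # truncation equals floor for positive values
    return whole + 1 if round_up else whole


def from_julian_int(jd: int) -> date:
    """Calendar day whose midnight Julian date rounds down to jd."""
    return date.fromordinal(jd - 1721424)


print(julian_date(date(2002, 7, 8)))                # 2452463.5
print(julian_int(date(2002, 7, 8)))                 # 2452463
print(julian_int(date(2002, 7, 8), round_up=True))  # 2452464
print(julian_date(date(1968, 5, 22)))               # 2439998.5
print(from_julian_int(2452463))                     # 2002-07-08
```

Rounding down is the default here only because it keeps the integer arithmetic simple; either choice is off by half a day, so pad your ranges by a day if the boundary matters.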

You can use the daterange: syntax with most other Google special syntaxes [Section 1.5]. The exceptions are the link: syntax, which doesn’t mix well with other special syntaxes [Hack #8], and Google’s Special Collections [Chapter 2] (e.g., stocks: and phonebook:).

daterange: does wonders for narrowing your search results. Let’s look at a couple of examples. Geri Halliwell left the Spice Girls around May 27, 1998. If you wanted to get a lot of information about the breakup, you could try doing a date search in a ten-day window, say May 25 to June 4. That query would look like this:

"Geri Halliwell" "Spice Girls" daterange:2450958-2450968

At this writing, you’ll get about two dozen results, including several news stories about the breakup. If you wanted to find less formal sources, search for Geri or Ginger Spice instead of Geri Halliwell.
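A query like this one can be assembled from ordinary calendar dates. The sketch below is illustrative only: daterange_query and julian_int are hypothetical helper names, and rounding each date down is an assumption chosen because it reproduces the numbers in the query above.

```python
from datetime import date


def julian_int(d: date) -> int:
    # Midnight at the start of d is d.toordinal() + 1721424.5;
    # dropping the .5 rounds down to a whole Julian date.
    return d.toordinal() + 1721424


def daterange_query(terms: str, start: date, end: date) -> str:
    """Append a daterange: restriction covering start through end."""
    if end < start:
        raise ValueError("end date precedes start date")
    return f"{terms} daterange:{julian_int(start)}-{julian_int(end)}"


print(daterange_query('"Geri Halliwell" "Spice Girls"',
                      date(1998, 5, 25), date(1998, 6, 4)))
# "Geri Halliwell" "Spice Girls" daterange:2450958-2450968
```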

That example’s a bit on the silly side, but you get the idea. Anything that clearly divides time into before and after (an event, a death, an overwhelming change in circumstances) can be reflected in a date-range search.

You can also use an individual event’s date to change the results of a larger search. For example, former ImClone CEO Sam Waksal was arrested on June 12, 2002. You don’t have to search for the name Sam Waksal to get a very narrow set of results for June 13, 2002, the day after the arrest:

imclone daterange:2452439-2452439

Similarly, if you search for imclone before the date of 2452439, you’ll get very different results. And as an interesting exercise, try a search that reflects the arrest, only date it a few days before the actual arrest:

imclone investigated daterange:2452000-2452435

This is a good way to find information or analysis that predates the actual event, but that provides background that might help explain the event itself. (Unless you use the date-range search, usually this kind of information is buried underneath news of the event itself.)
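Both kinds of window, a single day and a run-up that stops short of the event, can be expressed relative to the event date. This is a rough sketch with hypothetical function names. The gap and span values, and the choice to round the single day up and the run-up down, are assumptions picked only because they reproduce the two ImClone queries above.

```python
from datetime import date, timedelta


def julian_int(d: date, round_up: bool = False) -> int:
    # Midnight starting d is d.toordinal() + 1721424.5; round either way.
    return d.toordinal() + 1721424 + (1 if round_up else 0)


def single_day_query(terms: str, day: date, round_up: bool = False) -> str:
    """Restrict a search to one Julian day number."""
    jd = julian_int(day, round_up)
    return f"{terms} daterange:{jd}-{jd}"


def run_up_query(terms: str, event: date, gap_days: int, span_days: int) -> str:
    """Window of span_days that ends gap_days before the event."""
    end = event - timedelta(days=gap_days)
    start = end - timedelta(days=span_days)
    return f"{terms} daterange:{julian_int(start)}-{julian_int(end)}"


arrest = date(2002, 6, 12)
print(single_day_query("imclone", arrest + timedelta(days=1), round_up=True))
# imclone daterange:2452439-2452439
print(run_up_query("imclone investigated", arrest, gap_days=2, span_days=435))
# imclone investigated daterange:2452000-2452435
```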

But what about narrowing your search results based on content creation date?

Searching by Content Creation Date

Searching for materials based on content creation is difficult. There’s no standard date format (score one for Julian dates), many people don’t date their pages anyway, some pages don’t contain date information in their header, and some content management systems routinely stamp pages with today’s date, confusing things still further.

We can offer a few suggestions for searching by content creation date. Try adding a string of common date formats to your query. If you wanted something from May 2003, for example, you could try appending:

("May * 2003" | "May 2003" | 05/03 | 05/*/03)

A query like that uses up most of your ten-word limit, however, so it’s best to be judicious, perhaps by cycling through these formats one at a time. If any one of these is giving you too many results, try restricting your search to the title tag of the page.
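Generating the format variants by hand gets tedious if you repeat the search for several months. The following sketch produces the four formats shown above for any month, either as one combined clause or as separate queries to try in turn. The function names are our own, the month names are hard-coded in English, and the search term in the usage lines is just a placeholder.

```python
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def month_variants(year, month):
    """The four month-and-year formats used in the example above."""
    name = MONTHS[month - 1]
    mm = f"{month:02d}"       # zero-padded month, e.g. 05
    yy = f"{year % 100:02d}"  # two-digit year, e.g. 03
    return [f'"{name} * {year}"', f'"{name} {year}"',
            f"{mm}/{yy}", f"{mm}/*/{yy}"]


def combined_query(terms, year, month):
    """All variants ORed together in one query (uses many query words)."""
    clause = " | ".join(month_variants(year, month))
    return f"{terms} ({clause})"


def one_at_a_time(terms, year, month):
    """One shorter query per variant, to be tried in sequence."""
    return [f"{terms} {variant}" for variant in month_variants(year, month)]


print(combined_query("imclone", 2003, 5))
# imclone ("May * 2003" | "May 2003" | 05/03 | 05/*/03)
for query in one_at_a_time("imclone", 2003, 5):
    print(query)
```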

If you’re feeling really lucky you can search for a full date, like May 9, 2003. You then have to decide whether to search for the date in that format or in one of its many variations: 9 May 2003, 9/5/2003, 9 May 03, and so forth. Exact-date searching will severely limit your results and shouldn’t be used except as a last-ditch option.

When using date-range searching, you’ll have to be flexible in your thinking, more general in your search than you otherwise would be (because the date-range search will narrow your results down a lot), and persistent in your queries because different dates and date ranges will yield very different results. But you’ll be rewarded with smaller result sets that are focused on very specific events and topics.
