Not all sites have their own search engines, and even the ones that do are sometimes difficult to use. Complicated or incomplete search engines are more pain than gain when attempting to search through archives of published articles. If you follow a couple of rules, Google is handy for finding back issues of published resources.
The trick is to use a common phrase to find the information you’re looking for. Let’s use the New York Times as an example.
Your first intuition when searching for previously published articles might be to simply use site:nytimes.com in your Google query. For example, if I wanted to find articles on George Bush, why not use:
"george bush" site:nytimes.com
This will indeed find you all articles mentioning George Bush published on NYTimes.com. What it won’t find is all the articles produced by the New York Times but republished elsewhere.
While doing research, keep credibility firmly in mind. If you’re doing casual research, maybe you don’t need to double-check a story to make sure it actually comes from the New York Times, but if you’re researching a term paper, double-check the veracity of every article you find that isn’t actually on the New York Times site.
What you actually want is a clear identifier, no matter the site of origin, that an article comes from the New York Times. Copyright disclaimers are perfect for the job. A New York Times copyright notice typically reads:
Copyright 2001 The New York Times Company
Of course, this would only find articles from 2001. A simple workaround is to replace the year with a Google full-word wildcard [Hack #13]:
Copyright * The New York Times Company
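To make the wildcard idea concrete, here is a minimal Python sketch that checks whether a piece of text carries the notice regardless of year. It is an approximation, not a description of how Google matches internally: the function name `wildcard_phrase_to_regex` is hypothetical, and it assumes each `*` stands for exactly one whitespace-delimited word.

```python
import re

def wildcard_phrase_to_regex(phrase):
    """Turn a phrase containing * placeholders into a compiled regex.

    Assumption: each * stands for exactly one whitespace-delimited word,
    which only approximates Google's full-word wildcard.
    """
    parts = [r"\S+" if word == "*" else re.escape(word) for word in phrase.split()]
    return re.compile(r"\s+".join(parts), re.IGNORECASE)

notice = wildcard_phrase_to_regex("Copyright * The New York Times Company")

# Any year fills the wildcard slot.
print(bool(notice.search("Copyright 2001 The New York Times Company")))
print(bool(notice.search("Copyright 1999 The New York Times Company")))
# With no word in the wildcard slot, the phrase does not match.
print(bool(notice.search("Copyright The New York Times Company")))
```

A check like this can also help with the credibility point above: it confirms that a republished page's text actually contains the notice you searched for, though it proves nothing about whether the notice is genuine.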
Let’s try that George Bush search again, this time using the snippet of copyright disclaimer instead of the site: syntax:
"Copyright * The New York Times Company" "George Bush"
At this writing, you get over three times as many results for this search as for the earlier attempt.
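If you build such queries often, a small helper can assemble them. The following is a sketch only: `disclaimer_query` and `google_search_url` are hypothetical names, and the URL form (a `q` parameter on `https://www.google.com/search`) is an assumption about how you want to open the query in a browser.

```python
from urllib.parse import urlencode

def disclaimer_query(disclaimer, term):
    """Combine a quoted disclaimer phrase with a quoted search term."""
    return f'"{disclaimer}" "{term}"'

def google_search_url(query):
    """Build a browser URL for the query (assumed URL form, see lead-in)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = disclaimer_query("Copyright * The New York Times Company", "George Bush")
print(query)
print(google_search_url(query))
```

Keeping the query text separate from the URL encoding means the same string can be pasted straight into the Google search box.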
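The same approach works for other publications, though the wording of the notice varies. A Scientific American copyright disclaimer, for example, includes the text:
Scientific American, Inc. All rights reserved.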
(The date appears before the disclaimer, so I just dropped it to avoid having to bother with wildcards.)
Using that disclaimer as a quote-delimited phrase along with a search word (hologram, for example) yields the following query:
hologram "Scientific American, Inc. All rights reserved."
At this writing, you’ll get one result, which seems like a small number for a general query like hologram. When you get fewer results than you’d expect, fall back on using the site: syntax to go back to the originating site.
In this example, you’ll find several results that you can grab from Google’s cache but are no longer available on the Scientific American site.
Most publications that I’ve come across have some kind of common text string that you can use when searching Google for their archives. Usually it’s a copyright disclaimer, and most often it appears at the bottom of a page. Use Google to search for that string along with whatever query words you’re interested in. If that doesn’t work, fall back on searching for the query words and the publication’s domain name.
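The two-step strategy can be sketched in Python as follows. Everything here is illustrative: the function names are hypothetical, `example.com` is a placeholder domain rather than any publication's real one, and `count_results` is a caller-supplied function, because this sketch does not assume any particular way of getting result counts from Google.

```python
def archive_queries(term, disclaimer, domain):
    """Return the queries in the order the text suggests trying them.

    Assumption: term is a single word; quote multi-word terms yourself.
    """
    return [
        f'{term} "{disclaimer}"',  # disclaimer phrase: also finds republished copies
        f"{term} site:{domain}",   # fallback: restrict to the originating site
    ]

def first_productive_query(queries, count_results, min_results=5):
    """Return the first query with at least min_results hits, else the last one."""
    for query in queries:
        if count_results(query) >= min_results:
            return query
    return queries[-1]

queries = archive_queries(
    "hologram",
    "Scientific American, Inc. All rights reserved.",
    "example.com",  # placeholder domain, not the publication's real one
)

def fake_count(query):
    """Stand-in counter with made-up numbers, for demonstration only."""
    return 10 if "site:" in query else 1

print(first_productive_query(queries, fake_count))
```

The `min_results` threshold is arbitrary. In practice, "fewer results than you'd expect" is a judgment call that depends on how general the query word is.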