In addition to the basic AND, OR, and phrase searches, Google offers some rather extensive special syntax for narrowing your searches.
As a full-text search engine, Google indexes entire web pages instead of just titles and descriptions. Additional commands, called special syntax or advanced operators, let Google users search specific parts of web pages for specific types of information. This comes in handy when you’re dealing with more than eight billion web pages and need every opportunity to narrow your search results. Specifying that your query words must appear only in the title or URL of a returned web page is a great way to make your results very specific without making your keywords themselves too specific. The special syntax elements are described below, ordered by common usage and function.
Tip
Some of these syntax elements work well in combination. Others fare not quite as well. Still others do not work at all. For detailed discussion on what does and does not mix, see “Mixing Syntax,” below.
-
intitle:
intitle: restricts your search to the titles of web pages. The variation allintitle: finds pages wherein all the words specified appear in the title of the web page. Using allintitle: is basically the same as using intitle: before each keyword.
intitle:"george bush"
allintitle:"money supply" economics
You may wish to avoid the allintitle: variation, because it doesn’t mix well with some of the other syntax elements.
-
intext:
intext: searches only body text (i.e., ignores link text, URLs, and titles). While its uses are limited, it’s perfect for finding query words that might be too common in URLs or link titles.
intext:"yahoo.com"
intext:html
There’s an allintext: variation, but again, this doesn’t play well with others.
-
inanchor:
inanchor: searches for text in a page’s link anchors. A link anchor is the descriptive text of a link. For example, the link anchor in the HTML code <a href="http://www.oreilly.com">O'Reilly Media</a> is “O’Reilly Media.”
inanchor:"tom peters"
As with other in*: syntax elements, there’s an allinanchor: variation, which works in a similar way (i.e., all the keywords specified must appear in a page’s link anchors).
-
site:
site: allows you to narrow your search by either a site or a top-level domain. The AltaVista search engine, by contrast, has two syntax elements for this function (host: and domain:), but Google has only the one.
site:loc.gov
site:thomas.loc.gov
site:edu
site:nc.us
Be aware that site: is no good for trying to search for a page that exists beneath the main or default site (i.e., in a subdirectory such as /~sam/album/). For example, if you’re looking for something below the main GeoCities site, you can’t use site: to find all the pages in http://www.geocities.com/Heartland/Meadows/6485/; Google returns no results. Use inurl: instead.
-
inurl:
inurl: restricts your search to the URLs of web pages. This syntax tends to work well for finding search and help pages, because they tend to be rather regular in composition. An allinurl: variation finds all the words listed in a URL but doesn’t mix well with some other special syntax.
inurl:help
allinurl:search help
You’ll see that using the inurl: query instead of the site: query has one immediate advantage: you can use it to search subdirectories.
Tip
While the http:// prefix in a URL is ignored by Google when used with site:, search results come up short when including it in an inurl: query. Be sure to remove prefixes in any inurl: query for the best (read: any) results.
You can also use inurl: in combination with the site: syntax to draw out information on subdomains. For example, how many subdomains does oreilly.com really have? A quick query will help you figure that out:
site:oreilly.com -inurl:www.oreilly.com
This query asks Google to list all pages from the oreilly.com domain, but leave out those pages which are from the common subdomain www, since you already know about that one.
-
link:
link: returns a list of pages linking to the specified URL. Enter link:www.google.com and you’ll get a list of pages that link to the Google home page, http://www.google.com (not anywhere in the google.com domain). Don’t worry about including the http:// bit; you don’t need it and, indeed, Google appears to ignore it even if you do put it in. link: works just as well with “deep” URLs (http://www.raelity.org/apps/blosxom/, for instance) as with top-level URLs such as raelity.org.
-
cache:
cache: finds a copy of the page that Google indexed even if that page is no longer available at its original URL or has since changed its content completely.
cache:www.yahoo.com
If Google returns a result that appears to have little to do with your query, you’re almost sure to find what you’re looking for in the latest cached version of the page at Google.
The Google cache is particularly useful for retrieving a previous version of a page that changes often.
-
daterange:
daterange: limits your search to a particular date or range of dates on which a page was indexed. It’s important to note that a daterange: search has nothing to do with when a page was created; it depends only on when the page was indexed by Google. So a page created on February 2 but not indexed by Google until April 11 would turn up in a daterange: search for April 11.
"Geri Halliwell" "Spice Girls" daterange:2450958-2450968
For an in-depth treatment of finding content either by the date it was created or when it was first noticed by Google, see [Hack #16].
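The numbers in the daterange: example are Julian day numbers, a running count of days that daterange: expects in place of calendar dates. As a minimal sketch, a calendar date can be converted in Python along the following lines. The function names julian_day and daterange_term are hypothetical helpers invented for illustration; they are not part of any Google tooling.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (0001-01-01 is day 1)
# and the Julian day number of the same calendar date.
JULIAN_DAY_OFFSET = 1721425


def julian_day(day: date) -> int:
    """Return the integer Julian day number for a calendar date."""
    return day.toordinal() + JULIAN_DAY_OFFSET


def daterange_term(start: date, end: date) -> str:
    """Build a daterange: term covering start through end, inclusive."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


if __name__ == "__main__":
    # May 24, 1998 through June 3, 1998, the range used in the example.
    print(daterange_term(date(1998, 5, 24), date(1998, 6, 3)))
```

Under these assumptions, May 24, 1998 through June 3, 1998 produces daterange:2450958-2450968, the range shown in the example.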
-
filetype:
filetype: searches the suffixes or filename extensions. These are usually, but not necessarily, different file types; filetype:htm and filetype:html will give you different result counts, even though they’re the same file type. You can even search for different page generators (such as ASP, PHP, CGI, and so forth), presuming the site isn’t hiding them behind redirection and proxying. Google indexes several different Microsoft formats, including PowerPoint (.ppt), Excel (.xls), and Word (.doc).
homeschooling filetype:pdf
"leading economic indicators" filetype:ppt
-
related:
related:, as you might expect, finds pages that are related to the specified page. This is a good way to find categories of pages; a search for related:google.com returns a variety of search engines, including Lycos, Yahoo!, and Northern Light.
related:www.yahoo.com
related:www.cnn.com
Not every page has related pages, although this is an increasingly rare occurrence.
-
info:
info: provides a page of links to more information about a specified URL. This information includes a link to the URL’s cache, a list of pages that link to the URL, pages that are related to the URL, and pages that contain the URL.
info:www.oreilly.com
info:www.nytimes.com/technology
Note that this information is dependent on whether Google has indexed the specified URL; if not, information will obviously be far more limited.
-
phonebook:
phonebook:, as you might expect, looks up phone numbers.
phonebook:John Doe CA
phonebook:(510) 555-1212
The phonebook is covered in detail in [Hack #6].
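Queries that use the special syntax are plain strings, so they are easy to assemble in a script. The sketch below is a hypothetical helper, not a Google API: the names operator_term, inurl_term, and search_url are invented for illustration, and the https://www.google.com/search address is assumed to be the ordinary web search form. It quotes multiword values, strips the http:// prefix that the inurl: tip above warns about, and URL-encodes the finished query.

```python
from urllib.parse import urlencode


def operator_term(operator: str, value: str, exclude: bool = False) -> str:
    """Format one special-syntax term, such as site:oreilly.com."""
    # Quote multiword values so the operator applies to the whole phrase.
    if " " in value:
        value = f'"{value}"'
    prefix = "-" if exclude else ""
    return f"{prefix}{operator}:{value}"


def inurl_term(url: str, exclude: bool = False) -> str:
    """Format an inurl: term, dropping any http:// or https:// prefix."""
    for scheme in ("http://", "https://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
            break
    return operator_term("inurl", url, exclude)


def search_url(*terms: str) -> str:
    """Join terms into one query and URL-encode it (assumed endpoint)."""
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


if __name__ == "__main__":
    # The subdomain query from the inurl: entry.
    subdomains = [
        operator_term("site", "oreilly.com"),
        inurl_term("http://www.oreilly.com", exclude=True),
    ]
    print(" ".join(subdomains))
    print(search_url(*subdomains))
    print(operator_term("intitle", "george bush"))
```

Keeping each term as a separate string makes it simple to add or drop one operator at a time, which helps when testing which syntax elements mix well.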