What combinations of search syntaxes will and will not fly in your Google search?
There was a time when you couldn’t “mix” Google’s special syntaxes [Section 1.5]—you were limited to one per query. And while Google released ever more powerful special syntaxes, not being able to combine them for their composite power stunted many a search.
This has since changed. While there remain some syntaxes that you just can’t mix, there are plenty to combine in clever and powerful ways. A thoughtful combination can do wonders to narrow a search.
The syntaxes that request special information (stocks: [Hack #18], bphonebook:, and phonebook: [Hack #17]) are all antisocial syntaxes. You can’t mix them and expect to get a reasonable result.
The other antisocial syntax is the link: syntax. The link: syntax shows you which pages have a link to a specified URL. Wouldn’t it be great if you could specify what domains you wanted the pages to be from? Sorry, you can’t. The link: syntax does not mix.
For example, say you want to find out what pages link to O’Reilly & Associates, but you don’t want to include pages from the .edu domain. The query
link:www.oreilly.com -site:edu
will not work, because the link: syntax doesn’t mix with anything else. Well, that’s not quite correct. You will get results, but they’ll be for the phrase “link www.oreilly.com” from domains that are not .edu.
If you want to search for links and exclude the domain
.edu, you have a couple of options. First, you
can scrape the list of results [Hack #44]
and sort it in a spreadsheet to remove
.edu domain results. If you want to try it
through Google, however, there’s no command that
will absolutely work. This one’s a good one to try:
inanchor:oreilly -inurl:oreilly -site:edu
This search looks for the word O’Reilly in anchor text, the text that’s used to define links. It excludes pages whose URL contains the string oreilly (e.g., oreilly.com). And, finally, it excludes pages that come from the .edu domain.
But this type of search is nowhere near complete. It only finds those links to O’Reilly whose anchor text includes the string oreilly; if someone creates a link like <a href="…">… Book</a>, it won’t be found by the query above. Furthermore, the -inurl:oreilly term also throws out pages on other domains that merely contain the string oreilly, including pages that link to oreilly.com but aren’t on oreilly.com themselves.
You could alter the string slightly, to omit the oreilly.com site
itself, but not other sites containing the string oreilly:
inanchor:oreilly -site:oreilly.com -site:edu
But you’d still be including many O’Reilly sites that aren’t at O’Reilly.com.
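The first option mentioned earlier, filtering a scraped list of linking pages yourself rather than asking Google to do it, can be sketched in a few lines of Python instead of a spreadsheet. This is only an illustration: the function name drop_suffix and the sample URLs are made up for the example, and how you obtain the list of results is a separate matter [Hack #44].

```python
from urllib.parse import urlparse

def drop_suffix(urls, suffix="edu"):
    """Return the URLs whose host does not end in the given domain suffix."""
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Match the suffix only on a label boundary, so that a host such as
        # education.example.com is not mistaken for a .edu host.
        if host == suffix or host.endswith("." + suffix):
            continue
        kept.append(url)
    return kept

# Hypothetical scraped results, for illustration only.
linking_pages = [
    "http://www.example.edu/reading-list.html",
    "http://www.example.com/books.html",
    "http://cs.example.edu/~student/links.html",
    "http://www.example.org/perl/",
]

for page in drop_suffix(linking_pages):
    print(page)
```

Because the filtering happens after the fact, it works on results from a plain link: query, which is exactly the case Google's own syntaxes can't handle.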
So what does mix? Pretty much everything else, but there’s a right way and a wrong way to do it.
Don’t overuse single syntaxes, as in:
site:com site:edu
While you might think you’re asking for results from either .com or .edu sites, what you’re actually saying is that results should come from both simultaneously. Obviously a single result can come from only one domain. Take the example perl site:edu site:com. This search will get you exactly zero results. Why? Because a result page cannot come from a .edu domain and a .com domain at the same time. If you want results from .edu and .com domains only, rephrase your search like this:
perl (site:edu | site:com)
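If you assemble queries in a script, the same rule (join alternative sites with | inside parentheses rather than listing them side by side) can be captured in a small helper. The sketch below is an assumption about how you might structure such a helper; any_site is a name invented for this example, not part of any Google tool.

```python
def any_site(keywords, sites):
    """Build a query that accepts results from any one of several sites."""
    terms = [f"site:{site}" for site in sites]
    if not terms:
        return keywords
    if len(terms) == 1:
        # A single site needs no grouping.
        return f"{keywords} {terms[0]}"
    # Several sites are alternatives, so OR them together with | in parentheses.
    return f"{keywords} ({' | '.join(terms)})"

print(any_site("perl", ["edu", "com"]))  # perl (site:edu | site:com)
```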
Don’t use allinurl: or allintitle: when mixing syntaxes. It takes a careful hand not to misuse these in a mixed search. Instead, stick to inurl: or intitle:. If you don’t put allinurl: in exactly the right place, you’ll create odd search results. Let’s look at this example:
allinurl:perl intitle:programming
At first glance it looks like you’re searching for the string “perl” in the result URL, and the word “programming” in the title. And you’re right, this will work fine. But what happens if you move allinurl: to the right of the query?
intitle:programming allinurl:perl
This won’t get any results. Stick to inurl: and intitle:, which are much more forgiving of where you put them in a query.
Don’t use so many syntaxes that you get too narrow, like:
intitle:agriculture site:ucla.edu inurl:search
You might find that it’s too narrow to give you any useful results. If you’re trying to find something that’s so specific that you think you’ll need a narrow query, start by building a little bit of the query at a time. Say you want to find plant databases at UCLA. Instead of starting with the query:
intitle:plants site:ucla.edu inurl:database
Try something simpler:
databases plants site:ucla.edu
and then try adding syntaxes to keywords you’ve already established in your search results:
intitle:plants databases site:ucla.edu
intitle:database plants site:ucla.edu
For example, say you want to get an idea of what databases are offered by the state of Texas. Run this search:
intitle:search intitle:records site:tx.us
You’ll find 32 very targeted results. And of course, you can narrow down your search even more by adding keywords:
birth intitle:search intitle:records site:tx.us
It doesn’t seem to matter if you put plain keywords at the beginning or the end of the search query; I put them at the beginning, because they’re easier to keep up with.
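The build-it-up-gradually approach lends itself to a small query builder that puts plain keywords first and appends one syntax at a time. What follows is a minimal sketch under that assumption; build_query is a hypothetical helper, and it only produces the query string, leaving it to you to run the search.

```python
def build_query(keywords, **syntaxes):
    """Join plain keywords, then each syntax:value pair, into one query string."""
    parts = list(keywords)
    for name, value in syntaxes.items():
        # Accept a single value or a list of values for a repeated syntax.
        values = [value] if isinstance(value, str) else value
        parts.extend(f"{name}:{v}" for v in values)
    return " ".join(parts)

# Start broad, then promote a keyword to a syntax, then narrow further.
print(build_query(["databases", "plants"], site="ucla.edu"))
# databases plants site:ucla.edu
print(build_query(["databases"], intitle="plants", site="ucla.edu"))
# databases intitle:plants site:ucla.edu
print(build_query(["birth"], intitle=["search", "records"], site="tx.us"))
# birth intitle:search intitle:records site:tx.us
```

Keeping the keywords at the front mirrors the habit described above; since their position doesn't seem to matter to Google, the choice is purely for readability.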
The site: syntax, unlike site syntaxes on other search engines, allows you to get as general as a domain suffix (site:com) or as specific as a domain or subdomain (site:thomas.loc.gov). So if you’re looking for records in El Paso, you can swap site:tx.us for a more specific site: value covering El Paso’s own domain, and you’ll get seven results.
Sometimes you’ll want to find a certain type of information, but you don’t want to narrow by type. Instead, you want to narrow by theme of information—say you want help or a search engine. That’s when you need to search in the URL.
The inurl: syntax will search for a string in the URL but won’t count finding it within a larger word. So, for example, if you search for inurl:research, Google will not find pages whose URL contains the string only as part of a longer word, but it would find pages from http://www.research-councils.ac.uk.
For example, the query
intitle:biology inurl:help
takes you to a manageable 162 results. The whole point is to get a number of results that finds you what you need but isn’t so large as to be overwhelming. If you find 162 results overwhelming, you can easily add the site: syntax to the search and limit your results to university sites:
intitle:biology inurl:help site:edu
But beware of using so many special syntaxes, as I mentioned above, that you detail yourself into no results at all.
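The whole-word behavior of inurl: described above can be approximated in code, which may help when you're predicting whether a given URL would match. Treat this strictly as a rough model: splitting the URL on punctuation is an assumption made for the example, not Google's documented behavior, and the second URL is a hypothetical one on a reserved example domain.

```python
import re

def inurl_matches(term, url):
    """Rough model: True if the term appears as a whole token in the URL."""
    # Assumption: treat every run of non-alphanumeric characters as a separator.
    tokens = re.split(r"[^a-z0-9]+", url.lower())
    return term.lower() in tokens

print(inurl_matches("research", "http://www.research-councils.ac.uk"))  # True
print(inurl_matches("research", "http://www.researchers.example.com"))  # False
```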
It’s possible that I could write down every possible syntax-mixing combination and briefly explain how they might be useful, but if I did that, I’d have no room for the rest of the hacks in this book.
Experiment. Experiment a lot. Keep in mind constantly that most of these syntaxes do not stand alone, and you can get more done by combining them than by using them one at a time.
Depending on what kind of research you do, different patterns will emerge over time. You may discover that focusing on only PDF documents (filetype:pdf) finds you the results you need. You may discover that you should concentrate on specific file types in specific domains (a filetype: term combined with site:tompeters.com). Mix up the syntaxes in as many ways as are relevant to your research and see what you get.