There was a time when you couldn’t mix Google’s special syntax elements; you were limited to one per query. Even as Google released ever more powerful special syntax elements, not being able to combine them for their composite power stunted many a search.
This has since changed. While there remain some syntax elements that you just can’t mix, there are plenty to combine in clever and powerful ways. A thoughtful combination can do wonders to narrow a search.
There are some simple rules to follow when mixing syntax elements. These, for the most part, revolve around how not to mix.
Don’t mix syntax elements that will cancel each other out, such as:
site:ucla.edu -inurl:ucla
Here you’re saying you want all results to come from ucla.edu, but that those results should not have the string “ucla” in their URLs. Obviously, that’s not going to produce many results.
Don’t overuse single syntax elements, as in:
site:com site:edu
While you might think you’re asking for results from either .com or .edu sites, what you’re actually saying is that site results should come from both simultaneously. Obviously, a single result can come from only one domain. Take the example perl site:edu site:com. This search will get you exactly zero results. Why? Because a result page cannot come from a .edu domain and a .com domain at the same time. If you want results from .edu and .com domains only, rephrase your search like this:
perl (site:edu | site:com)
With the pipe character (|), you’re specifying that you want results to come either from the .edu or the .com domain.
Don’t use allinurl: or allintitle: when mixing syntax. It takes a careful hand not to misuse these in a mixed search. Instead, stick to inurl: or intitle:. If you don’t put allinurl: in exactly the right place, you’ll create odd search results. Let’s look at this example:
allinurl:perl intitle:programming
At first glance, it looks like you’re searching for the string “perl” in the result URL and the word “programming” in the title. And you’re right: this will work fine. But what happens if you move allinurl: to the right of the query?
intitle:programming allinurl:perl
This won’t bring any results. Stick to inurl: and intitle:, which are much more forgiving of where you put them in a query.
The same advice goes for allintext: and allinanchor:.
Don’t use so much syntax that you get too narrow, like:
intitle:agriculture site:ucla.edu inurl:search
You might find that it’s too narrow to give you any useful results. If you’re trying to find something so specific that you think you’ll need a narrow query, start by building a little bit of the query at a time. Say you want to find plant databases at UCLA. Instead of starting with the query:
intitle:plants site:ucla.edu inurl:database
Try something simpler:
databases plants site:ucla.edu
and then try adding syntax to keywords you’ve already established in your search results:
intitle:plants databases site:ucla.edu
or:
intitle:database plants site:ucla.edu
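The build-it-up approach above can be sketched in a few lines of Python. This is only an illustration: `build_query` and `promote` are hypothetical helper names, not part of any Google tool, and they do nothing more than assemble the query string you would type into the search box.

```python
def build_query(*terms):
    """Join plain keywords and (operator, value) pairs into one query string."""
    parts = []
    for term in terms:
        if isinstance(term, tuple):
            operator, value = term
            parts.append(f"{operator}:{value}")
        else:
            parts.append(term)
    return " ".join(parts)


def promote(terms, keyword, operator):
    """Return a copy of terms with one plain keyword turned into operator:keyword."""
    return [(operator, t) if t == keyword else t for t in terms]


# Start simple: plain keywords plus a single site: restriction.
base = ["databases", "plants", ("site", "ucla.edu")]
print(build_query(*base))

# Then attach syntax to a keyword that already works in the simple query.
narrowed = promote(base, "plants", "intitle")
print(build_query(*narrowed))
```

Keeping the terms in a list until the last moment makes it easy to try one extra restriction at a time and back it out again if the results dry up.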
If you’re trying to narrow down search results, the intitle: and site: syntax elements are your best bet.
For example, say you want to get an idea of what databases are offered by the state of Texas. Run this search:
intitle:search intitle:records site:tx.us
You’ll find something on the order of 30 very targeted results. And, of course, you can narrow down your search even more by adding keywords:
birth intitle:search intitle:records site:tx.us
It doesn’t seem to matter whether you put plain keywords at the beginning or the end of the search query; I put them at the beginning because they’re easier to keep up with.
The site: syntax, unlike site syntax on other search engines, allows you to get as general as a domain suffix (site:com) or as specific as a domain or subdomain (site:thomas.loc.gov). So if you’re looking for records in El Paso, you can use this query:
intitle:records site:el-paso.tx.us
and you’ll get approximately one result.
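To make the range of site: restrictions concrete, here is a minimal Python sketch of how such a restriction could be checked against a hostname. It assumes matching is done on whole dot-separated labels from the right; that is an assumption for illustration, not a description of how Google actually implements it, and `matches_site` is a hypothetical name.

```python
def matches_site(hostname, restriction):
    """Return True if hostname falls under a site:-style restriction.

    Assumption: the restriction must match whole dot-separated labels
    at the right-hand end of the hostname.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    site_labels = restriction.lower().strip(".").split(".")
    return host_labels[-len(site_labels):] == site_labels


print(matches_site("thomas.loc.gov", "gov"))             # suffix only
print(matches_site("thomas.loc.gov", "thomas.loc.gov"))  # full subdomain
print(matches_site("www.el-paso.tx.us", "tx.us"))        # state-level domain
print(matches_site("example.com", "ple.com"))            # partial label, no match
```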
Sometimes you want to find a certain type of information, but you don’t want to narrow by type. Instead, you want to narrow by theme of information (e.g., you want help or a search engine). That’s when you need to search in the URL.
The inurl: syntax will search for a string in the URL but won’t count finding it within a larger word. So, for example, if you search for inurl:research, Google will not find pages from http://www.researchbuzz.com, but it will find pages from www.research-councils.ac.uk.
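The whole-word behavior described above can be approximated in Python. This sketch assumes a URL is broken into words at every character that is not a letter or digit; Google’s real tokenization is not documented here, so treat `inurl_matches` (a hypothetical name) as a rough model only.

```python
import re


def inurl_matches(term, url):
    """Approximate inurl: by matching term against whole words in the URL.

    Assumption: words are runs of letters and digits, so dots, slashes,
    and hyphens all act as separators.
    """
    tokens = re.split(r"[^a-z0-9]+", url.lower())
    return term.lower() in tokens


# "research" is buried inside the larger word "researchbuzz".
print(inurl_matches("research", "http://www.researchbuzz.com"))
# The hyphen makes "research" a word of its own.
print(inurl_matches("research", "www.research-councils.ac.uk"))
```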
Say you want to find information on neurosurgery, with an emphasis on learning or assistance. Try:
intitle:neurosurgery inurl:help
This returns a more manageable 880 or so results. The whole point is to get a number of results that finds what you need but isn’t so large as to be overwhelming. If you find 880 results overwhelming, you can easily mix the site: syntax into the search and limit your results to universities:
intitle:neurosurgery inurl:help site:edu
Beware, however, of using too much special syntax. As mentioned earlier, you can quickly detail yourself into no results at all.
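If you build queries in a script, the “how not to mix” rules from earlier in this hack can be turned into a rough sanity check. The Python sketch below is an assumption-laden illustration: `lint_query` is a hypothetical helper, it only understands the parenthesized pipe form of OR, and it checks just three of the pitfalls discussed (several ANDed site: terms, a site: term contradicted by -inurl:, and allin* operators mixed with anything else).

```python
import re

# operator:value terms, with an optional leading minus for exclusion.
TERM = re.compile(r"(-?)\b([a-z]+):([^\s()|]+)")


def lint_query(query):
    """Return warnings for mixing patterns this section advises against."""
    warnings = []

    # Ignore parenthesized OR groups such as (site:edu | site:com).
    outside = re.sub(r"\([^)]*\|[^)]*\)", " ", query)
    terms = TERM.findall(outside)

    sites = [value for minus, op, value in terms if op == "site" and not minus]
    if len(sites) > 1:
        warnings.append("multiple site: terms are ANDed; group them with |")

    excluded = [value for minus, op, value in terms if op == "inurl" and minus]
    for site in sites:
        for text in excluded:
            if text in site:
                warnings.append(f"site:{site} conflicts with -inurl:{text}")

    all_terms = TERM.findall(query)
    if len(all_terms) > 1 and any(op.startswith("allin") for _, op, _ in all_terms):
        warnings.append("allin* operators are position-sensitive; prefer the in* forms")

    return warnings


for q in ("site:ucla.edu -inurl:ucla",
          "perl site:edu site:com",
          "perl (site:edu | site:com)",
          "intitle:programming allinurl:perl"):
    print(q, "->", lint_query(q))
```

A check like this can’t tell you whether a query is too narrow; only running the search and looking at the result count will do that.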
The antisocial syntax elements are the ones that won’t mix and should be used individually for maximum effect. If you try to use them with other syntax elements, you won’t get any results.
The syntax elements that request special information (stocks:, rphonebook:, bphonebook:, and phonebook:) are all antisocial. That is, you can’t mix them and expect to get a reasonable result.
The other antisocial syntax is link:, which shows pages that have a link to a specified URL. Wouldn’t it be great if you could specify what domains you want the pages to be from? Sorry, you can’t. The link: syntax does not mix with anything else, not even plain old keywords.
For example, say you want to find out what pages link to O’Reilly Media, Inc., but you don’t want to include pages from the .edu domain. The query link:www.oreilly.com -site:edu will not work because the link: syntax does not work in combination. Well, that’s not quite correct; you will get results, but they’ll be for the phrase “link:www.oreilly.com” from domains that are not .edu.
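The antisocial rules above are simple enough to check mechanically before you send a query. The following Python sketch is one way to do it, using only the operator names listed in this section; `antisocial_conflicts` is a hypothetical helper, and its term splitting (on whitespace) is a simplifying assumption.

```python
# Operators that request special information, as listed in this section.
SPECIAL_INFO = {"stocks", "rphonebook", "bphonebook", "phonebook"}


def antisocial_conflicts(query):
    """Return the antisocial operators that are mixed with something else."""
    operators = []
    plain = []
    for term in query.split():
        name, sep, value = term.lstrip("-").partition(":")
        if sep and name.isalpha() and value:
            operators.append(name)
        else:
            plain.append(term)

    problems = []
    # link: tolerates neither other operators nor plain keywords.
    if "link" in operators and (len(operators) > 1 or plain):
        problems.append("link")
    # The special-information operators should not meet other operators.
    for name in operators:
        if name in SPECIAL_INFO and len(operators) > 1:
            problems.append(name)
    return problems


print(antisocial_conflicts("link:www.oreilly.com -site:edu"))
print(antisocial_conflicts("link:www.oreilly.com"))
```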
If you want to search for links and exclude the .edu domain, there’s no single command that will absolutely work. This one’s a good try, though:
inanchor:oreilly -inurl:oreilly -site:edu
This search looks for the word “oreilly” in anchor text, the text that’s used to define links; excludes pages that contain “oreilly” in the URL (e.g., oreilly.com); and, finally, excludes those pages that come from the .edu domain.
But this type of search is nowhere near complete. It finds only those links to O’Reilly that include the string “oreilly”: if someone creates a link such as <a href="http://perl.oreilly.com/">Camel Book</a>, it won’t be found by the preceding query. Furthermore, there are other domains that contain the string “oreilly,” and there may be domains that link to “oreilly” that contain the string “oreilly” but aren’t oreilly.com. You could alter the string slightly, to omit the oreilly.com site itself but not other sites containing the string “oreilly”:
inanchor:oreilly -site:oreilly.com -site:edu
However, you would still be including many O’Reilly sites—XML.com and MacDevCenter.com, for instance—that aren’t at oreilly.com.
While it is possible to write down every syntax-mixing combination and briefly explain how they might be useful, there wouldn’t be room for much else in this book.
Experiment. Experiment a lot. Constantly keep in mind that most of these syntax elements do not stand alone, and you can get more done by combining them than by using them individually.
Depending on what kind of research you are doing, different patterns will emerge over time. For example, you may discover that focusing on only PDF documents (filetype:pdf) finds you the results you need. You may discover that you should concentrate on specific file types in specific domains (filetype:ppt site:tompeters.com). Mix up the syntax in as many ways as is relevant to your research and see what you get.
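If you like to experiment systematically, a few lines of Python can list every file type and domain pairing for you to try by hand. This is just a sketch: `query_variants` is a hypothetical helper, and the keyword “leadership” is made up for the example.

```python
from itertools import product


def query_variants(keywords, filetypes, sites):
    """Build one query per (filetype, site) pairing for the same keywords."""
    base = " ".join(keywords)
    return [f"{base} filetype:{ft} site:{site}"
            for ft, site in product(filetypes, sites)]


# "leadership" is a placeholder keyword, not taken from the text.
for query in query_variants(["leadership"], ["pdf", "ppt"], ["tompeters.com", "edu"]):
    print(query)
```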
As with anything else, the more you use Google’s special syntax, the more natural it will become to you. And Google is constantly adding more, much to the delight of regular web combers.
If, however, you want something more structured and visual than a single query line, Google’s Advanced Search should fit the bill.