Google Hacks
Google Hacks, Third Edition: Tips & Tools for Finding and Using the World's Information
By Rael Dornfest, Paul Bausch, Tara Calishain
August 2006
Pages: 543



Table of Contents

Chapter 1: Web
Google’s front page is deceptively simple: a search form and a couple of buttons. Yet that basic interface—so alluring in its simplicity—belies the power of the Google engine underneath and the wealth of information at its disposal. If you use Google’s search syntax to its fullest, the Web is your oyster.
Searching in Google doesn’t have to be a case of just entering what you’re looking for in the search box and hoping for the best. Google offers you many ways—via special syntax and search options—to refine your search criteria and help Google better understand what you’re looking for. We’ll dig into Google’s powerful, all-but-undocumented special syntax and search options, and show how to use them to their fullest. We’ll cover the basics of Google searching, wildcards, word limits, syntax for special cases, mixing syntax elements, advanced search techniques, and using specialized vocabularies, including slang and jargon.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Google Web Search Basics
Whenever you search for more than one keyword at a time, a search engine has a default strategy for handling and combining those keywords. Can those words appear individually anywhere in a page, or do they have to be right next to each other? Will the engine search for both keywords or for either keyword?
Google defaults to searching for occurrences of your specified keywords anywhere in the page, whether side by side or scattered throughout. To return the results of pages containing specifically ordered words, enclose them in quotes, turning your keyword search into a phrase search, to use Google’s terminology.
On entering a search for the keywords:
to be or not to be
Google will find matches where the keywords appear anywhere on the page. If you want Google to find you matches where the keywords appear together as a phrase, surround them with quotes, like this:
"to be or not to be"
Google will return only matches in which those words appear together (not to mention explicitly including stop words such as “to” and “or”; see the section “Explicit Inclusion” a little later).
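To make the distinction concrete, here is a minimal Python sketch that imitates the two behaviors on a local string. It only illustrates the idea and says nothing about how Google actually matches pages; the function names keyword_match and phrase_match and the two sample sentences are hypothetical, and stop-word handling is ignored.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_match(query, page):
    """True if every query word appears somewhere in the page."""
    page_words = set(tokenize(page))
    return all(word in page_words for word in tokenize(query))

def phrase_match(phrase, page):
    """True if the query words appear side by side, in order."""
    words, page_words = tokenize(phrase), tokenize(page)
    n = len(words)
    return any(page_words[i:i + n] == words
               for i in range(len(page_words) - n + 1))

# Hypothetical sample "pages"
scattered = "Not sure whether to go or to stay? It may be best to wait."
quoted = "Hamlet asks: to be or not to be, that is the question."

print(keyword_match("to be or not to be", scattered))  # words are all present
print(phrase_match("to be or not to be", scattered))   # but not as a phrase
print(phrase_match("to be or not to be", quoted))      # here they are
```

The first sample contains every keyword but never in the right order, so it satisfies the keyword search and fails the phrase search.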
Phrase searches are also useful when you want to find a phrase but aren’t quite sure of the exact wording. This is accomplished in combination with wildcards, explained later in the chapter in “Full-Word Wildcards.”
Whether an engine searches for all keywords or any of them depends on what is called its Boolean default. Search engines can default to Boolean AND (searching for all keywords) or Boolean OR (searching for any keywords). Of course, even if a search engine defaults to searching for all keywords, you can usually give it a special command to instruct it to search for any keyword. Lacking specific instructions, the engine falls back on its default setting.
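The difference between the two defaults can be sketched in a few lines of Python. This is a toy model over an in-memory dictionary, not a description of any real engine; the search function and the sample pages are made up for illustration.

```python
def search(pages, query, mode="AND"):
    """Return the keys of pages matching all (AND) or any (OR) query words."""
    combine = all if mode == "AND" else any
    words = query.lower().split()
    return [key for key, text in pages.items()
            if combine(word in text.lower().split() for word in words)]

# Hypothetical three-page "index"
pages = {
    "a": "snowblower repair manual",
    "b": "green bay snowblower sale",
    "c": "honda lawn mower repair",
}

print(search(pages, "snowblower repair", "AND"))  # only pages with both words
print(search(pages, "snowblower repair", "OR"))   # pages with either word
```

With AND, only page "a" qualifies; with OR, all three pages do, because each contains at least one of the words.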
Google’s Boolean default is AND, which means that if you enter query words without modifiers, Google will search for all your query words. For example, if you search for:
Additional content appearing in this section has been removed.
Full-Word Wildcards
Some search engines support a technique called stemming, in which you add a wildcard character—usually * (asterisk) but sometimes ? (question mark)—to part of your query, requesting the search engine to return variants of that query using the wildcard as a placeholder for the rest of the word. For example, moon* would find moons, moonlight, moonshot, etc.
Google doesn’t support explicit stemming. It didn’t use to support stemming at all, but now it implicitly stems for you. So, canine dietary will yield results for dog diet, diets, and other variations on the theme.
Google does offer a full-word wildcard. While a wildcard can’t stand in for part of a word, you can insert a wildcard (Google’s wildcard character is *) into a phrase, and the wildcard will act as a substitute for one full word. Searching for three * mice, therefore, finds three blind mice, three blue mice, three green mice, etc.
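One way to picture the “one full word” rule is to translate such a phrase into a regular expression and run it over some local text. The sketch below does that in Python; it is only an analogy for the behavior described above, and wildcard_to_regex is a hypothetical helper, not anything Google provides.

```python
import re

def wildcard_to_regex(phrase):
    """Build a regex in which each * matches exactly one whole word."""
    parts = [r"\w+" if token == "*" else re.escape(token)
             for token in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

pattern = wildcard_to_regex("three * mice")
text = "Three blind mice. Three very blind mice. Three blue mice!"

# "Three very blind mice" is skipped: the single * cannot cover two words.
print(pattern.findall(text))
```

Because the wildcard stands for exactly one word, the two-word gap in “Three very blind mice” does not match; a pattern built from "three * * mice" would be needed for that.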
What good is the full-word wildcard? It’s certainly not as useful as stemming, but then again, it’s not as confusing to the beginner. * is a stand-in for one word; ** signifies two words, and so on. The full-word wildcard comes in handy in the following situations:
  • Checking the frequency of certain phrases and derivatives of phrases, such as: intitle:"methinks the * doth protest too much" and intitle:"the * of Seville" (intitle: is described next in “Special Syntax”).
  • Filling in the blanks on a fitful memory. Perhaps you remember only a short string of song lyrics; search using only what you remember rather than randomly reconstructed full lines.
    Let’s take as an example the disco anthem “Good Times” by Chic. Consider the following line: “You silly fool, you can’t change your fate.”
Additional content appearing in this section has been removed.
Special Syntax
In addition to the basic AND, OR, and phrase searches, Google offers some rather extensive special syntax for narrowing your searches.
As a full-text search engine, Google indexes entire web pages instead of just titles and descriptions. Additional commands, called special syntax, or advanced operators, let Google users search specific parts of web pages for specific types of information. This comes in handy when you’re dealing with more than eight billion web pages and need every opportunity to narrow your search results. Specifying that your query words must appear only in the title or URL of a returned web page is a great way to narrow your results without making your keywords themselves too specific. Following are descriptions of the special syntax elements, ordered by common usage and function.
Some of these syntax elements work well in combination. Others don’t fare quite as well. Still others do not work at all. For a detailed discussion of what does and does not mix, see “Mixing Syntax” later in this chapter.
intitle:
intitle: restricts your search to the titles of web pages. The variation allintitle: finds pages in which all the specified words appear in the title of the web page. Using allintitle: is basically the same as using intitle: before each keyword:
intitle:"george bush"
allintitle:"money supply" economics
You may wish to avoid the allintitle: variation because it doesn’t mix well with some of the other syntax elements.
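If you’d rather sidestep allintitle: altogether, the equivalence described above suggests a mechanical rewrite. Here is a rough Python sketch of it; expand_allintitle is a hypothetical helper, and it assumes the whole query is a single allintitle: clause with quoted phrases in double quotes.

```python
import shlex

def expand_allintitle(query):
    """Rewrite 'allintitle:a "b c"' as 'intitle:a intitle:"b c"'."""
    prefix = "allintitle:"
    if not query.startswith(prefix):
        return query  # nothing to rewrite
    # shlex keeps a quoted phrase together as one term (and drops the quotes)
    terms = shlex.split(query[len(prefix):])
    # put the quotes back around multiword phrases
    requoted = [f'"{term}"' if " " in term else term for term in terms]
    return " ".join(f"intitle:{term}" for term in requoted)

print(expand_allintitle('allintitle:"money supply" economics'))
```

The rewritten query uses only intitle:, so it avoids the allintitle: variation and whatever trouble it has mixing with other syntax.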
intext:
intext: searches only body text (i.e., it ignores link text, URLs, and titles). While its uses are limited, it’s perfect for finding query words that might be too common in URLs or link titles:
intext:"yahoo.com"
intext:html
Additional content appearing in this section has been removed.
Mixing Syntax
There was a time when you couldn’t mix Google’s special syntax elements; you were limited to one per query. Even as Google released ever more powerful special syntax elements, not being able to combine them for their composite power stunted many a search.
This has since changed. While there remain some syntax elements that you just can’t mix, there are plenty to combine in clever and powerful ways. A thoughtful combination can do wonders to narrow a search.
There are some simple rules to follow when mixing syntax elements. These, for the most part, revolve around how not to mix:
  • Don’t mix syntax elements that will cancel each other out, such as:
      site:ucla.edu -inurl:ucla
    Here, you’re saying you want all results to come from ucla.edu, but that no result should have the string “ucla” in its URL. Obviously, that’s not going to produce many URLs.
  • Don’t overuse single syntax elements, as in:
      site:com site:edu
    While you might think you’re asking for results from either .com or .edu sites, what you’re actually saying is that site results should come from both simultaneously. Obviously, a single result can come from only one domain. Take the example perl site:edu site:com. This search will get you exactly zero results. Why? Because a result page cannot come from a .edu domain and a .com domain at the same time. If you want results from .edu and .com domains only, rephrase your search like this:
      perl (site:edu | site:com)
    With the pipe character (|), you specify that you want results to come either from the .edu or the .com domain.
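If you build queries in a script, the pipe rule is easy to get right once and reuse. The following Python sketch shows one way to do it; any_site is a hypothetical helper name, and it does nothing more than assemble the query string shown above.

```python
def any_site(keywords, domains):
    """Build a query limited to results from any one of several domains."""
    if not domains:
        return keywords
    sites = " | ".join(f"site:{domain}" for domain in domains)
    if len(domains) > 1:
        sites = f"({sites})"  # group the alternatives so they stay together
    return f"{keywords} {sites}"

print(any_site("perl", ["edu", "com"]))
print(any_site("perl", ["edu"]))
```

Joining the site: elements with the pipe rather than a space is what keeps the query from asking for a page that lives in two domains at once.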
Additional content appearing in this section has been removed.
Advanced Search
Google’s default simple search allows you to do quite a bit, but not everything. Google’s Advanced Search page (http://www.google.com/advanced_search), shown in Figure 1-1, provides more options, such as date search and filtering, with “fill in the blank” searching options for those who don’t take naturally to memorizing special syntax.
Figure 1-1: Google’s Advanced Search page
Most of the options presented on this page are self-explanatory, but we’ll take a quick look at the kinds of searches that would be more difficult using the single-text-field interface of a simple search.
Because Google uses Boolean AND by default, it’s sometimes hard to logically build out the nuances of a particular query. Using the text boxes at the top of the Advanced Search page, you can specify words that must appear—exact phrases or lists of words, at least one of which must appear—and words to be excluded.
Using the Language pull-down menu, you can specify the language all returned pages must be in, from Arabic to Turkish.
The File Format option lets you include or exclude several different file formats, including Microsoft Word and Excel. A couple of Adobe formats (most notably PDF) and Rich Text Format are options here, too. This is where the Advanced Search is at its most limited: there are literally dozens of file formats that Google can search for, and this set of options represents only a small subset. To get at the others, use the filetype: special syntax described earlier in “Special Syntax.”
Date allows you to specify search results updated in the last three, six, or twelve months. This date search is much more limited than the daterange: special syntax, which can give you results as narrow as one day, but Google stands behind the results generated using the Date option on the Advanced Search, while not officially sanctioning the use of the
Additional content appearing in this section has been removed.
Quick Links
If you’re a Google regular, you’ve no doubt noticed those snippets of linked information proliferating near the top-left of the first results page (see Figure 1-2). Where once there was only a sponsored link or two between you and your results, now there are spelling suggestions, news headlines, stock quotes, and all manner of other bits and bobs of rather useful information.
Figure 1-2: Quick links augmenting search results with relevant, current, and local information
Google is going beyond web search results to include relevant finds from its other properties and those of third parties. Here, briefly, is the current catalog of quick links:
Spelling
One nice side effect of Google’s listening to the Web is that it picks up a lot of words along the way. Some appear in the dictionary, while others haven’t quite made their way into common parlance. Some are made up, while others are simply misspelled. Query Google for something that is commonly spelled another way, and it’ll proffer some suggestions. “Consult the Dictionary” delves further into the wonders of Google’s spell checker.
Definitions
TLAs (that’s “three-letter acronyms”) and geek speak abound. Rather than smiling knowingly when you’ve not a clue what someone just said, ask Google if it knows what your friend, boss, or medical professional is talking about. Prepend just about any word, obscure or garden-variety, with define (e.g., define happy), and the first item on your results page will in all probability be a definition pulled from one of any number of web dictionaries. Use define: (note the colon—e.g., define:osteichthyes) to pull up a whole page full of definitions [Hack #6].
News Headlines
Additional content appearing in this section has been removed.
Language Tools
In the early days of the Web, it seemed like most web pages were in English. But as more and more countries have come online, materials have become available in a variety of languages—including languages that have not originated from a particular country (such as Esperanto and Klingon).
Google offers several language tools, including one for translation and one for Google’s interface. The interface option is much more extensive than the translation option, but the translation option has a lot to offer.
The language tools are available by clicking the Language Tools link on the front page or by going to http://www.google.com/language_tools.
The first tool allows you to search for materials from a certain country and/or in a certain language. This is an excellent way to narrow your searches; searching for French pages from Japan gives you far fewer results than searching for French pages from France. You can narrow the search further by searching for a slang word in another language. For example, search for the English slang word bonce on French pages from Japan.
The second tool on this page allows you to translate either a block of text or an entire web page from one language to another. Most of the translations are to or from English.
Machine translation is not nearly as good as human translation, so don’t rely on this translation as either the basis of a search or as a completely accurate translation of the page you’re looking at. Instead, use it to get the gist of whatever it translates.
You don’t have to come to this page to use the translation tools. When you enter a search, you’ll see that some search results that aren’t in your language of choice (which you set via Google’s preferences) have “[Translate this page]” next to their titles. Click on one of these and you’ll be presented with a framed, translated version of the page. The Google frame at the top allows you to view the original version of the page, as well as return to the results or view a copy suitable for printing.
Additional content appearing in this section has been removed.
Anatomy of a Search Result
You’d think a list of search results would be pretty straightforward, wouldn’t you—just a page title and a link, possibly a summary? Not so with Google. Google encompasses so many search properties and has so much data at its disposal that it fills every results page to the rafters. Within a typical search result, you can find sponsored links, ads, links to stock quotes, page sizes, spelling suggestions, and more.
By knowing more of the nitty-gritty details of what’s what in a search result, you’ll be able to make some guesses (“Wow, this page that links to my page is very large; perhaps it’s a link list”) and correct roadblocks (“I can’t find my search term on this page; I’ll check the version Google has cached”).
Let’s use the word “flowers” to examine this anatomy. Figure 1-3 shows the result page for flowers.
Figure 1-3: Results page for “flowers”
First, note that at the top of the page a selection of tabs allows you to repeat your search across other Google search categories besides web pages, including Google Images, Google Groups, Google News, Froogle, and Google Maps. Beneath that is a count of the number of results and how long the search took: about 524,000,000 results in 0.14 seconds (this will vary, sometimes by quite a bit).
Sometimes results/sites are called out on colored backgrounds at the top or right of the results page (see Figure 1-3). These are called sponsored links (read: advertisements). Google has a policy of very clearly distinguishing ads and sticking to text-based advertising only rather than throwing flashing banners in your face like other sites do.
You might also see Quick Links for some queries that Google thinks it has a direct answer for, but for the most part you’ll see a list of 10 results. The first real (i.e., nonsponsored) result of the search for flowers is shown in Figure 1-4.
Additional content appearing in this section has been removed.
Setting Preferences
Google’s Preferences page, shown in Figure 1-5, provides a nice, easy way to set and save your search preferences.
Figure 1-5: Google’s Preferences page
You can set your Interface Language, the language in which tips and messages are displayed.
Not to be confused with Interface Language, Search Language restricts the languages that are considered when searching Google’s page index. The default is any language, but you could be interested only in web pages written in Chinese and Japanese, or French, German, and Spanish—the combination is up to you.
Google’s SafeSearch filtering affords you a method of avoiding search results that may offend your sensibilities. No filtering means you’re offered anything in the Google index. Moderate filtering rules out explicit images, but not explicit language. Strict filtering filters both text and images. The default is moderate filtering.
By default, Google displays 10 results per page. For more results, click any of the Result Page: 1 2 3... links at the bottom of each result page, or simply click the Next link.
You can specify your preferred number of results per page (10, 20, 30, 50, or 100), along with whether you want results to open in the current window or a new browser window.
You can choose to open search results in a new browser window—handy for keeping your search results in place. If you’ve ever clicked from site to site only to find you’ve completely lost the page of results you’d like to return to, try enabling this option.
For the purpose of research, it’s best to have as many search results as possible on the page. Because it’s all text, it doesn’t take that much longer to load 100 results than it does to load 10. If you have a computer with a decent amount of memory, it’s also good to have search results open in a new window, which will keep you from losing your place and leave you a window with all the search results readily available.
Additional content appearing in this section has been removed.
Understanding Google URLs
If you’re like most people, you usually pay little attention to the URLs in your browser’s address bar as you surf from one site to the next. And you might choose to stick with this habit while searching Google. You ought to know, however, that a subtle alteration made to the URL that Google returns after a search can be an efficient method of tweaking your result set. In fact, there’s at least one thing you can do by fiddling with (we like to call it hacking) the URL that you can do no other way, and there are quick tricks that might save you a trip back to the Advanced Search page.
Say you want to search for three blind mice. The URL of the page of results will vary depending on the preferences you’ve set, but it will look something like this:
http://www.google.com/search?num=100&hl=en&q=%22three+blind+mice%22
The query itself—q=%22three+blind+mice%22, %22 being a URL-encoded " (double quote)—is pretty obvious, but let’s break down what those extra bits mean.
num=100 refers to the number of search results per page—100 in this case. Google accepts any number from 1 to 100. Altering the value of num is a nice shortcut to altering the preferred size of your result set without having to meander over to the Advanced Search page and rerun your search.
Don’t see the num= in your URL? Simply append it by clicking at the end of the URL in your browser’s address bar and typing it in. To set the number of results per page to 20, for instance, add &num=20.
You can add or alter any of the modifiers described here by appending them to the URL or changing their values—the part after the = (equals)—to something within the accepted range for the modifier in question. If you’re adding a modifier, you must use an & (ampersand) too. Look at how the modifiers are joined together on URLs for other search results to see how it’s done.
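The same edit can be scripted. The Python sketch below uses the standard library’s urllib.parse to add or change a modifier on the example URL; set_modifier is a hypothetical helper, and the 1 to 100 range check simply mirrors the limits given above for num.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def set_modifier(url, name, value):
    """Return url with the query modifier `name` added or changed."""
    if name == "num" and not 1 <= int(value) <= 100:
        raise ValueError("num must be between 1 and 100")
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))  # decodes %22 and + for us
    params[name] = str(value)              # alter in place, or append if new
    return urlunsplit(parts._replace(query=urlencode(params)))

url = "http://www.google.com/search?num=100&hl=en&q=%22three+blind+mice%22"
print(set_modifier(url, "num", 20))

# A URL with no num= gets the modifier appended, ampersand included
print(set_modifier("http://www.google.com/search?hl=en&q=mice", "num", 20))
```

Letting urlencode rebuild the query string takes care of the ampersands and of re-encoding the double quotes, which is the fiddly part when editing the address bar by hand.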
hl=en refers to the language interface (the language in which you use Google, reflected in the home page, messages, and buttons). Here, it’s in English. Google’s Language Tools [“Language Tools” earlier in this chapter] page provides a list of language choices. Run your mouse over each language choice and notice the change reflected in the URL. The URL for Pig Latin looks like this:
Additional content appearing in this section has been removed.
Browse the Google Directory
Google has a searchable subject index in addition to its Web Search.
Google’s Web Search indexes billions of pages, which means it isn’t suitable for all searches. When you have a search that you can’t narrow down—for example, if you’re looking for information on a person about whom you know nothing—billions of pages will get very frustrating very quickly.
But you don’t have to limit your searches to the Web. Google also has a searchable subject index, the Google Directory, at http://directory.google.com. Instead of indexing the entirety of billions of pages, the directory describes sites instead, indexing about five million URLs. This makes it a much better search for general topics.
Does Google spend time building a searchable subject index in addition to a full-text index? No, Google bases its directory on the Open Directory Project data at http://dmoz.org/. Unlike the results at the standard Google Web Search, the collection of URLs at the Open Directory Project is gathered and maintained by a group of human volunteers rather than automatic algorithms, but Google does add some of its own Googlish magic to it.
As you can see in Figure 1-6, the front of the site is organized into several topics. To find what you’re looking for, you can either do a keyword search or drill down through the hierarchies of subjects.
Figure 1-6: The Google Directory
Beside most listings, as shown in Figure 1-7, you’ll see a green bar. The green bar is an approximate indicator of the site’s PageRank in the Google search engine. (Not every listing in the Google Directory has a corresponding PageRank in the Google web index.) Web sites are listed in the default order of Google PageRank, but you also have the option to list them in alphabetical order.
Figure 1-7: Individual listings under Science > Physics > Quantum Mechanics > People > Feynman, Richard
Additional content appearing in this section has been removed.
Glean a Snapshot of Google in Time
Google Zeitgeist provides a weekly, monthly, and yearly overview of what the Web was interested in.
Turning to Google itself for a definition of zeitgeist (define:zeitgeist), there’s consensus that it refers to “the spirit of the times.” And Google Zeitgeist (http://www.google.com/press/zeitgeist.html) is just that: a mirror that the Web (according to Google) holds up to us, providing a snapshot of the week, month, or year that was.
A typical weekly Google Zeitgeist, shown in Figure 1-8, lists the top 15 gaining queries.
Figure 1-8: The week’s top 15 gaining queries
It takes only a few moments of visiting Google Zeitgeist before you’re itching to go back a little further in time: the week your second child was born, the month during which the Olympics were held, the year you graduated from high school. Click the Archive link to choose any year from the Google Zeitgeist Archive and display links such as those shown in Figure 1-9 for every week, month, and year since January 2001.
Weekly Zeitgeist updates actually started in June 2001, at the same time the monthlies switched from PDF to HTML format. In August 2005, Google stopped listing declining queries and started listing 5 more of the top gaining queries, bringing the total to 15.
Figure 1-9: The Zeitgeist Archive pages, displaying weekly, monthly, and year-end reports dating back to 2001
Monthly reports provide some information about Google News queries and Google Image Search queries, and you can find monthly reports for countries around the world by clicking the Zeitgeist Around the World link on the front page. Year-end reports provide even more detail with trend graphs.
While Google Zeitgeist’s statistics aren’t earth-shattering (e.g., searches for
Additional content appearing in this section has been removed.
Visualize Google Results
The TouchGraph Google Browser is the perfect Google complement for those who appreciate visual displays of information.
Some people are born text crawlers. They can retrieve the mostly text resources of the Internet and browse them happily for hours. But others are more visually oriented and find that the flat text results of the Internet leave something to be desired, especially when it comes to search results.
If you’re the type that appreciates visual displays of information, you’re bound to like the TouchGraph Google Browser (http://www.touchgraph.com/TGGoogleBrowser.html). This Java applet allows you to start with pages that are similar to one URL, and then expand outward to pages that are similar to the first set of pages, on and on, until you have a giant map of nodes (a.k.a. URLs) on your screen.
The TouchGraph Google Browser was created by Alex Shapiro (http://www.touchgraph.com/).
Note that you’re finding URLs that are similar to another URL, just as you would if you used the related: syntax. You aren’t doing a keyword search, and you’re not using the link: syntax. You’re searching by Google’s measure of similarity.
Start your journey by entering a URL on the TouchGraph home page and clicking the Graph It link. Your browser will launch the TouchGraph Java applet, covering your window with a large mass of linked nodes, as shown in Figure 1-10.
Figure 1-10: Mass of linked nodes generated by TouchGraph
You’ll need a web browser capable of running Java applets. If Java support in your preferred browser comes in the form of a plug-in, your browser should have the smarts to launch a plug-in locator/downloader and walk you through the installation process.
If you’re easily entertained like me, you might amuse yourself for a while just by clicking and dragging the nodes around. But there’s more to do than that.
Additional content appearing in this section has been removed.
Check Your Spelling
Google sometimes takes the liberty of “correcting” what it perceives to be a spelling error in your query.
Most of us couldn’t communicate with the outside world without a spellchecker. As you send off an email or put the finishing touches on a document, a trusty spellchecker makes sure you haven’t made any blatant errors. Google also has a built-in spellchecker, and when Google thinks it can spell individual words or complete phrases in your search query better than you can, it suggests a “better” search, hyperlinking it directly to a query.
For example, if you search for hydrecefallus, Google will ask if you meant hydrocephalus, as shown in Figure 1-13.
Figure 1-13: Offering spelling suggestions when Google thinks it knows better
Suggestions aside, Google assumes that you know of what you speak and returns your requested results, provided your query gleaned results.
If your query found no results for the spellings you provided and Google believes it knows better, it will automatically run a new search of its own suggestions. Thus, a search for hydrecefallus finding (hopefully) no results sparks a Google-initiated search for hydrocephalus.
Mind you, Google does not arbitrarily come up with its suggestions, but builds them based on its own database of words and phrases found while indexing the Web. If you search for nonsense like kweghgjdlsggaa, you’ll get no results and be offered no suggestions.
This is a lovely side effect and a quick and easy way to check the relative frequency of spellings. Query for a particular spelling, and note the number of results. Then click on Google’s suggested spelling and note the number of results. It’s surprising how close the counts are sometimes, indicating an oft-misspelled word or phrase.
If you find yourself turning to Google to compare spellings, you might want to automate the process of comparing phrases [Hack #26].
Additional content appearing in this section has been removed.
Google Phonebook: Let Google’s Fingers Do the Walking
Google makes an excellent phonebook, even to the extent of doing reverse lookups.
Google combines residential and business phone number information and its own excellent interface to offer a phonebook lookup that provides listings for businesses and residences in the United States. The lookup has its quirks, however: the search offers three different syntaxes, different levels of information yield different results, the syntaxes are finicky, and Google doesn’t provide documentation.
Google offers three ways to search its phonebook:
phonebook:
    Searches the entire Google phonebook
rphonebook:
    Searches residential listings only
bphonebook:
    Searches business listings only
The result page for phonebook: lookups lists only five results for both residential and business numbers. The more specific rphonebook: and bphonebook: searches provide up to 30 results per page. For a better chance of finding what you’re looking for, use the appropriate targeted lookup.
Using a standard phonebook requires knowing quite a bit of information about what you’re looking for: first name, last name, city, and state. Google’s phonebook requires no more than last name and state to get started. Casting a wide net for all the Smiths in California is as simple as:
phonebook:smith ca
Try giving 411 a whirl with that request! Figure 1-14 shows the results of the query.
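Since the three syntaxes differ only in their prefix, a small helper can keep them straight. This Python sketch just assembles the query string; phonebook_query and its kind labels are hypothetical names, and the function does not contact Google.

```python
# Map a plain-English listing type to the syntax keyword described above
PHONEBOOK_SYNTAXES = {
    "all": "phonebook",
    "residential": "rphonebook",
    "business": "bphonebook",
}

def phonebook_query(name, state, kind="all"):
    """Build a phonebook lookup from a last (or business) name and a state."""
    return f"{PHONEBOOK_SYNTAXES[kind]}:{name} {state}"

print(phonebook_query("smith", "ca"))
print(phonebook_query("smith", "ca", kind="residential"))
```

Choosing the residential or business form up front matches the earlier advice to use the targeted lookups, which return more results per page.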
Look Up Definitions
Do you find yourself smiling knowingly when your boss mentions that well-known business principle you’ve never heard of? Overwhelmed with “geek speak”? Chances are Google’s heard it mentioned—and possibly even defined—somewhere before.
Most specialized vocabularies remain, for the most part, fairly static; words don’t suddenly change their meaning all that often. Not so with technical and computer-related jargon. It seems like every 12 seconds someone comes up with a new buzzword or term relating to computers or the Internet, and then 12 minutes later it becomes obsolete or means something completely different—often more than one thing at a time. Maybe it’s not that bad. It just feels that way.
Google can help you in two ways: by helping you look up words and by helping you figure out what words you don’t know but need to know.
Before you assume you’re going to be in for a lot of Googling, try the define search syntax mentioned in the “Quick Links” section earlier in this chapter. Simply prefix the word or phrase you want defined with the special syntax keyword define, like so:
define google juice
define julienne
define 42
Google tells you that these are defined as “power of a website to turn up in Google,” “cut food into thin sticks,” and “being two more than forty,” thanks to Wikipedia, Low Carb Luxury, and WordNet at Princeton, respectively.
Click the associated “Definition in context” link to visit the page from which the definition was drawn.
Click the “Web definitions for...” link or prefix the word you’re defining with define: (note the addition of a colon) in the first place, and you’ll net a full page of definitions drawn from all manner of places. For instance, define:TLA turns up oodles of definitions (all about the same, mind you), as shown in Figure 1-15.
Figure 1-15: A page chock-full of definitions for TLA
Find Directories of Information
Use Google to find directories, link lists, and other collections of information.
Sometimes you’re more interested in large information collections than scouring for specific bits and bobs. You could always take a stroll through the Google Directory (http://directory.google.com) to see what’s available, but sometimes a topic-specific directory is what you need.
Using Google, there are a couple of different ways to find directories, link lists, and other information collections from across the Web. The first uses Google’s full-word wildcards [“Full-Word Wildcards” earlier in this chapter] and the intitle: syntax [“Special Syntax” earlier in this chapter]. The second is a judicious use of particular keywords.
Pick something you’d like to find collections of information about. We’ll use “trees” as our example. The first thing we look for is any page with the words “directory” and “trees” in its title. In fact, we build in a little buffering for words that might appear between the two using a couple of full-word wildcards (* characters). The resultant query looks something like this:
intitle:"directory * * trees"
This query finds “directories of evergreen trees,” “South African trees,” and of course “directories containing simply trees.”
What if you want to take things up a notch, taxonomically speaking, and find directories of botanical information? Use a combination of intitle: and keywords, like so:
botany intitle:"directory of"
and you get almost 10,000 results. Changing the tenor of the information might be a matter of restricting results to those coming from academic institutions. Appending an edu site specification brings you to:
botany intitle:"directory of" site:edu
This gets you around 150 results, a mixture of resource directories, and, unsurprisingly, directories of university professors.
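The query patterns above are regular enough to generate. This is a small Python sketch, with hypothetical helper names, that composes both styles of directory query for any topic; it only builds the query strings and does not contact Google.

```python
def title_directory_query(topic, wildcards=2):
    """Compose a query like intitle:"directory * * trees" (hypothetical helper).

    Each * stands in for one whole word between "directory" and the topic.
    """
    parts = ["directory"] + ["*"] * wildcards + [topic]
    return 'intitle:"' + " ".join(parts) + '"'


def keyword_directory_query(keyword, site=None):
    """Compose a query like: botany intitle:"directory of" site:edu"""
    query = f'{keyword} intitle:"directory of"'
    if site:
        query += f" site:{site}"
    return query


print(title_directory_query("trees"))
print(keyword_directory_query("botany"))
print(keyword_directory_query("botany", site="edu"))
```

Varying the wildcards argument widens or narrows the gap allowed between the two title words.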
Mixing these syntaxes works rather well when searching for something that might also be an offline print resource. For example:
Cover Your Bases
Try all possible combinations of your search keywords at once, and find related keywords with Google Sets.
Imagine you have a set of query words but are not sure that they’re the right set; you certainly don’t want to miss any results by picking the wrong combination of keywords, including or excluding the wrong word. But the thought of typing a dozen-plus permutations of keywords has your carpal tunnel flaring up in horror. With some existing tools, you can fine-tune your Google queries by playing with word sets—leading you down paths you might not have discovered.
Search Grid (http://blog.outer-court.com/search-grid), by German programmer Philipp Lenssen, lets you explore a wide range of Google search results by automatically searching for multiple combinations of keywords you specify. This gives you a quick overview of paths you can follow for a given set of keywords. You might, for example, put catsup, mustard, and pickles on the x-axis and relish, onions, and tomatoes on the y-axis, as shown in Figure 1-16.
Figure 1-16: Search Grid populated with keywords to combine
Search Grid combines the results—relish catsup, relish mustard, relish pickles, onions catsup, onions mustard, onions pickles, etc.—and provides you with the first result of each possible combination, shown in Figure 1-17.
Figure 1-17: The first of several different searches, all in one grid
Note that you get nothing but the first result; this is not the tool to use if you want an in-depth search of each query. Instead, it’s meant to give you a bird’s-eye view of how the different combinations of search words impact the query.
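The combining step is easy to picture in code. The sketch below is not Search Grid's own implementation; it simply pairs every row keyword with every column keyword, in Python, to produce the list of queries such a grid would run. The function name is made up for illustration.

```python
from itertools import product


def grid_queries(x_axis, y_axis):
    """Map each (row, column) cell to its combined query, e.g. 'relish catsup'."""
    return {(y, x): f"{y} {x}" for y, x in product(y_axis, x_axis)}


x_axis = ["catsup", "mustard", "pickles"]
y_axis = ["relish", "onions", "tomatoes"]

for (row, column), query in grid_queries(x_axis, y_axis).items():
    print(f"{row:>8} x {column:<8} -> {query}")
```

Three keywords on each axis yield nine queries, which is why a grid tool saves so much typing.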
There’s also a version of Search Grid that’s been integrated into a web tool called FindForward (http://www.findforward.com/?t=grid), which gives you screenshots of some Google search results. FindForward requires less typing: enter two to five words for which you want to check possible permutations. You get a large grid of search results, with screenshots available for some of the pages, as shown in Figure 1-18.
Hack Your Own Google Search Form
Build your own personal, task-specific Google search form.
If you want to do a simple search with Google, you need only the standard Simple Search form (the Google home page). But if you want to craft specific Google searches to use on a regular basis or provide for others, you can simply put together your own personalized search form.
Start with a garden-variety Google search form; something like this will do nicely:
<!-- Search Google -->
<form method="get" action="http://www.google.com/search">
<input type="text" name="q" size="31" maxlength="255" value="">
<input type="submit" name="sa" value="Search Google">
</form>
<!-- Search Google -->
This is a very simple search form. It takes your query and sends it directly to Google, adding nothing to it. But you can embed some variables to alter your search as needed. You can do this in two ways: via hidden variables or by adding more input to your form.
As long as you know how to identify a search option in Google, you can add it to your search form via a hidden variable. The fact that it’s hidden just means that form users can’t alter it. They can’t even see it unless they look at the source code. Let’s look at a few examples.
While it’s perfectly legal HTML to put your hidden variables anywhere between the opening and closing <form> tags, it’s rather tidy and useful to keep them together after all the visible form fields.
File Type
As the name suggests, File Type specifies that your results are filtered by a particular file type (e.g., Word .doc, Adobe .pdf, PowerPoint .ppt, plain text .txt). Add a PowerPoint file type filter, for example, to your search form, like so:
<input type="hidden" name="as_filetype" value="PPT">
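A hidden field ends up as just another name/value pair in the URL the form requests. As a rough illustration, this Python sketch reconstructs the URL a GET form would produce from the visible q field plus any hidden fields. The helper name is hypothetical, and the sa value sent by the named submit button is left out for brevity.

```python
from urllib.parse import urlencode


def form_submission_url(query, hidden=None, action="http://www.google.com/search"):
    """Return the URL a GET form would request (hypothetical helper).

    query  -- what the user types into the visible "q" field
    hidden -- dict of hidden field names and values, e.g. {"as_filetype": "PPT"}
    """
    fields = {"q": query}
    fields.update(hidden or {})
    return action + "?" + urlencode(fields)


print(form_submission_url("quarterly results"))
print(form_submission_url("quarterly results", {"as_filetype": "PPT"}))
```

Seeing the pairs side by side in the URL makes it easier to check that a hidden field is being sent the way you intended.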
Site Search
Narrows your search to specific sites. While a suffix such as
Compare Google and Yahoo! Search Results
Pit Google and Yahoo! against each other and find more search results in the process.
If you’ve ever searched for the same phrase at both Google and Yahoo!, you’ve probably noticed that the results can be surprisingly different. That’s because Google and Yahoo! have different ways of determining which sites are relevant for a particular phrase. Though both companies keep their exact ranking methods secret (to thwart people who would take advantage of them), both Yahoo! and Google provide some clues about what goes into their ranking systems.
At the heart of Google’s ranking system is a proprietary method it calls PageRank, and Google doesn’t give detailed information about it. But Google does say this:
Google’s order of results is automatically determined by more than 100 factors, including our PageRank algorithm.
Here’s the official word from Yahoo!:
Yahoo! Search ranks results according to their relevance to a particular query by analyzing the web page text, title, and description accuracy as well as its source, associated links, and other unique document characteristics.
Though we might never know exactly why results are different between the two search engines, at least we can have some fun spotting the differences—and end up with more search results than either one of the sites would have offered on their own.
One way to compare results is to simply open each site in separate browser windows and manually scan for differences. If you search for your favorite dog breed—say, "australian shepherd"—you’ll find that the top few sites are the same across both Yahoo! and Google, but the two search engines quickly diverge into different results. At the time of this writing, both sites estimate exactly 1,030,000 total results for this particular query, but estimated result counts might be a way to spot differences between the sites.
Viewing both sets of results in different windows is a bit tedious, and a clever Norwegian developer named Asgeir S. Nilsen has made the task easier, at a site called Twingine.
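If you already have the two result lists in hand, spotting the differences comes down to a few set operations. The Python sketch below assumes you have collected the URLs yourself; the function name and the sample URLs are made up for illustration, and fetching results from either engine is outside its scope.

```python
def compare_results(google_urls, yahoo_urls):
    """Split two ranked URL lists into shared and engine-specific results.

    Rank order within each list is preserved.
    """
    google_set, yahoo_set = set(google_urls), set(yahoo_urls)
    return {
        "both": [url for url in google_urls if url in yahoo_set],
        "google_only": [url for url in google_urls if url not in yahoo_set],
        "yahoo_only": [url for url in yahoo_urls if url not in google_set],
    }


# Placeholder data, not real search results.
google = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
yahoo = ["http://example.com/a", "http://example.com/d", "http://example.com/b"]

for group, urls in compare_results(google, yahoo).items():
    print(group, urls)
```

The union of the three groups is the larger combined result set that neither engine would have offered on its own.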
Cover Your Tracks
By understanding how your browser stores information related to your Google searches, you can be sure that your searches are your own.
Most of us think of our Google searches as something private, an exchange between one individual and Google. But if you share a computer with others, your searches might not be as private as you think. Whether you’re searching for a surprise birthday gift, a private medical concern, legal advice, or “researching” some risqué topic, there are times when your browser’s memory can come back to haunt you.
By default—in an effort to help your memory—your computer remembers your past Google searches and stores them so you can access them later. There are several ways your computer accomplishes this, and you should be aware of each of them if you want to cover your tracks completely.
The first and most obvious place that your browser stores your past searches is in your browser history. You can quickly view your current browser history in Firefox or Internet Explorer by pressing Ctrl-H (Command-Shift-H on a Mac). A new pane will open that includes all of the sites you’ve visited recently, along with the specific pages at those sites, as shown in Figure 1-23.
Figure 1-23: Browser history pane in Firefox
From the pane on the left, you can easily revisit sites. Open the google.com folder to see recent searches, and note that searches at other Google properties, such as Google Images, are stored in their own folders (images.google.com, for example). If you see a search you’d rather not share with others, you can simply highlight that particular entry, right-click, and click Delete on the menu.
Also be aware that your browser history is exposed through your address bar. As you start typing a URL into the address bar, the browser tries to guess where you want to go by offering matching URLs in your search history. If you start typing
Improve Google’s Memory
With a feature called Search History, Google stores the searches you’ve made and the links you’ve followed so you can go back to them in the future.
Google is an impressive organizer of information, but it’s not very personable. Google is very much the same for me as it is for you. In fact, if you search Google with the word personable, you’ll see the same results I do. However, Google is working on technology that will tailor its search results to you as an individual. One step in that direction is the Search History (a.k.a. Personalized Search) feature, in beta testing at the time of this writing.
You’ve probably already experienced how Google’s memory can help you recall a search you did in the past. As you type letters into the main Google Search form, your browser tries to complete your thought, recalling past searches. This limited form of memory [Hack #11] can be handy, but it’s not terribly accurate or organized. For one, you can’t tell your browser which searches were successful and which weren’t. You can’t highlight favorite results or organize them in any way.
If you turn on Google’s memory through the Search History feature, you can let Google do the work of remembering how you use the site. In addition, you have access to your search history, no matter how you access the Web, because your history is stored at Google instead of your local computer.
The best way to get to know how Search History works is to try it out. You need a Google Account to use Search History; if you have a Gmail Account, you’re ready to go. If you don’t have a Google Account yet, browse to https://www.google.com/accounts/NewAccount and sign up. Google offers the option to disable Personalized Search when you create an account, as shown in Figure 1-28, but if you’re there to try out Search History, you need to leave this unchecked.
Figure 1-28: Google Account sign-up page
Find Out What Google Thinks ___ Is
What does Google think of you, your friends, your neighborhood, or your favorite movie?
If you’ve ever wondered what people think of your hometown, your favorite band, your favorite snack food, or even you, Googlism (http://www.googlism.com) may provide you with something useful.
The interface is dirt simple. Enter your query and check the appropriate radio button to specify whether you’re looking for a who, a what, a where, or a when. Figure 1-31 shows a representative results page for Sherlock Holmes, famous fictional detective. You can also use the tabs to see what other objects people are searching for and what searches are the most popular.
Figure 1-31: Googlism results for Sherlock Holmes
Some of the results you find are not safe for work.
Googlism responds with a list of things Google believes about the query at hand, be it a person, place, thing, or moment in time. For example, a search for Perl and “What” returns, along with a laundry list of others:
Perl is y2k compliant
Perl is not my favourite programming language
Perl is the coder's language of choice
Perl is the language of love
These are among the more humorous results for Steve Jobs and “Who”:
steve jobs is my new idol
steve jobs is at it again
steve jobs is trying to kill me
To figure out what page any particular statement comes from, simply copy and paste it into a plain old Google search, with the complete phrase in quotes. That last statement, for instance, came from a 2002 blog post about iMacs at http://www.fismo.com/KeepUp/fog0000000025.html.
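Googlism’s own code isn’t shown here. As a rough illustration of the idea, this Python sketch collects “X is ...” statements from a few made-up text snippets with a regular expression, then wraps one in quotes to form the phrase search described above. The function names and snippets are assumptions for the sake of the example.

```python
import re


def is_statements(subject, snippets):
    """Collect '<subject> is ...' statements from snippets, de-duplicated in order."""
    pattern = re.compile(rf"\b{re.escape(subject)} is [^.!?\n]+", re.IGNORECASE)
    seen, found = set(), []
    for snippet in snippets:
        for match in pattern.findall(snippet):
            statement = match.strip().lower()
            if statement not in seen:
                seen.add(statement)
                found.append(statement)
    return found


def phrase_query(statement):
    """Quote a statement so Google treats it as an exact phrase."""
    return f'"{statement}"'


# Made-up snippets, standing in for text gathered from search results.
snippets = [
    "I think Perl is the language of love. Honestly.",
    "Everyone knows perl is y2k compliant!",
    "PERL IS the language of love",
]

for statement in is_statements("perl", snippets):
    print(statement, "->", phrase_query(statement))
```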
For the most part, this is a party hack—a good party hack. It’s a fun way to aggregate related statements into a silly (and occasionally profound) list.
Browse the World Wide Photo Album
Take a random stroll through the world’s photo album using some clever Google Image searches (and, optionally, a smidge of programming know-how).
The proliferation of digital cameras and the growing popularity of camera phones are turning the Web into a worldwide photo album. It’s not only the holiday snaps of your Aunt Minnie or the minutiae of your moblogging friend’s day that are available to you. You can actually take a stroll through the publicly accessible albums of perfect strangers if you know where to look. Happily, Google has copies, and a couple of hacks know just where to look.
Digital photo files have relatively standard filenames (e.g., DSC01018.JPG) by default and are usually uploaded to the Web without being renamed. The Random Personal Picture Finder (http://www.diddly.com/random) sports a clever little snippet of JavaScript code that simply generates one of these filenames at random and queries Google Images for it.
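The site does this in JavaScript. As a minimal Python sketch of the same idea, the code below generates a filename in the DSC01018.JPG style and builds an image search URL for it. The prefix, the five-digit width, and the images.google.com URL are assumptions for illustration; real cameras use a variety of naming schemes.

```python
import random
from urllib.parse import urlencode


def random_photo_filename(rng=random, prefix="DSC", digits=5, extension="JPG"):
    """Return a camera-style filename such as DSC01018.JPG (pattern assumed)."""
    number = rng.randint(1, 10**digits - 1)
    return f"{prefix}{number:0{digits}d}.{extension}"


def image_search_url(filename):
    """Build an image search URL for the filename (URL format is an assumption)."""
    return "http://images.google.com/images?" + urlencode({"q": filename})


filename = random_photo_filename()
print(filename)
print(image_search_url(filename))
```

Passing a seeded random.Random instance as rng makes the sequence of filenames repeatable, which is handy when you want to revisit a particular search.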
The result, shown in Figure 1-32, is something like looking through the world’s photo album: people eating, working, posing, and snapping photos of their cats, furniture, or toes. And since it’s a normal Google Images search, you can click on any photo to see the story behind it, and the other photos nearby.
Neat, huh?
Figure 1-32: The Random Personal Picture Finder
Note that people snap pictures of not just their toes (or the toes of others). While an informal series of Shift-Reloads in my browser turned up only a couple of questionable bits of photographic work, you should assume the results are not workplace- or child-safe.
The code behind the scenes, as I mentione