How to Run the Hacks

The programmatic hacks in this book run either on the command line (that’s Terminal for Mac OS X folk, DOS command window for Windows users) or as CGI scripts—dynamic pages living on your web site, accessed through your web browser.

Command-Line Scripts

Running a hack on the command line invariably involves the following steps:

  1. Type the program into a garden-variety text editor: Notepad on Windows, TextEdit on Mac OS X, vi or Emacs on Unix/Linux, or anything else of the sort. Save the file as directed—usually as scriptname.pl (the pl bit stands for Perl, the predominant programming language used in Google Hacks).

    Alternatively, you can download the code for all of the hacks as a ZIP archive from http://www.oreilly.com/catalog/googlehks2; it is filled with individual scripts already saved as text files.

  2. Get to the command line on your computer or remote server. In Mac OS X, launch the Terminal (Applications → Utilities → Terminal). In Windows, click the Start button, select Run..., type command, and hit the Enter/Return key on your keyboard. In Unix ... well, we’ll just assume you know how to get to the command line.

  3. Navigate to where you saved the script at hand. This varies from operating system to operating system, but usually involves something like cd ~/Desktop (that’s your Desktop on the Mac).

  4. Invoke the script by running the programming language’s interpreter (e.g., Perl) and feeding it the script (e.g., scriptname.pl) like so:

    $ perl scriptname.pl

  5. Most often, you’ll also need to pass along some parameters—your search query, the number of results you’d like, and so forth. Simply drop them in after the script name, enclosing them in quotes if they’re more than one word or if they include an odd character or three:

    $ perl scriptname.pl '"much ado about nothing" script' 10

  6. The results of your script are almost always sent straight back to the command-line window in which you’re working, like so:

    $ perl scriptname.pl '"much ado about nothing" script' 10
    
    1. "Amazon.com: Books: Much Ado About Nothing: Screenplay ..." 
    [http://www.amazon.com/exec/obidos/tg/detail/-/0393311112?v=glance]
    2. "Much Ado About Nothing Script" 
    [http://www.signal42.com/much_ado_about_nothing_script.asp]
    ...

    Tip

    The ellipsis (...) bit signifies that we’ve cut off the output for brevity’s sake.

  7. To stop output scrolling off your screen faster than you can read it, on most systems you can “pipe” (read: redirect) the output to a little program called more:

    $ perl scriptname.pl | more

    Hit the Enter/Return key on your keyboard to scroll through line by line, the space bar to leap through page by page.

    You’ll also sometimes want to direct output to a file for safekeeping, importing into your spreadsheet application, or displaying on your web site. This is just as easy:

    $ perl scriptname.pl > output_filename.txt

    And to pour some input into your script from a file, simply do the opposite:

    $ perl scriptname.pl < input_filename.txt
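
If you’d like to try the three redirections from step 7 before you have a real hack on hand, here is a minimal sketch. It assumes a Unix-style shell, and fake_hack is a made-up shell function standing in for perl scriptname.pl; the scratch directory and the canned results are likewise invented for illustration.

```shell
#!/bin/sh
# fake_hack is a hypothetical stand-in for "perl scriptname.pl".
# Given a parameter, it prints two canned results for it; given none,
# it prints one numbered result per line of its input.
fake_hack() {
    if [ "$#" -gt 0 ]; then
        printf '1. "First result for %s"\n2. "Second result for %s"\n' "$1" "$1"
    else
        n=0
        while IFS= read -r line; do
            n=$((n + 1))
            printf '%d. "Result for %s"\n' "$n" "$line"
        done
    fi
}

workdir=$(mktemp -d)    # scratch directory, assumed to be writable

# > sends the output to a file instead of the screen
fake_hack 'much ado' > "$workdir/output_filename.txt"

# < pours a file into the script as its input
printf 'hamlet\nmacbeth\n' > "$workdir/input_filename.txt"
fake_hack < "$workdir/input_filename.txt"

# | hands the output to another program; head is used here rather than
# more so that the sketch also runs where there is no terminal to page through
fake_hack 'much ado' | head -n 1
```

Swap fake_hack for perl scriptname.pl and the redirections work the same way.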

Don’t worry if you can’t remember all of this; each hack has a “Running the Hack” section, and some even have a “The Results” section that shows you just how it’s done.
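
The quoting advice in step 5 is easier to believe once you’ve seen it in action. The following sketch assumes a Unix-style shell (the Windows command window treats quotes differently), and show_args is a hypothetical helper, not part of any hack; it simply reports what the shell hands over.

```shell
#!/bin/sh
# show_args is a hypothetical helper: it reports how many parameters
# the shell passed along, then prints each one in brackets.
show_args() {
    echo "$# parameter(s)"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Unquoted: the shell splits on spaces, so three words arrive
# as three separate parameters
show_args much ado about

# Quoted: single quotes keep the whole phrase, inner double quotes
# included, as one parameter; 10 arrives as a second
show_args '"much ado about nothing" script' 10
```

The first call reports three parameters; the second reports two, which is what a script expecting a query followed by a result count wants to see.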

CGI Scripts

CGI scripts—programs that run on your web site and produce pages dynamically—are a little more complicated if you’re not used to them. While fundamentally they’re the same sort of scripts as those run on the command line, they are more troublesome because setups vary so widely. You may be running your own server, your web site may be hosted on an Internet service provider’s (ISP) server, your content may live on a corporate intranet server—or anything in between.

Since going through every possibility is beyond the scope of this (or any) book, you should check your ISP’s knowledge base or call their technical support department, or ask your local system administrator for help.

Generally, though, the methodology is the same:

  1. Type the program into a garden-variety text editor: Notepad on Windows, TextEdit on Mac OS X, vi or Emacs on Unix/Linux, or anything else of the sort. Save the file as directed, usually as scriptname.cgi (the cgi bit reveals that you’re dealing with a CGI, or Common Gateway Interface, script).

    Alternatively, you can download the code for all of the hacks as a ZIP archive from http://www.oreilly.com/catalog/googlehks2; it is filled with individual scripts already saved as text files.

  2. Move the script over to wherever your web site lives. You should have some directory on a server somewhere in which all of your web pages (all those .html files) and images (ending in .jpg, .gif, etc.) live. Within this directory, you’ll probably see something called a cgi-bin directory: this is where CGI scripts must usually live in order to be run rather than just displayed in your web browser when you visit them.

  3. You usually need to bless CGI scripts as executable (to be run rather than displayed). Just how you do this depends on the operating system of your server. If you’re on a Unix/Linux or Mac OS X system, this usually entails typing the following on the command line:

    $ chmod 755 scriptname.cgi

  4. Now you should be able to point your web browser at the script and have it run as expected, behaving in a manner similar to that described in the “Running the Hack” section of the hack at hand.

    Just what URL you use once again varies wildly. It should, however, look something like this: http://www.your_domain.com/cgi-bin/scriptname.cgi, where your_domain.com is your web site domain, cgi-bin refers to the directory in which your CGI scripts live, and scriptname.cgi is the script itself.

    If you don’t have a domain and are hosted at an ISP, the URL is more likely to look like this: http://www.your_isp.com/~your_username/cgi-bin/scriptname.cgi, where your_isp.com is your ISP’s domain, ~your_username is your username at the ISP, cgi-bin refers to the directory in which your CGI scripts live, and scriptname.cgi is the script itself.
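
To see what the chmod step above actually does, here is a minimal sketch on a scratch file. It assumes a Unix-style shell; the scratch directory and the tiny shell script standing in for a real hack are placeholders invented for illustration.

```shell
#!/bin/sh
# A scratch stand-in for a real hack: a tiny shell CGI script.
# The directory, file name, and contents are placeholders.
workdir=$(mktemp -d)
script="$workdir/scriptname.cgi"
cat > "$script" <<'EOF'
#!/bin/sh
echo "Content-type: text/plain"
echo
echo "Hello from a CGI script"
EOF

# 7 = read, write, and execute for the owner;
# 5 = read and execute for the group, and again for everyone else
chmod 755 "$script"

# -x asks whether the file is now marked executable
if [ -x "$script" ]; then
    echo "executable"
else
    echo "not executable"
fi

ls -l "$script"     # the mode column should begin with -rwxr-xr-x
```

The 755 setting lets you, the owner, edit the script, while the web server (usually running as some other user) can read and run it but not change it.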

If you come up with something called an “Internal Server Error” or see the error code 500, something’s gone wrong somewhere in the process. At this point you can take a crack at debugging (read: shaking the bugs out) yourself or ask your ISP or system administrator for help. Debugging—especially CGI debugging—can be a little more than the average newbie can bear, but there is help in the form of a famous Frequently Asked Question (FAQ): “The Idiot’s Guide to Solving Perl CGI Problems.” Google for it and step through as directed.
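
Before reaching for the FAQ, a couple of quick checks can rule out two common causes of a 500 error: a script that isn’t marked executable and a script missing the #! first line that names its interpreter. The sketch below is only a starting point, not a full diagnosis; check_cgi is a hypothetical helper, and the scratch file it is tried on is a placeholder.

```shell
#!/bin/sh
# check_cgi is a hypothetical helper that looks for two common causes of
# a 500 error: a script that is not executable, and a missing #! first line.
check_cgi() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: try chmod 755 $script"
        return 1
    fi
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*) ;;
        *)  echo "no #! line naming the interpreter"
            return 1 ;;
    esac
    echo "basic checks passed"
}

# Try it on a scratch file (placeholder name and contents)
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "Content-type: text/plain"\necho\necho "Hello"\n' \
    > "$workdir/scriptname.cgi"

check_cgi "$workdir/scriptname.cgi" || true   # complains: not yet executable
chmod 755 "$workdir/scriptname.cgi"
check_cgi "$workdir/scriptname.cgi"           # now the basic checks pass
```

For Perl scripts, running perl -c scriptname.cgi on the command line checks the syntax without running the script, and your web server’s error log, if you have access to it, usually records the specific complaint.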

Using the Google API

Be sure to consult Chapter 9 for an introduction to the Google API, how to sign up for a developer’s key—you’ll need one for many of the hacks in this book—and the basics of programming Google in a selection of languages to get you going.

Learning to Code

Fancy trying your hand at a spot of programming? O’Reilly’s best-selling Learning Perl (http://www.oreilly.com/catalog/lperl3) by Randal L. Schwartz and Tom Phoenix provides a good start. Apply what you learn to understanding and using the hacks in this book, perhaps even taking on the “Hacking the Hack” sections to tweak and fiddle with the scripts. This is a useful way to get a little programming under your belt if you’re a searching nut, since it’s always a little easier to learn how to program when you have a task to accomplish and existing code to leaf through.
