Digging Deeper into Sites

Dig deeper into the hierarchies of web sites matching your search criteria.

One of Google’s big strengths is that it can find your search term instantly and with great precision. But sometimes you’re not interested so much in one definitive result as in lots of diverse results; maybe you even want some that are a bit more on the obscure side.

One method I’ve found rather useful is to ignore all results shallower than a particular level in a site’s directory hierarchy. You avoid all the clutter of finds on home pages and go for subject matter otherwise often hidden away in the depths of a site’s structure. While content comes and goes, ebbs and flows from a site’s main focus, it tends to gather in more permanent locales, categorized and archived, like with like.

This script asks for a query along with a preferred depth; results shallower than that depth are thrown out. Specify a depth of four and your results will come only from pages at http://example.com/a/b/c/d or deeper, not /a, /a/b, or /a/b/c.
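To make the depth rule concrete, here is a minimal sketch of the filtering step alone, written in Python. It is not the hack's own script. The function names path_depth and filter_by_depth are made up for illustration, and it assumes that depth means the number of non-empty path segments in a URL, with a trailing file name counted as a segment.

```python
from urllib.parse import urlparse


def path_depth(url: str) -> int:
    """Count the non-empty path segments in a URL (assumed definition of depth)."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def filter_by_depth(urls, min_depth):
    """Keep only URLs whose path is at least min_depth segments deep."""
    return [url for url in urls if path_depth(url) >= min_depth]


# Hypothetical result URLs standing in for what a search would return.
results = [
    "http://example.com/a",
    "http://example.com/a/b/",
    "http://example.com/a/b/c",
    "http://example.com/a/b/c/d",
    "http://example.com/a/b/c/d/e.html",
]

deep = filter_by_depth(results, 4)
for url in deep:
    print(url)
```

With a depth of four, only the last two URLs survive; the three shallower ones are tossed. Counting segments rather than raw slashes means a trailing slash, as in /a/b/, does not add to the depth.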

Because you’re already limiting the kinds of results you see, it’s best to use more common words for what you’re looking for. Obscure query terms often turn up no results at all.

Tip

The default number of loops, retrieving 10 items apiece, is set to 50. This is to ensure you glean a decent number of results, because many will be tossed. You can, of course, alter this number, but bear in mind that you’re using that number of your daily quota of 1,000 Google API queries per developer’s ...
