Meandering Your Google Neighborhood

Google Neighborhood attempts to detangle the Web by building a “neighborhood” of sites around a URL.

It’s called the World Wide Web, not the World Wide Straight Line. Sites link to other sites, building a “web” of sites. And what a tangled web we weave.

Google Neighborhood attempts to detangle some small portion of the Web in three steps: it uses the Google API to find sites related to a URL you provide, scrapes the links on the sites returned, and builds a “neighborhood” of sites that link to both the original URL and each other.
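To make the last step concrete, here is a minimal sketch of the filtering idea, written for Python 3. It is not taken from neighbor.py: the function name build_neighborhood, the example hostnames, and the in-memory dictionary of outbound links are all hypothetical, and it assumes that "link both the original URL and each other" means a site must link to the seed URL and to at least one other site that does the same.

```python
def build_neighborhood(seed, related, outbound_links):
    """Return {site: [peers]} for related sites that link to the seed
    and to at least one other site that also links to the seed.

    seed           -- the URL the user supplied (hypothetical value below)
    related        -- sites reported as related to the seed
    outbound_links -- dict mapping each site to the set of URLs it links to
    """
    # Step 1: keep only the related sites that link to the seed URL.
    candidates = {
        site for site in related
        if seed in outbound_links.get(site, set())
    }

    # Step 2: of those, keep the ones that also link to another candidate.
    neighborhood = {}
    for site in candidates:
        peers = (outbound_links.get(site, set()) & candidates) - {site}
        if peers:
            neighborhood[site] = sorted(peers)
    return neighborhood


# Made-up link data standing in for the scraped pages.
links = {
    "a.example": {"seed.example", "b.example"},
    "b.example": {"seed.example", "a.example", "c.example"},
    "c.example": {"b.example"},     # never links to the seed
    "d.example": {"seed.example"},  # links to the seed, but to no peers
}
result = build_neighborhood(
    "seed.example",
    ["a.example", "b.example", "c.example", "d.example"],
    links,
)
```

With this data, only a.example and b.example survive both filters. The real script may weigh or rank sites differently; this sketch only shows the set logic.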

If you’d like to give this hack a whirl without having to run it yourself, there’s a live version available at http://diveintomark.org/archives/2002/06/04.html#who_are_the_people_in_your_neighborhood. The source code (included below) for Google Neighborhood is available for download from http://diveintomark.org/projects/misc/neighbor.py.txt.

The Code

Google Neighborhood is written in the Python (http://www.python.org) programming language. Your system will need to have Python installed for you to run this hack.

    """
    neighbor.cgi
    Blogroll finder and aggregator
    """

    __author__ = "Mark Pilgrim (f8dy@diveintomark.org)"
    __copyright__ = "Copyright 2002, Mark Pilgrim"
    __license__ = "Python"

    try:
        import timeoutsocket # http://www.timo-tasi.org/python/timeoutsocket.py
        timeoutsocket.setDefaultSocketTimeout(10)
    except:
        pass

    import urllib, urlparse, os, time, operator, sys, pickle, re, cgi, time
    from sgmllib import SGMLParser
    ...
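The listing above is written for Python 2 and is cut off before its link-scraping code. The sgmllib and urlparse modules it imports no longer exist in Python 3. As a rough illustration of the "scraping the links" step on a current interpreter, the following sketch uses the standard library's html.parser and urllib.parse instead. The class name LinkExtractor and the helper extract_links are hypothetical and are not the author's code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs linked from an HTML string, in page order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# A made-up page; fetching real pages over the network is left out here.
page = '<a href="/about">About</a> <a href="http://other.example/">Other</a> <a name="top">'
found = extract_links(page, "http://site.example/index.html")
```

Anchors without an href, such as the named anchor in the sample page, are skipped. Fetching the pages and applying a timeout, which the original handles with timeoutsocket, would need to be added around this.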
