Despite all that the Google API does, there are folks (myself included) who wish it would do more. Then there are those who started building programmatic access to Google long before the API became available. This survey covers a few of them.
We present them here for two reasons: to give you an idea of what software to avoid if you are concerned about being banned from Google, and to inspire you. This software wasn’t written because someone was sitting around trying to violate Google’s TOS; it was written because someone simply wanted to get something done. These tools are creative and pragmatic and well worth a look.
Google’s Terms of Service (TOS) prohibit automated querying of the database except in conjunction with the Google API. Automatic searching for whatever reason is a big no-no. Google can react to this very strongly; in fact, they have temporarily banned whole IP address blocks based on the actions of a few, so be careful what you use to query Google.
Here is a list of tools to avoid, unless you don’t mind getting yourself banned:
- Gnews2RSS (http://www.voidstar.com/gnews2rss.php?q=news&num=15)
Turns a Google News search into a form suitable for syndication.
- WebPosition Gold (http://www.webposition.com/)
Performs a range of tasks for web wranglers, including designing more search engine-friendly pages, supporting automated URL submissions, and analyzing search engine traffic to a site. Unfortunately, its automated rank-checking feature violates Google’s Terms of Service. This program does so many things, however, that you could consider using it for its other, non-position-checking tasks alone.
- AgentWebRanking (http://www.agentwebranking.com/)
Checks your web page’s ranking with dozens of major search engines all over the world. That search engine list also includes Google, though the program violates Google’s Terms of Service by going around the Google API.
When reviewing search engine tools, keep an eye out for those that:
- Offer automated search and retrieval of special collections not covered by the Google API, such as Google News and Google Catalogs
- Frame, metasearch, or otherwise use Google’s content without apparent agreement or partnership with Google