Aggregating and Repackaging Internet Services

We’ve talked about using public search engines as components of a custom search application. Let’s work through two examples of that idea. The first gathers information from a set of technology news sites. The second trolls a set of public LDAP directories.

A Technology News Metasearcher

The Web nowadays offers more search engines than you can keep track of. Even metasearchers—that is, applications that aggregate multiple search engines—are becoming common. For example, I use a tool called Copernic (http://www.copernic.com/) to do parallel searches of AltaVista, Excite, and a number of other engines.

Despite the wealth of searchers and metasearchers, there’s always a need for a more customized solution. The analysts at our fictitious Ronin Group, for example, track technology news by subject. The technology news sites, including PR Newswire (http://www.prnewswire.com/), Business Wire (http://www.businesswire.com/), and Yahoo! (http://www.yahoo.com/), deliver lots of fresh technology news. But consider the plight of the Ronin Group’s XML analyst. None of the available metasearchers cover all the technology news sites that she’d like to include in her daily search for XML-related news.

What to do? It’s straightforward to build a custom metasearcher. You discover the web API for each engine that you want to search, create a URL template, interpolate a search term into that URL template, transmit the URL, and interpret the results. ...
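To make those steps concrete, here is a minimal sketch in Python. It is an illustration, not the implementation this chapter goes on to develop. The engine names and URL templates are hypothetical placeholders on example.com, since each real engine has its own query syntax that you have to discover. The result interpreter is an assumption too: it simply collects every link on the results page, where a real interpreter would be tuned to each engine's page layout.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus
from urllib.request import urlopen

# Hypothetical URL templates, one per engine. Real engines each use their
# own hosts and parameter names; these are placeholders for illustration.
TEMPLATES = {
    "site-a": "http://site-a.example.com/search?q={term}",
    "site-b": "http://site-b.example.com/find?query={term}&max=20",
}


def build_urls(term):
    """Interpolate the URL-encoded search term into every template."""
    encoded = quote_plus(term)
    return {name: tpl.format(term=encoded) for name, tpl in TEMPLATES.items()}


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from a results page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Interpret a results page naively: return every link it contains."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def metasearch(term, fetch=None):
    """Query every engine in turn and return {engine name: [(href, text)]}."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET, decoded leniently.
        def fetch(url):
            return urlopen(url, timeout=10).read().decode("utf-8", "replace")
    results = {}
    for name, url in build_urls(term).items():
        try:
            results[name] = extract_links(fetch(url))
        except OSError:
            # Network failures (URLError is an OSError) yield no results.
            results[name] = []
    return results
```

The `fetch` parameter is there so that you can swap in a canned page while you work out how to interpret each engine's results. With the placeholder templates above, the default fetcher has nothing real to talk to, so replace them with the templates you discover before running a live search.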
