Prospects for Improved Web Search Methods
Part of the hype surrounding XML has been that web search engines will finally understand what a document means by looking at its markup. For instance, you could search for the movie Sneakers and get back only hits about the movie, without having to sort through “Internet Wide Area ‘Tiger Teamers’ mailing list,” “Children’s Side Zip Sneakers Recalled by Reebok,” “Infant’s ‘Little Air Jordan’ Sneakers Recalled by Nike,” “Sneakers.com—Athletic shoes from Nike, Reebok, Adidas, Fila, New,” and the 32,395 other results that Google pulled up on this search that had nothing to do with the movie.[1]
In practice, this is still vapor, mostly because few web pages
are available on the frontend in XML, even though more and more
backends are XML. The search-engine robots only see the frontend HTML.
As this slowly changes, and as the search engines get smarter, we
should see more and more useful results. Meanwhile, you can use the
Resource Description Framework (RDF), the Dublin Core, and the robots
processing instruction to add XML hints to your HTML pages that
knowledgeable search engines can take advantage of.
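To make the idea of XML hints a little more concrete, here is a minimal Python sketch that generates a Dublin Core description wrapped in RDF, plus a robots processing instruction. It is an illustration under stated assumptions, not a definitive recipe: the function names dublin_core_rdf and robots_pi, the example URL, and the field values are all hypothetical, and the index and follow pseudo-attributes are assumed to be the ones a cooperating robot looks for. The two namespace URIs are the standard RDF and Dublin Core element-set namespaces.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_rdf(about, fields):
    """Return an rdf:RDF block describing the resource at `about`.

    `fields` maps Dublin Core element names (title, creator, ...)
    to their text values. Hypothetical helper, for illustration only.
    """
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: about}
    )
    for name, value in fields.items():
        child = ET.SubElement(desc, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


def robots_pi(index=True, follow=True):
    """Return a robots processing instruction as a string.

    Assumes the index/follow pseudo-attributes with yes/no values.
    """
    index_value = "yes" if index else "no"
    follow_value = "yes" if follow else "no"
    return '<?robots index="%s" follow="%s"?>' % (index_value, follow_value)


if __name__ == "__main__":
    # Hypothetical page about the movie, not a real URL.
    print(robots_pi(index=True, follow=False))
    print(dublin_core_rdf(
        "http://www.example.com/sneakers.html",
        {"title": "Sneakers", "type": "Movie review"},
    ))
```

A search engine that understood this metadata could tell from the type and title that the page concerns the movie rather than footwear, which is exactly the distinction that plain-text matching misses.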
[1] In fairness to Google, four of the first ten hits it returned were about the movie.