Chapter 2. Microformats: Semantic Markup and Common Sense Collide

Microformats are an important step forward in the Web’s ongoing evolution because they provide an effective mechanism for embedding “smarter data” into web pages and are easy for content authors to implement. Put succinctly, microformats are conventions for unambiguously embedding structured data in web pages in an entirely value-added way. This chapter begins by briefly introducing the microformats landscape and then digs right into some examples involving specific uses of the XFN (XHTML Friends Network), geo, hRecipe, and hReview microformats. In particular, we’ll mine human relationships out of blogrolls, extract coordinates from web pages, parse out recipes, and analyze reviews of some of those recipes. The example code listings in this chapter aren’t intended to be “full spec parsers,” but they should be more than enough to get you on your way.
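To make the idea of “conventions for embedding structured data” concrete, here is a minimal sketch of how XFN relationships might be pulled out of a blogroll using only Python’s standard library. It is not one of the chapter’s listings: the class name `XFNLinkParser`, the helper `extract_xfn`, and the sample markup with its example.com URLs are all hypothetical, and the set of relationship values is taken from the XFN 1.1 profile as an assumption about what a blogroll would use.

```python
from html.parser import HTMLParser

# Relationship values assumed from the XFN 1.1 profile.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNLinkParser(HTMLParser):
    """Collects (href, [xfn values]) pairs from <a> tags. Hypothetical helper."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list; keep only recognized XFN tokens.
        rel_tokens = (attr_map.get("rel") or "").split()
        xfn = [token for token in rel_tokens if token in XFN_VALUES]
        if xfn and attr_map.get("href"):
            self.links.append((attr_map["href"], xfn))


def extract_xfn(html):
    """Return a list of (href, xfn_values) tuples found in the markup."""
    parser = XFNLinkParser()
    parser.feed(html)
    return parser.links


# Made-up blogroll markup for illustration only.
SAMPLE = """
<ul>
  <li><a href="http://example.com/alice" rel="friend met">Alice</a></li>
  <li><a href="http://example.com/bob" rel="co-worker">Bob</a></li>
  <li><a href="http://example.com/news" rel="nofollow">News</a></li>
</ul>
"""

if __name__ == "__main__":
    for href, relationships in extract_xfn(SAMPLE):
        print(href, relationships)
```

The point of the sketch is that the relationship data rides along in an ordinary `rel` attribute, so the page renders exactly as it would without it; a link whose `rel` carries no XFN value (the third one above) is simply skipped.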

Although it might be something of a stretch to call data decorated with microformats like geo or hRecipe “social data,” it’s still interesting and will inevitably play an increased role in social data mashups. At the time this book was written, nearly half of all web developers reported some use of microformats, the microformats community had just celebrated its fifth birthday, and Google reported that microformats were involved in Rich Snippets 94% of the time. If Google has anything to say about it, we’ll ...
