This section concludes our study of microformats, and of Thai food, by briefly introducing hReview. Yelp is a popular service that implements hReview to expose the ratings customers have left for restaurants. Example 2-9 demonstrates how to extract hReview information as implemented by Yelp. The sample code includes a URL you might try; it points to a Thai restaurant you definitely don’t want to miss if you ever have the opportunity to visit.
Although the spec is pretty stable, hReview implementations vary in practice and often deviate from it in arbitrary ways. In particular, Example 2-9 does not parse the reviewer as an hCard because Yelp’s implementation did not mark it up as one.
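To make the reviewer caveat concrete, the following is a minimal sketch of a parser that tolerates both forms: a reviewer given as plain text and a reviewer wrapped in an hCard. It is not the code from Example 2-9. It assumes Python 3 and uses only the standard library's html.parser rather than BeautifulSoup, the SAMPLE markup is hypothetical and not copied from Yelp, and the HReviewParser class name is invented for illustration. It also assumes well-nested markup, which real pages do not always provide.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only (not taken from Yelp).
# The first review gives the reviewer as plain text; the second
# wraps the reviewer in an hCard (class "vcard" with a nested "fn").
SAMPLE = """
<div class="hreview">
  <span class="item"><span class="fn">Example Thai Kitchen</span></span>
  <span class="reviewer">Jane D.</span>
  <span class="rating">4.5</span>
  <p class="description">Great pad thai.</p>
</div>
<div class="hreview">
  <span class="item"><span class="fn">Example Thai Kitchen</span></span>
  <span class="reviewer vcard"><span class="fn">Sam Q.</span></span>
  <span class="rating">3</span>
  <p class="description">Decent curry, slow service.</p>
</div>
"""

FIELDS = ("reviewer", "rating", "description")
# Elements that never have a closing tag, so they must not be stacked.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HReviewParser(HTMLParser):
    """Collects the text of selected hReview fields (a sketch, not a spec parser)."""

    def __init__(self):
        super().__init__()
        self.reviews = []       # finished reviews, one dict each
        self.stack = []         # set of class names per open element
        self.current = None     # the review being built, if any
        self.root_depth = None  # stack depth at which the hreview opened

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        if "hreview" in classes and self.current is None:
            self.current = {field: "" for field in FIELDS}
            self.root_depth = len(self.stack)
        self.stack.append(classes)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        self.stack.pop()
        # Closing the element that opened the review finishes that review.
        if self.current is not None and len(self.stack) == self.root_depth:
            cleaned = {k: " ".join(v.split()) for k, v in self.current.items()}
            self.reviews.append(cleaned)
            self.current = None

    def handle_data(self, data):
        if self.current is None:
            return
        # Text belongs to a field if any enclosing element carries that
        # class, so a nested hCard "fn" still counts as reviewer text.
        for field in FIELDS:
            if any(field in classes for classes in self.stack):
                self.current[field] += data


parser = HReviewParser()
parser.feed(SAMPLE)
reviews = parser.reviews

for review in reviews:
    print(review)
```

Because the reviewer text is gathered from every descendant of the element with the reviewer class, the same code path handles both variants, at the cost of ignoring any additional hCard properties that a conforming implementation might supply.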
Example 2-9. Parsing hReview data for a Yelp restaurant review (microformats__yelp_hreview.py)
# -*- coding: utf-8 -*-

import sys
import re
import urllib2
import json
import HTMLParser
from BeautifulSoup import BeautifulSoup

# Pass in a URL that contains hReview info such as
# http://www.yelp.com/biz/bangkok-golden-fort-washington-2

url = sys.argv[1]

# Parse out some of the pertinent information for a Yelp review
# Unfortunately, the quality of hReview implementations varies
# widely so your mileage may vary. This code is *not* a spec
# parser by any stretch. See http://microformats.org/wiki/hreview

def parse_hreviews(url):
    try:
        page = urllib2.urlopen(url)
    except urllib2.URLError, e:
        print 'Failed to fetch ' + url
        raise e

    try:
        soup = BeautifulSoup(page)
    except HTMLParser.HTMLParseError, ...