Python in a Nutshell, 2nd Edition by Alex Martelli

The htmllib Module

The htmllib module supplies a class named HTMLParser that subclasses SGMLParser and defines start_tag, do_tag, and end_tag methods (where tag stands for the tag's name) for HTML 2.0 tags. HTMLParser implements and overrides methods so that they call methods of a formatter object, covered in “The formatter Module.” You can subclass HTMLParser and override its methods. In addition to the start_tag, do_tag, and end_tag methods, an instance h of HTMLParser supplies the following attributes and methods.

anchor_bgn

h.anchor_bgn(href,name,type)

Called for each <a> tag. href, name, and type are the string values of the tag’s attributes with the same names. HTMLParser’s implementation of anchor_bgn maintains a list of outgoing hyperlink targets (i.e., the href arguments of method h.anchor_bgn) in an instance attribute named h.anchorlist.
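The bookkeeping that anchor_bgn performs can be sketched as follows. htmllib exists only in Python 2 (it was removed in Python 3), so this sketch assumes Python 3 and uses html.parser.HTMLParser as a stand-in. The class name AnchorCollector and the manual dispatch in handle_starttag are illustrative assumptions, not part of htmllib.

```python
from html.parser import HTMLParser  # Python 3 stand-in; htmllib is Python 2 only


class AnchorCollector(HTMLParser):
    """Hypothetical class that mimics htmllib's anchor_bgn bookkeeping."""

    def __init__(self):
        super().__init__()
        self.anchorlist = []  # same attribute name that htmllib uses

    def anchor_bgn(self, href, name, type):
        # Record the hyperlink target; anchors without an href are skipped.
        if href:
            self.anchorlist.append(href)

    def handle_starttag(self, tag, attrs):
        # html.parser has no per-tag start hook, so dispatch <a> by hand.
        if tag == "a":
            d = dict(attrs)
            self.anchor_bgn(d.get("href", ""), d.get("name", ""), d.get("type", ""))


collector = AnchorCollector()
collector.feed('<p><a href="http://example.com/">one</a> <a name="top">two</a></p>')
collector.close()
print(collector.anchorlist)
```

In this sketch only the first <a> tag contributes to anchorlist, because the second one has a name attribute but no href.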

anchor_end

h.anchor_end( )

Called for each </a> end tag. HTMLParser’s implementation of anchor_end emits to the formatter a footnote reference that is an index within h.anchorlist. In other words, by default, HTMLParser asks the formatter to format an <a>/</a> tag pair as the text inside the tag, followed by a footnote reference number that points to the URL in the <a> tag. Of course, it’s up to the formatter to deal with this formatting request.
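The footnote-style output described above can be approximated with a short sketch. It assumes Python 3 (where htmllib is not available) and uses html.parser.HTMLParser instead. The names FootnoteRenderer, pieces, and render are hypothetical, the pieces list merely stands in for a formatter, and the 1-based [n] marker format is an assumption made for illustration.

```python
from html.parser import HTMLParser  # Python 3 stand-in for the Python 2-only htmllib


class FootnoteRenderer(HTMLParser):
    """Hypothetical parser: anchor text followed by a [n] footnote marker."""

    def __init__(self):
        super().__init__()
        self.anchorlist = []  # outgoing hyperlink targets, in document order
        self.anchor = None    # href of the <a> currently open, if any
        self.pieces = []      # stands in for the formatter's output

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor = dict(attrs).get("href")
            if self.anchor:
                self.anchorlist.append(self.anchor)

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor:
            # Assumed marker format: 1-based position of the latest target.
            self.pieces.append("[%d]" % len(self.anchorlist))
            self.anchor = None

    def handle_data(self, data):
        self.pieces.append(data)


def render(html):
    """Return the rendered text and the list of link targets."""
    parser = FootnoteRenderer()
    parser.feed(html)
    parser.close()
    return "".join(parser.pieces), parser.anchorlist


text, links = render(
    'See <a href="http://a.example/">A</a> and <a href="http://b.example/">B</a>.'
)
print(text)
for number, url in enumerate(links, 1):
    print("[%d] %s" % (number, url))
```

Printing the numbered list of targets after the text is one way a caller might resolve the markers; a real formatter is free to present them differently.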

anchorlist

The h.anchorlist attribute contains the list of outgoing hyperlink target URLs, as built by method h.anchor_bgn.

formatter

The h.formatter attribute is the formatter object f associated with h, which you pass as the ...
