Python in a Nutshell, 2nd Edition by Alex Martelli

The sgmllib Module

The name of the sgmllib module is misleading: sgmllib parses only a tiny subset of SGML, but it is still a good way to get information from HTML files. sgmllib supplies one class, SGMLParser, which you subclass, overriding methods. The most frequently used methods of an instance s of your subclass X of SGMLParser are as follows.

close

s.close()

Tells the parser that there is no more input data. When X overrides close, the overriding method must call SGMLParser.close to ensure that buffered data is processed.
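The rule about overriding close can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name ChunkCollector and its attributes chunks and finished are hypothetical, and because sgmllib is not part of Python 3, the sketch falls back to html.parser.HTMLParser as a stand-in base class that also offers feed, close, and handle_data.

```python
try:
    from sgmllib import SGMLParser as BaseParser       # Python 2 standard library
except ImportError:
    # Assumption: sgmllib is unavailable, so use a stand-in base class
    # that also provides feed(), close(), and handle_data().
    from html.parser import HTMLParser as BaseParser


class ChunkCollector(BaseParser):
    """Hypothetical subclass that records text and overrides close."""

    def __init__(self):
        BaseParser.__init__(self)
        self.chunks = []       # text passed to handle_data so far
        self.finished = False  # set once close() has run

    def handle_data(self, data):
        self.chunks.append(data)

    def close(self):
        # Delegate to the base class first so that any data it is still
        # holding reaches handle_data before our own cleanup runs.
        BaseParser.close(self)
        self.finished = True


parser = ChunkCollector()
parser.feed("Hello, ")   # input may arrive in several pieces
parser.feed("world")
parser.close()           # no more input
text = "".join(parser.chunks)
```

Calling the base class method explicitly, rather than through super, keeps the sketch usable with either base class.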

do_tag

s.do_tag(attributes)

X supplies a method named do_tag for each tag without a corresponding end tag (for example, do_br for <br>) that X wants to process. tag must be lowercase in the method name, but can be in any case in the parsed text (the SGML standard, like HTML, is case-insensitive, in contrast to XML and XHTML, which are case-sensitive). SGMLParser’s handle_starttag method calls do_tag when appropriate. attributes is a list of pairs (name, value), where name is an attribute’s name, lowercased, and value is the attribute’s value, processed to resolve entity and character references and to remove surrounding quotes.

end_tag

s.end_tag()

X supplies a method with such a name for each tag whose end tag X wants to process. tag must be lowercase in the method name, but can be in any case in the parsed text. X must also supply a method named start_tag; otherwise, end_tag is ignored. SGMLParser’s handle_endtag method calls end_tag when appropriate.
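The naming convention for do_tag, start_tag, and end_tag methods can be illustrated with a short sketch. The class LinkCollector and its attributes links, images, and open_links are hypothetical names chosen for this example. Because sgmllib is not part of Python 3, the except branch defines a simplified stand-in named SGMLParser on top of html.parser.HTMLParser; the stand-in is an assumption that imitates only the method-name dispatch described above and is not the sgmllib implementation.

```python
try:
    from sgmllib import SGMLParser                     # Python 2 standard library
except ImportError:
    # Assumption: sgmllib is unavailable. This hypothetical stand-in imitates
    # only the start_/do_/end_ method-name dispatch, nothing else.
    from html.parser import HTMLParser

    class SGMLParser(HTMLParser):
        def __init__(self):
            HTMLParser.__init__(self)
            self.stack = []    # open tags that were handled by a start_ method

        def handle_starttag(self, tag, attrs):
            start = getattr(self, 'start_' + tag, None)
            if start is not None:
                self.stack.append(tag)
                start(attrs)
                return
            do = getattr(self, 'do_' + tag, None)
            if do is not None:
                do(attrs)

        def handle_endtag(self, tag):
            # An end tag counts only if a start_ method opened the element.
            if tag not in self.stack:
                return
            while self.stack:
                open_tag = self.stack.pop()
                end = getattr(self, 'end_' + open_tag, None)
                if end is not None:
                    end()
                if open_tag == tag:
                    break


class LinkCollector(SGMLParser):
    """Hypothetical subclass: collects href and src attribute values."""

    def __init__(self):
        SGMLParser.__init__(self)
        self.links = []
        self.images = []
        self.open_links = 0

    def do_img(self, attributes):
        # <img> has no end tag, so a do_ method is enough.
        self.images.append(dict(attributes).get('src'))

    def start_a(self, attributes):
        self.open_links += 1
        self.links.append(dict(attributes).get('href'))

    def end_a(self):
        # Called only because start_a is also defined.
        self.open_links -= 1


collector = LinkCollector()
collector.feed('<P>See <A HREF="a.html">first</A>, <IMG SRC="pic.png">, ')
collector.feed('and <a href="b.html">second</a>.')
collector.close()
```

The method names are lowercase even though the input mixes <A HREF=...> with <a href=...>, and dict(attributes) turns the list of (name, value) pairs into a lookup by lowercased attribute name.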

feed

s.feed(data)

Passes to the parser some of the text being parsed. The ...
