Chapter 8. Processing HTML

Several modules included with Python provide virtually all the necessary tools necessary to parse and process HTML documents without needing to use a web server or web browser. Parsing HTML files is becoming much more commonplace in such applications as search engines, document indexing, document conversion, data retrieval, site backup or migration, as well as several others.

Because there is no way to cover the extent of options Python provides in HTML processing, the first two phrases in this chapter focus on specific Python modules to simplify opening HTML documents locally and on the Web. The rest of the phrases discuss how to use the Python modules to quickly parse the data in the HTML files to process specific items, ...

Get Python Phrasebook: Essential Code and Commands now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.