Getting ready

We've gathered some sailboat racing results in Volvo Ocean Race.html. This file has information on teams, legs, and the order in which the various teams finished each leg. It's scraped from the Volvo Ocean Race website, and it looks wonderful when opened in a browser.

HTML notation is very similar to XML. The content is surrounded by <tag> marks that show the structure and presentation of the data. HTML predates XML, and the XHTML standard reconciles the two. Browsers, however, must be tolerant of older HTML and even improperly structured HTML. The presence of damaged HTML can make it difficult to analyze data from the World Wide Web.
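To make the point about damaged HTML concrete, here is a minimal sketch that uses only the standard library's html.parser module, which reports tags as events and does not insist on well-formed input. The CellCollector class and the sample fragment are hypothetical illustrations, not part of the race results file; the fragment omits its closing </td> and </tr> tags on purpose.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of each <td> cell, tolerating missing close tags."""

    def __init__(self):
        super().__init__()
        self.cells = []        # text of each finished cell
        self._current = None   # pieces of the open cell, or None if no cell is open

    def _close_cell(self):
        # Finish the open cell, if there is one.
        if self._current is not None:
            self.cells.append("".join(self._current).strip())
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "tr"):
            # A new cell or row implicitly ends any cell left open.
            self._close_cell()
        if tag == "td":
            self._current = []

    def handle_endtag(self, tag):
        if tag in ("td", "tr", "table"):
            self._close_cell()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


# Hypothetical fragment: the </td> and </tr> tags are missing on purpose.
damaged = "<table><tr><td>Team A<td>1<tr><td>Team B<td>2</table>"
parser = CellCollector()
parser.feed(damaged)
parser.close()
print(parser.cells)  # ['Team A', '1', 'Team B', '2']
```

A strict XML parser would reject this fragment outright. The event-based approach recovers the cells by treating each new <td> or <tr> as the end of whatever cell came before it, which is roughly the kind of tolerance a browser applies.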

HTML pages include a great deal of overhead. There are often vast code and style sheet sections, ...
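The code and style sheet overhead mentioned above can be filtered out before any analysis. The following is a sketch under stated assumptions: the VisibleText class and the sample page are hypothetical, and it relies only on the standard library's html.parser module. It keeps text that lies outside <script> and <style> elements and discards the rest.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that is not inside <script> or <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []       # non-empty text found outside skipped elements
        self._skip_depth = 0   # greater than zero while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


# Hypothetical page: the head carries only style and script overhead.
page = (
    "<html><head><style>td { color: navy; }</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Results</h1><p>Leg 1</p></body></html>"
)
extractor = VisibleText()
extractor.feed(page)
extractor.close()
print(extractor.chunks)  # ['Results', 'Leg 1']
```

Only the heading and paragraph text survive; the style rule and the script statement are dropped.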
