In order to avoid exhaustive iteration and the checking of every element, we need to use XPath, which is a query language that was developed specifically for XML, and is supported by lxml.
To get started with XPath, use the Python shell from the last section, and do the following:
>>> root.xpath('body') [<Element body at 0x4477530>]
This is the simplest form of XPath expression; it searches for children of the current element that have tag names that match the specified tag name. The current element is the one we call xpath() on—in this case, root. The root element is the top-level <html> element in the HTML document, and so the returned element is the <body> element.
XPath expressions can contain multiple levels of elements. ...