February 2018
Beginner to intermediate
364 pages
10h 32m
English
We will begin with a fresh IPython session and load the planets page:
In [1]: import requests
   ...: from bs4 import BeautifulSoup
   ...: html = requests.get("http://localhost:8080/planets.html").text
   ...: soup = BeautifulSoup(html, "lxml")
   ...:
In the previous recipe, to access all of the <tr> elements in the table, we used chained property syntax to get the table, and then had to get its children and iterate over them. This has a problem: the children could be nodes other than <tr>, such as the whitespace text between rows. A better way to get just the <tr> elements is to use findAll.
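To make the difference concrete, here is a minimal sketch that compares the two approaches. It assumes a small inline table (SAMPLE_HTML is a hypothetical stand-in, not the real planets page) and uses the built-in "html.parser" so that it runs without a local server. It calls find_all, which is the Beautiful Soup 4 spelling of the findAll method used in this recipe.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical stand-in for the planets table; the real page is not reproduced here.
SAMPLE_HTML = """
<table id="planetsTable">
<tr><th>Name</th></tr>
<tr><td>Mercury</td></tr>
<tr><td>Venus</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table")

# .children yields every direct child node, including the whitespace
# text nodes that sit between the rows in the markup.
children = list(table.children)
non_tag_children = [c for c in children if not isinstance(c, Tag)]

# find_all("tr") returns only <tr> tags, so no filtering is needed.
rows = table.find_all("tr")
row_names = [row.get_text(strip=True) for row in rows]

print(len(children), len(non_tag_children), len(rows))
print(row_names)
```

With this sample, the list built from .children is longer than the list of rows because of the text nodes, while every item returned by find_all is a <tr> tag.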
Let's start by finding the <table>:
In [4]: table = soup.find("table")
   ...: str(table)[:100]
   ...:
Out[4]: '<table border="1" id="planetsTable">\n<tr ...