Parsing and extracting web content
By now we're comfortable making HTTP requests to multiple URLs, and we've also looked at a simple example of web scraping.
But the World Wide Web is made up of pages in many different data formats. If we want to scrape the Web and make sense of the data, we also need to know how to parse the formats in which that data is available.
In this recipe, we'll discuss how to s.
Data on the Web is mostly in HTML or XML format. To understand how to parse web content, we'll take an HTML file as an example. We'll learn how to select certain HTML elements and extract the desired data. For this recipe, you need to install the BeautifulSoup module for Python. The BeautifulSoup module is one of the most comprehensive Python ...
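As a rough illustration of selecting HTML elements and extracting data from them, here is a minimal sketch. It assumes BeautifulSoup has been installed from the beautifulsoup4 package (for example, with pip install beautifulsoup4). The HTML snippet, its id and class values, and the variable names are made up for this example and do not come from the recipe itself.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML document, used only for illustration.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1 id="main">Books</h1>
    <ul>
      <li class="book"><a href="/books/1">First book</a></li>
      <li class="book"><a href="/books/2">Second book</a></li>
    </ul>
  </body>
</html>
"""

# Parse the markup with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Select single elements by tag name and by attribute.
title = soup.title.string
heading = soup.find("h1", id="main").get_text()

# Select every <li class="book"> and pull out the link text and target.
links = [
    (li.a.get_text(), li.a["href"])
    for li in soup.find_all("li", class_="book")
]

print(title)
print(heading)
for text, href in links:
    print(text, href)
```

In this sketch, find() returns the first matching element and find_all() returns every match, so the same pattern can be adapted to other tags and attributes in a real page.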