7
Web Scraping
In this final chapter of the Dealing with Data part of the book, we'll learn how to collect data from web sources. This includes using Python modules and packages to scrape data directly from webpages and to retrieve it from Application Programming Interfaces (APIs). We'll also learn how to use so-called "wrappers" around APIs to collect and store data. Because fresh data is created on the internet every day, web scraping opens up huge opportunities for data collection.
In this chapter, we'll cover the following:
- Understanding the structure of the internet
- Performing simple web scraping
- Parsing HTML from scraped pages
- Using APIs to collect data
- The ethics and legality of web scraping
We'll learn these topics using the following Python ...