Practical Data Science with Python

by Nathan George
September 2021
Beginner to intermediate
620 pages
15h 30m
English
Packt Publishing

7

Web Scraping

In this final chapter of the Dealing with Data part of the book, we'll learn how to collect data from web sources. This includes using Python modules and packages to scrape data directly from webpages and to retrieve it from Application Programming Interfaces (APIs). We'll also learn how to use "wrappers" around APIs to collect and store data. Because new data is created on the internet every day, web scraping opens up huge opportunities for data collection.

In this chapter, we'll cover the following:

  • Understanding the structure of the internet
  • Performing simple web scraping
  • Parsing HTML from scraped pages
  • Using APIs to collect data
  • The ethics and legality of web scraping
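
As a rough illustration of the "parsing HTML from scraped pages" topic listed above, here is a minimal sketch that uses only Python's standard library. It is not taken from the book, and the chapter may well use different packages. The names `LinkCollector` and `extract_links`, and the sample HTML string, are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page content. In practice this string would be downloaded,
# for example with urllib.request.urlopen(url).read().decode("utf-8").
sample_html = """
<html><body>
  <h1>Example</h1>
  <a href="https://example.com/data.csv">Data</a>
  <a href="/about">About</a>
</body></html>
"""

links = extract_links(sample_html)
print(links)
```

The sketch parses a hard-coded string so that it runs without network access. Fetching a real page adds concerns such as rate limiting and a site's terms of use, which fall under the ethics and legality topic in the list.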

We'll learn these topics using the following Python ...

ISBN: 9781801071970