Dive Into Data Science

by Bradford Tuckfield
July 2023
Beginner
288 pages
8h 11m
English
No Starch Press
Content preview from Dive Into Data Science

8 Web Scraping

You need data to do data science, and when you don’t have a dataset on hand, you can try web scraping, a set of techniques for reading information directly from public websites and converting it to usable datasets. In this chapter, we’ll cover some common web-scraping techniques.

We’ll start with the simplest possible kind of scraping: downloading a web page’s code and looking for relevant text. We’ll then discuss regular expressions, a set of methods for searching logically through text, and Beautiful Soup, a free Python library that can help you parse websites more easily by directly accessing HyperText Markup Language (HTML) ...
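
As a rough illustration of the first two ideas above (downloading a page's code and searching it, first with plain string methods and then with a regular expression), here is a minimal sketch using only Python's standard library. It is not the chapter's own code: the page source `SAMPLE_HTML`, the helper names `download_page`, `find_title`, and `find_prices`, and the price pattern are all hypothetical and chosen only for illustration. The HTML is inlined so the sketch runs without network access.

```python
import re
from urllib.request import urlopen


def download_page(url):
    """Return the HTML source of a page as text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical page source, inlined so the sketch runs offline.
SAMPLE_HTML = """<html>
<head><title>Example Store</title></head>
<body>
<h1>Today's prices</h1>
<p class="price">Widget: $4.50</p>
<p class="price">Gadget: $12.00</p>
</body>
</html>"""


def find_title(html):
    """Plain string search: return the text between <title> and </title>."""
    start = html.find("<title>")
    end = html.find("</title>")
    if start == -1 or end == -1:
        return None
    return html[start + len("<title>"):end]


def find_prices(html):
    """Regular expression search: collect every dollar amount in the page."""
    return [float(amount) for amount in re.findall(r"\$(\d+\.\d{2})", html)]


title = find_title(SAMPLE_HTML)
prices = find_prices(SAMPLE_HTML)
print(title)
print(prices)
```

To try this on a live page, the inlined string could be swapped for the result of `download_page(url)`. Plain string search is fine for a single known marker such as the title tag, while the regular expression is the better fit when the same pattern repeats across the page.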


Publisher Resources

ISBN: 9781098156879