© Seppe vanden Broucke and Bart Baesens 2018
Seppe vanden Broucke and Bart Baesens, Practical Web Scraping for Data Science, https://doi.org/10.1007/978-1-4842-3582-9_3

3. Stirring the HTML and CSS Soup

Seppe vanden Broucke (1) and Bart Baesens (2)
(1) KU Leuven, Leuven, Belgium
(2) Dept of Decision Sci & Info Managem, KU Leuven, Leuven, Belgium

So far we have discussed the basics of HTTP and how you can perform HTTP requests in Python using the requests library. However, since most web pages are formatted using the Hypertext Markup Language (HTML), we need to understand how to extract information from such pages. As such, this chapter introduces you to HTML, as well as another core building block that is used to format and ...
