Part II. Advanced Scraping

You’ve laid some web-scraping groundwork; now comes the fun part. Up until this point our web scrapers have been relatively dumb. They’re unable to retrieve information unless it’s immediately presented to them in a nice format by the server. They take all information at face value and simply store it without any analysis. They get tripped up by forms, website interaction, and even JavaScript. In short, they’re no good for retrieving information unless that information really wants to be retrieved.

This part of the book will help you analyze raw data to get the story beneath the data: the story that websites often hide beneath layers of JavaScript, login forms, and antiscraping measures.

You’ll learn how to use web scrapers to test your sites, automate processes, and access the Internet on a large scale. By the end of this section, you should have the tools to gather and manipulate nearly any type of data, in any form, across any part of the Internet.

Web Scraping with Python by Ryan Mitchell
