Part I. Building Scrapers
This first part of the book focuses on the basic mechanics of web scraping: how to use Python to request information from a web server, how to perform basic handling of the server’s response, and how to begin interacting with a website in an automated fashion. By the end, you’ll be cruising around the Internet with ease, building scrapers that can hop from one domain to another, gather information, and store that information for later use.
To be honest, web scraping is a fantastic field to get into if you want a huge payout for relatively little upfront investment. In all likelihood, 90% of web scraping projects you’ll encounter will draw on techniques used in just the next six chapters. This section covers what the general (albeit technically savvy) public tends to think of when they think of “web scrapers”:
- Retrieving HTML data from a domain name
- Parsing that data for target information
- Storing the target information
- Optionally, moving to another page to repeat the process
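The four steps above can be sketched in a few lines of standard-library Python. This is only a minimal illustration, not the approach the following chapters develop: the names `TitleAndLinkParser`, `fetch_html`, `crawl`, and `rows_to_csv` are hypothetical, the "target information" is assumed to be each page's title, and storage is assumed to be CSV text.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class TitleAndLinkParser(HTMLParser):
    """Collects the <title> text and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch_html(url):
    """Step 1: retrieve HTML from a URL (performs a real network request)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=3):
    """Run the four steps, visiting at most max_pages distinct pages."""
    rows, seen, queue = [], set(), [start_url]
    while queue and len(rows) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)                          # 1. retrieve the HTML
        parser = TitleAndLinkParser()
        parser.feed(html)                          # 2. parse for target information
        rows.append((url, parser.title.strip()))   # 3. store the target information
        for href in parser.links:                  # 4. optionally move to another page
            queue.append(urljoin(url, href))
    return rows


def rows_to_csv(rows):
    """Serialize the stored (url, title) pairs as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["url", "title"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(crawl(start_url))` with a URL of your choosing would fetch up to three pages and return their titles as CSV. Passing a different `fetch` callable lets you try the loop against canned HTML without touching the network.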
This will give you a solid foundation before moving on to more complex projects in Part II. Don’t be fooled into thinking that this first section isn’t as important as some of the more advanced projects in the second half. You will use nearly all the information in the first half of this book on a daily basis while writing web scrapers!