Data Acquisition and Extraction

In this chapter, we will cover:

  • Parsing websites and navigating the DOM using Beautiful Soup
  • Searching the DOM with Beautiful Soup's find methods
  • Querying the DOM with XPath and lxml
  • Querying data with XPath and CSS Selectors
  • Using Scrapy selectors
  • Loading data in Unicode / UTF-8 format
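
As a small taste of the last topic in the list, here is a minimal sketch of turning raw response bytes into Unicode text using only the Python standard library. It assumes the page body arrives as bytes along with an encoding declared by the server. The function name `decode_body` and the sample string are hypothetical, chosen for illustration and not taken from the book.

```python
def decode_body(raw: bytes, declared_encoding: str = "utf-8") -> str:
    """Decode raw response bytes into a Python str (Unicode text).

    Hypothetical helper: tries the declared encoding first, then falls
    back to UTF-8 with replacement characters so a scrape can continue
    even when the declaration is wrong or names an unknown codec.
    """
    try:
        return raw.decode(declared_encoding)
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: the bytes are not valid in the declared encoding.
        # LookupError: the declared encoding name is not a known codec.
        return raw.decode("utf-8", errors="replace")


# Illustrative input: non-ASCII text encoded as UTF-8 bytes.
sample = "Café 東京".encode("utf-8")
text = decode_body(sample)
print(text)
print(len(sample), "bytes ->", len(text), "characters")
```

Falling back with `errors="replace"` is one possible design choice: it keeps the pipeline running at the cost of marking undecodable bytes with U+FFFD, which may or may not suit a given scraper.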
