Data Wrangling with Python

by Dr. Tirthajyoti Sarkar, Shubhadeep Roychowdhury
February 2019
Beginner to intermediate
452 pages
7h 6m
English
Packt Publishing

Chapter 7

Advanced Web Scraping and Data Gathering

Learning Objectives

By the end of this chapter, you will be able to:

  • Make use of requests and BeautifulSoup to read various web pages and gather data from them
  • Perform read operations on XML files and the web using an Application Programming Interface (API)
  • Make use of regex techniques to scrape useful information from a large and messy text corpus

In this chapter, you will learn how to gather data from web pages, XML files, and APIs.
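As a rough illustration of two of the objectives above, the sketch below reads values out of an XML document and uses a regular expression to pull email addresses from messy text. It is not taken from the book: the sample data, the function names `read_populations` and `find_emails`, and the email pattern are all assumptions made for this example, and it uses only the standard library (`xml.etree.ElementTree` and `re`) so that it runs offline. With a live page, the text would typically come from requests and be parsed with BeautifulSoup instead.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML payload, standing in for an API response or a file on disk.
XML_TEXT = """
<countries>
  <country name="Norway"><population>5400000</population></country>
  <country name="Iceland"><population>370000</population></country>
</countries>
"""

# Hypothetical messy text, standing in for the body of a scraped page.
MESSY_TEXT = (
    "Contact: alice@example.com, or BOB@Example.org; "
    "not-an-email@, carol@example.net."
)

# A deliberately simple email pattern (an assumption for this sketch);
# it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def read_populations(xml_text):
    """Map each country's name attribute to its population as an int."""
    root = ET.fromstring(xml_text.strip())
    return {
        country.get("name"): int(country.findtext("population"))
        for country in root.iter("country")
    }


def find_emails(text):
    """Return every email-like substring, lowercased, in order of appearance."""
    return [match.lower() for match in EMAIL_RE.findall(text)]


if __name__ == "__main__":
    print(read_populations(XML_TEXT))
    print(find_emails(MESSY_TEXT))
```

Keeping the parsing and the pattern matching in separate functions means either one can be swapped out, for example replacing the inline XML string with the body of an API response, without touching the other.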

Introduction

The previous chapter covered how to create a successful data wrangling pipeline. In this chapter, we will build a real-life web scraper using all of the techniques that we have learned so far. This chapter builds on the foundation of BeautifulSoup ...



Publisher Resources

ISBN: 9781789800111