The Data Wrangling Workshop - Second Edition

by Brian Lipp, Shubhadeep Roychowdhury, Dr. Tirthajyoti Sarkar, John Wesley Doyle, Harshil Jain, Robert Thas John, Akshay Khare, Nagendra Nagaraj, Samik Sen, Dr. Vlad Sebastian Ionescu
July 2020
Content level: Beginner to intermediate
576 pages
9h 12m
English
Packt Publishing

5. Getting Comfortable with Different Kinds of Data Sources

Overview

This chapter will provide you with the skills to read CSV, Excel, and JSON files into pandas DataFrames. You will learn how to read PDF documents and HTML tables into pandas DataFrames and perform basic web scraping operations using powerful yet easy-to-use libraries such as Beautiful Soup. You will also see how to extract structured and textual information from portals. By the end of this chapter, you will be able to implement data wrangling techniques such as web scraping in the real world.
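As a rough illustration of the first skill listed above, here is a minimal sketch of loading CSV and JSON data into pandas DataFrames. The sample data, column names, and variable names are assumptions made up for this example and are not taken from the book; in-memory strings stand in for files on disk so the snippet runs without any external files.

```python
import io

import pandas as pd

# Hypothetical sample data (not from the book), standing in for files on disk.
csv_text = "name,age\nAda,36\nGrace,45\n"
json_text = '[{"name": "Ada", "age": 36}, {"name": "Grace", "age": 45}]'

# read_csv and read_json accept a file path or a file-like object,
# so wrapping the strings in StringIO mimics opening real files.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv)
print(df_json)
```

With real files, the `io.StringIO(...)` argument would typically be replaced by a path. Excel files, HTML tables, and PDFs generally need additional libraries and are not shown here.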

Introduction

So far in this book, we have focused on studying pandas DataFrame objects as the main data structure for the application of wrangling techniques. In this chapter, we ...

Publisher Resources

ISBN: 9781839215001