Learn Python by Building Data Science Applications

by Philipp Kats, David Katz
August 2019
Beginner
482 pages
12h 56m
English
Packt Publishing

Chapter 7

What does the term web scraping mean?

Web scraping is the process of collecting information directly from HTML web pages. As in mining, we first have to collect the ore, the raw HTML, from which we can then refine the valuable data points.
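The collect-then-refine idea can be sketched in a few lines. This is a minimal illustration, not the book's code: the page content in `RAW_HTML`, the `LinkCollector` class, and the `extract_links` helper are all hypothetical, and the sketch uses Python's standard-library `html.parser` on an inline string so that it runs offline. In practice the HTML would come from an HTTP response.

```python
from html.parser import HTMLParser

# Hypothetical "ore": in a real scraper this string would be a downloaded page.
RAW_HTML = """
<html><body>
  <h1>Battles</h1>
  <ul>
    <li><a href="/wiki/Battle_A">Battle A</a></li>
    <li><a href="/wiki/Battle_B">Battle B</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag we are currently inside, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Refine raw HTML into a list of (link text, link target) data points."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(RAW_HTML)
```

Here `links` holds the refined data points, one tuple per link, while everything else in the markup is discarded.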

What are the main differences between scraping and using a web API? What are the challenges?

The main difference is the lack of any guarantees: there is no promise that a web page will keep its structure, or that it will be served at all. In fact, many services actively try to prevent web scraping. Another challenge is turning raw HTML into useful information, which often requires custom code.
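Because the page structure carries no guarantees, scraping code usually has to treat a missing element as a normal outcome rather than an error. The sketch below illustrates one way to do that, under stated assumptions: the two layouts, the `FirstHeading` class, and the `extract_heading` helper are hypothetical, and the parsing again relies on the standard-library `html.parser`.

```python
from html.parser import HTMLParser


class FirstHeading(HTMLParser):
    """Captures the text of the first <h1> element, if the page has one."""

    def __init__(self):
        super().__init__()
        self.heading = None   # stays None when no <h1> is found
        self._inside = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.heading is None:
            self._inside = True

    def handle_data(self, data):
        if self._inside:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self.heading = "".join(self._parts).strip()
            self._inside = False


def extract_heading(html):
    """Return the first heading's text, or None if the layout has changed."""
    parser = FirstHeading()
    parser.feed(html)
    return parser.heading


# Hypothetical layouts: the same value before and after a site redesign.
OLD_LAYOUT = "<html><body><h1>Price: 42</h1></body></html>"
NEW_LAYOUT = "<html><body><div class='title'>Price: 42</div></body></html>"

for page in (OLD_LAYOUT, NEW_LAYOUT):
    heading = extract_heading(page)
    if heading is None:
        print("Layout changed: heading not found, extraction code needs an update")
    else:
        print("Extracted:", heading)
```

Returning `None` instead of raising lets the caller decide whether to skip the page, log it, or stop the run, which matters when a redesign can silently break the extraction.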

What exactly does Beautiful Soup do? Can we scrape without it?

In our stack (


Publisher Resources

ISBN: 9781789535365