August 2019
Beginner
482 pages
12h 56m
English
What does the term web scraping mean?
Web scraping is the process of collecting information directly from HTML web pages. As in mining, we first have to collect the ore, which here is the raw HTML, and then refine it into valuable data points.
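The collect-then-refine idea can be sketched with nothing but Python's standard library. This is a minimal illustration, not a full scraper: the HTML fragment, the `item` class name, and the `ItemParser` and `refine` helpers are all made up for this example, and the page is a hard-coded string rather than a real download.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded HTML document (the "ore").
RAW_HTML = """
<html><body>
  <h1>Fruit prices</h1>
  <ul>
    <li class="item">Apple: 1.20</li>
    <li class="item">Pear: 0.95</li>
  </ul>
</body></html>
"""


class ItemParser(HTMLParser):
    """Collects the text of every <li class="item"> element."""

    def __init__(self):
        super().__init__()
        self._in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "li" and ("class", "item") in attrs:
            self._in_item = True
            self.items.append("")

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current item.
        if self._in_item:
            self.items[-1] += data

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False


def refine(html):
    """Turn raw HTML into (name, price) data points."""
    parser = ItemParser()
    parser.feed(html)
    parser.close()
    rows = []
    for text in parser.items:
        name, price = text.split(":")
        rows.append((name.strip(), float(price)))
    return rows


prices = refine(RAW_HTML)
print(prices)
```

In a real scraper the string would come from an HTTP response, and the extraction rules would have to match whatever markup the target page actually uses.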
What are the main differences between scraping and using a web API? What are the challenges?
The main difference is the lack of any guarantees: there is no promise that the web page will keep its structure, or that it will be served at all. In fact, many services actively try to prevent web scraping. Another challenge is turning raw HTML into valuable information, which usually requires custom code written for each page.
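To make the contrast concrete, here is a small sketch under stated assumptions: the JSON payload, both HTML snippets, and the function names are hypothetical, and the regular expression is a deliberately simplistic stand-in for real parsing code. It shows how extraction code tied to one page layout can silently stop finding data after a redesign, while an API response keeps a stable key.

```python
import json
import re

# Hypothetical API response: the structure is part of a published contract.
API_RESPONSE = '{"product": "Apple", "price": 1.20}'

# Hypothetical HTML for the same data, before and after a site redesign.
HTML_V1 = '<div class="price">1.20</div>'
HTML_V2 = '<span data-price="1.20">$1.20</span>'


def price_from_api(payload):
    """With an API, the value is one documented key away."""
    return json.loads(payload)["price"]


def price_from_html(html):
    """Custom extraction code tied to one specific page layout.

    Returns None when the expected markup is missing, for example
    after the site changes its structure.
    """
    match = re.search(r'<div class="price">([\d.]+)</div>', html)
    return float(match.group(1)) if match else None


print(price_from_api(API_RESPONSE))  # value read from the JSON key
print(price_from_html(HTML_V1))      # layout the code was written for
print(price_from_html(HTML_V2))      # redesigned layout: nothing is found
```

Returning `None` instead of raising is only one possible design; either way, a scraper needs some check that tells you when the page no longer looks the way the code expects.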
What exactly does Beautiful Soup do? Can we scrape without it?
In our stack (