May 2019
Beginner to intermediate
466 pages
10h 44m
English
Web scraping is the technique of extracting data from web pages using software. It is an important component of data harvesting and is typically implemented through programs called web crawlers. Data harvesting (or data mining) is often used in data science workflows to collect information from the internet, usually from websites rather than APIs, and then to process that data for different purposes using various algorithms.
At a very high level, the process involves making a request for a web page, fetching its content, parsing its structure, and then extracting the desired information. The extracted data can be images, paragraphs of text, or tabular data containing stock information and prices, ...
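The four steps above (request, fetch, parse, extract) can be sketched with the Python standard library alone. This is a minimal illustration, not the book's own code: the names `fetch`, `PageExtractor`, `extract`, and `SAMPLE`, the `User-Agent` string, and the sample markup are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical page content, used so the example runs without network access.
SAMPLE = """
<html><body>
  <p>Closing prices for   today.</p>
  <img src="/charts/acme.png" alt="chart">
  <table>
    <tr><th>Symbol</th><th>Price</th></tr>
    <tr><td>ACME</td><td>12.50</td></tr>
  </table>
</body></html>
"""


def fetch(url, timeout=10.0):
    """Steps 1 and 2: request a page and return its body as text."""
    # The User-Agent value is an illustrative placeholder.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class PageExtractor(HTMLParser):
    """Steps 3 and 4: walk the markup and collect text, images, table rows."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.images = []
        self.rows = []
        self._buffer = None  # text pieces of the <p>, <td> or <th> being read
        self._row = None     # cells of the <tr> being read

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)
        elif tag == "tr":
            self._row = []
        elif tag in ("p", "td", "th"):
            self._buffer = []

    def handle_data(self, data):
        # Text outside a paragraph or table cell is ignored.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("p", "td", "th") and self._buffer is not None:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            self._buffer = None
            if tag == "p":
                self.paragraphs.append(text)
            elif self._row is not None:
                self._row.append(text)
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract(html):
    """Parse an HTML string and return the collected items as a dict."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "paragraphs": parser.paragraphs,
        "images": parser.images,
        "rows": parser.rows,
    }
```

Calling `extract(SAMPLE)` exercises the parse and extract steps offline; for a live page, the two halves would be chained as `extract(fetch(url))` with a URL of your choosing. Real-world pages are often malformed or rendered by JavaScript, so a sketch like this is likely to need a more tolerant parser in practice.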