July 2017
Beginner to intermediate
715 pages
17h 3m
English
Web crawling is the process of traversing a series of interconnected web pages and extracting relevant information from those pages. A crawler does this by isolating the links on a page and then following them. While many precompiled datasets are readily available, it may still be necessary to collect data directly from the Internet. Some sources, such as news sites, are updated continually and need to be revisited from time to time.
A web crawler is an application that visits various sites and collects information. The web crawling process consists of a series of steps:
This process is repeated for each URL visited. ...
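As a rough illustration of the idea of isolating links on a page and repeating the process for each URL visited, the following is a minimal Python sketch and not a definitive implementation. The names `LinkExtractor`, `crawl`, `fetch`, and `PAGES`, the `example.com` URLs, and the breadth-first visiting order are all assumptions made for this example. A small in-memory dictionary stands in for the network so the sketch runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=10):
    """Visit pages breadth-first from seed.

    fetch(url) is a hypothetical callable that returns the page's HTML,
    or None if the page cannot be retrieved.
    """
    frontier = deque([seed])  # URLs waiting to be visited
    seen = {seed}             # URLs already queued, to avoid revisiting
    visited = []              # URLs successfully fetched, in visit order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# Hypothetical three-page site held in memory (assumption for this sketch).
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.com/b": '<a href="/a#top">A</a> <a href="/missing">x</a>',
}

order = crawl("http://example.com/", PAGES.get)
print(order)
```

In a real crawler, `fetch` would download the page, for instance with `urllib.request.urlopen`, and the loop would typically also extract the information of interest from each page, limit the request rate, and honor each site's robots.txt rules.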