Now that we know how to scrape the country data, we can integrate this into the link crawler built in Chapter 1, Introduction to Web Scraping. To allow reusing the same crawling code to scrape multiple websites, we will add a callback parameter to handle the scraping. A callback is a function that will be called after certain events (in this case, after a web page has been downloaded). This scrape callback will take a url and html as parameters and optionally return a list of further URLs to crawl. Here is the implementation, which is simple in Python:
def link_crawler(..., scrape_callback=None):
    ...
    data = []
    if scrape_callback:
        data.extend(scrape_callback(url, html) or [])
    ...
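To see the callback in context, here is a minimal, self-contained sketch of how the pieces might fit together. It is not the full crawler from Chapter 1: the `download` parameter, the `PAGES` dictionary, and the title-extracting callback are all hypothetical stand-ins chosen so the example runs without network access. It also assumes the callback returns scraped records, which are collected in `data` as in the snippet above.

```python
import re
from urllib.parse import urljoin


def link_crawler(seed_url, link_regex, download, scrape_callback=None):
    """Crawl from seed_url, following links whose href matches link_regex.

    `download` is a hypothetical stand-in for the real downloader: any
    function that takes a URL and returns its HTML, or None on failure.
    """
    crawl_queue = [seed_url]
    seen = {seed_url}
    data = []
    while crawl_queue:
        url = crawl_queue.pop()
        html = download(url)
        if html is None:
            continue
        if scrape_callback:
            # The callback may return None, so fall back to an empty list.
            data.extend(scrape_callback(url, html) or [])
        # Queue every matching link that has not been seen yet.
        for href in re.findall(r'<a[^>]+href=["\'](.*?)["\']', html):
            if re.match(link_regex, href):
                link = urljoin(seed_url, href)
                if link not in seen:
                    seen.add(link)
                    crawl_queue.append(link)
    return data


def scrape_callback(url, html):
    """Hypothetical callback: record the page title for /view/ pages only."""
    if re.search(r'/view/', url):
        match = re.search(r'<title>(.*?)</title>', html)
        return [(url, match.group(1))] if match else []
    return None  # nothing to scrape on other pages


# Made-up in-memory "website" so the sketch needs no network access.
PAGES = {
    'http://example.com/index': '<a href="/view/1">one</a><a href="/about">about</a>',
    'http://example.com/view/1': '<title>Country 1</title>',
}

results = link_crawler('http://example.com/index', r'/view/', PAGES.get, scrape_callback)
print(results)
```

Because the crawler only calls `scrape_callback(url, html)` and extends `data` with whatever comes back, the same crawling loop can be reused for a different website by passing in a different callback.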
The new ...