May 2017
Beginner to intermediate
220 pages
5h 2m
English
Now we are ready to try DiskCache with our crawler by passing it to the cache keyword argument. The source code for this class is available at https://github.com/kjam/wswp/blob/master/code/chp3/diskcache.py, and the cache can be tested in any Python interpreter.
Here, we use IPython's %time command to time the crawl and measure the cache's performance:
In [1]: from chp3.diskcache import DiskCache
In [2]: from chp3.advanced_link_crawler import link_crawler
In [3]: %time link_crawler('http://example.webscraping.com/', '/(index|view)', cache=DiskCache())
...
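To make the idea of passing a cache object to the crawler more concrete, here is a minimal sketch of a dict-like disk cache. It is not the book's DiskCache implementation (see the repository link above for that); the names SimpleDiskCache, fetch, and fake_download are hypothetical, and the download is faked so the example runs without network access.

```python
import hashlib
import os
import pickle
import tempfile


class SimpleDiskCache:
    """Hypothetical stand-in for DiskCache: a dict-like cache backed by files."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, url):
        # Hash the URL so any URL maps to a safe, fixed-length filename.
        name = hashlib.sha256(url.encode('utf-8')).hexdigest()
        return os.path.join(self.cache_dir, name)

    def __getitem__(self, url):
        path = self._path(url)
        if not os.path.exists(path):
            raise KeyError(url + ' is not cached')
        with open(path, 'rb') as fp:
            return pickle.load(fp)

    def __setitem__(self, url, result):
        with open(self._path(url), 'wb') as fp:
            pickle.dump(result, fp)


def fetch(url, cache, download):
    """Return the cached result for url, calling download(url) only on a miss."""
    try:
        return cache[url]
    except KeyError:
        result = download(url)
        cache[url] = result
        return result


calls = []


def fake_download(url):
    # Assumption: stands in for a real HTTP request and records each call.
    calls.append(url)
    return {'html': '<html>placeholder</html>', 'code': 200}


cache = SimpleDiskCache(tempfile.mkdtemp())
first = fetch('http://example.webscraping.com/', cache, fake_download)
second = fetch('http://example.webscraping.com/', cache, fake_download)
```

The second fetch is served from disk, so fake_download runs only once. A crawler that accepts a cache keyword argument could use the same square-bracket lookups, which is why any object supporting them can be swapped in.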