Special spider functions are found in the LIB_simple_spider library. This library provides functions that parse links from a web page when given a URL, archive harvested links in an array, identify the root domain for a URL, and identify links that should be excluded from the archive.
This library, as well as the other scripts featured in this chapter, is available for download at this book’s website.
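The four jobs described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the LIB_simple_spider code; every name in it (`parse_links`, `root_domain`, `is_excluded`, `harvest`, and the `stay_on_domain` flag) is hypothetical. It also assumes that the "root domain" is simply the host part of the URL and that a link is excluded when it is not an HTTP(S) link, is already archived, or (optionally) points off the seed's domain. The library's actual rules may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def parse_links(html, page_url):
    """Return every link on the page, resolved against the page's URL."""
    parser = LinkParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.hrefs]


def root_domain(url):
    """Assumption: treat the lowercased host name as the root domain."""
    return urlparse(url).netloc.lower()


def is_excluded(link, seed_url, archive, stay_on_domain=False):
    """Hypothetical exclusion rules; not the library's actual rules."""
    if urlparse(link).scheme not in ("http", "https"):
        return True  # e.g. mailto: links
    if stay_on_domain and root_domain(link) != root_domain(seed_url):
        return True  # off-domain link
    return link in archive  # already harvested


def harvest(html, page_url, seed_url, archive, stay_on_domain=False):
    """Append each acceptable link on the page to the archive list."""
    for link in parse_links(html, page_url):
        if not is_excluded(link, seed_url, archive, stay_on_domain):
            archive.append(link)
    return archive
```

The sketch takes the page's HTML as a string rather than downloading it, so it can be tried without network access; a real spider would fetch each page before parsing it.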
Harvested: http://video.google.com/videoplay?docid=4221457095668033104&hl=en
Harvested: http://www.apogeonline.com/libri/88-503-2658-0/scheda
Harvested: http://www.schrenk.com/index.php
Harvested: http://www.schrenk.com/strategies.php
Harvested: http://www.schrenk.com/webbots.php
...