Webbots, Spiders, and Screen Scrapers by Michael Schrenk

Creating the Image-Capturing Webbot

This example webbot relies on a library called LIB_download_images, which is available from this book's website. This library contains the following functions:

  • download_binary_file(), which safely downloads image files

  • mkpath(), which makes directory structures on your hard drive

  • download_images_for_page(), which downloads all the images on a page
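The library's source is not reproduced in this excerpt, so the following is only a minimal sketch of the idea behind re-creating a remote directory structure locally. The function names sketch_make_path() and sketch_local_path_for(), the example URL, and the temporary save directory are all hypothetical and are not part of LIB_download_images; the sketch uses only PHP's built-in parse_url(), mkdir(), and is_dir().

```php
<?php
// Hypothetical helper (NOT the book's mkpath()): create a directory tree if it is missing.
function sketch_make_path(string $dir): bool
{
    if (is_dir($dir)) {
        return true;
    }
    // The third argument tells mkdir() to create any missing parent directories too.
    return mkdir($dir, 0755, true) || is_dir($dir);
}

// Hypothetical helper: map an image URL to a local file path that mirrors
// the host name and directory path found in the URL.
function sketch_local_path_for(string $image_url, string $save_root): string
{
    $host = parse_url($image_url, PHP_URL_HOST);
    $path = parse_url($image_url, PHP_URL_PATH);
    if (!is_string($host) || $host === '') {
        $host = 'unknown-host';
    }
    if (!is_string($path)) {
        $path = '';
    }

    // Drop empty, "." and ".." segments so the result stays under $save_root.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.' || $segment === '..') {
            continue;
        }
        $segments[] = $segment;
    }

    return rtrim($save_root, '/') . '/' . $host . '/' . implode('/', $segments);
}

// Usage: work out where an image would be stored and prepare its directory.
$root   = sys_get_temp_dir() . '/webbot_images_demo';
$target = sketch_local_path_for('http://www.example.com/images/logos/logo.gif', $root);
sketch_make_path(dirname($target));
```

The sketch stops short of fetching anything: in the real webbot, that step belongs to download_binary_file(), whose signature is not shown here.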

Figure 8-2. Re-creating a file structure for stored images

For clarity, I will break down this library into highlights and accompanying explanations.

The first script (Listing 8-1) shows the main webbot used in Figure 8-1 and Figure 8-2.

 include("LIB_download_images.php"); ...
