
Webbots, Spiders, and Screen Scrapers by Michael Schrenk

Further Exploration

You can point this webbot at any web page, and it will generate a copy of each image that page uses, arranged in a directory structure that resembles the original. You can also develop other useful webbots based on this design. If you want to test your skills, consider the following challenges.

  • Write a similar webbot that detects hijacked images.

  • Improve the efficiency of the script by reworking it so that it doesn't download an image it has downloaded previously.

  • Modify this webbot to create local backup copies of web pages.

  • Adjust the webbot to cache movies or audio files instead of images.

  • Modify the bot to monitor when images change on a web page.
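
As a starting point for the second and last challenges, here is a minimal sketch in Python rather than the book's own code. It assumes the mirror keeps each image under a directory named for the host, followed by the URL path. The names local_path_for, needs_download, and has_changed are hypothetical helpers introduced for illustration, and the actual fetching of image bytes is left out.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse


def local_path_for(image_url: str, mirror_root: Path) -> Path:
    """Map an image URL to a file path that mirrors the URL's host and path.

    Assumed layout (not taken from the book): <mirror_root>/<host>/<url path>.
    """
    parsed = urlparse(image_url)
    # Drop the leading slash so the URL path nests under the host directory.
    relative = parsed.path.lstrip("/")
    return mirror_root / parsed.netloc / relative


def needs_download(image_url: str, mirror_root: Path) -> bool:
    """Return True when no local copy of the image exists yet."""
    return not local_path_for(image_url, mirror_root).exists()


def has_changed(image_bytes: bytes, saved_file: Path) -> bool:
    """Return True when freshly fetched bytes differ from the saved copy.

    A missing saved copy counts as a change.
    """
    if not saved_file.exists():
        return True
    new_digest = hashlib.sha256(image_bytes).hexdigest()
    old_digest = hashlib.sha256(saved_file.read_bytes()).hexdigest()
    return new_digest != old_digest
```

A bot could call needs_download before each request to skip images it already holds, or fetch the image anyway and pass the bytes to has_changed to notice when a page swaps an image while keeping the same URL. This sketch does not sanitize URL paths, so a real bot should reject paths containing ".." before writing to disk.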
