Webbots, Spiders, and Screen Scrapers by Michael Schrenk

Error Handlers

When a webbot cannot adjust to changes, the only safe thing to do is to stop it. A webbot that keeps running may behave erratically and leave suspicious entries in the target server's access and error log files. It's a good idea to write a routine that handles all errors in a prescribed manner. Such an error handler should send you an email that indicates the following:

  • Which webbot failed

  • Why it failed

  • The date and time it failed

A simple script like the one in Listing 25-12 works well for this purpose.

function webbot_error_handler($failure_mode)
    {
    # Initialization
    $email_address = "your.account@someserver.com";
    $email_subject = "Webbot Failure Notification";

    # Build the failure message
    $email_message = "Webbot T-Rex encountered ...
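
The listing above is cut off in this excerpt, so what follows is not the author's code. It is a minimal sketch of an error handler that reports the three items listed earlier (which webbot failed, why, and when) and then stops the webbot. The function names build_failure_message() and notify_and_stop(), their parameters, and the message layout are assumptions made for illustration. mail(), error_log(), date(), and exit() are standard PHP functions.

```php
<?php
// Hypothetical helper (not from the book): builds the notification text
// containing the webbot name, the reason for failure, and a timestamp.
function build_failure_message(string $webbot_name, string $failure_mode, ?int $timestamp = null): string
{
    // Default to the current time when no timestamp is supplied.
    $timestamp = $timestamp ?? time();

    return "Webbot: " . $webbot_name . "\n"
         . "Reason: " . $failure_mode . "\n"
         . "Time:   " . date("r", $timestamp) . "\n";
}

// Hypothetical handler (not from the book): emails the report, then halts.
function notify_and_stop(string $webbot_name, string $failure_mode, string $email_address): void
{
    $email_subject = "Webbot Failure Notification";
    $email_message = build_failure_message($webbot_name, $failure_mode);

    // mail() returns false when the message is not accepted for delivery;
    // in that case, record the failure in the PHP error log instead.
    if (!mail($email_address, $email_subject, $email_message)) {
        error_log("Could not send failure email: " . $email_message);
    }

    // Stop the webbot rather than let it continue in an unknown state.
    exit(1);
}
```

A webbot could call notify_and_stop("T-Rex", "Login form not found", "your.account@someserver.com") at the point where it detects that the target page no longer matches what it expects. Keeping the message builder separate from the code that sends mail and exits makes the report text easy to check without sending anything.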