Errata for Spidering Hacks

The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Version Location Description Submitted By Date submitted Date corrected
Printed
Page 21
last paragraph

create something a little more advanced then "Hello, World"
should read:
create something a little more advanced than "Hello, World"

Anonymous   
Printed
Page 27
3rd

install libwww-perl

returns with

from ppm (version 3.0.1)

perl -v gives.

"This is perl, v5.8.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2002, Larry Wall

Binary build 802 provided by ActiveState Corp. http://www.ActiveState.com
Built 00:54:02 Nov 8 2002"

Solution: go to http://www.cpan.org/modules/INSTALL.html and follow the instructions to install manually.

AUTHOR: This is technically a mistake, but only because the user is running ppm3 and
not ppm (which is version 2, as shown in the output in that hack).

Anonymous   
Printed
Page 28
last command example

%perl -MLWP::Simple -e 'print join "\n", head "http://cpan.org/RECENT"';
should be:
%perl -MLWP::Simple -e "print join '\n', head 'http://cpan.org/RECENT'";

Anonymous   
Printed
Page 32
2nd last paragraph, first word

h1=en
should read:
hl=en

i.e. the letter "l", not the number "1".
Ditto in the last paragraph, where h1 is also used.

Anonymous   
Printed
Page 45
Bold code at bottom of page

if (my $encoding = $response->content_encoding) ) {
Should be
if (my $encoding = $response->content_encoding) {

There is an extra " ) "

Anonymous   
Printed
Page 50
foreach loop in Progress Bar script

The value of $final_data is not re-initialized upon each iteration of the foreach
loop over the @ARGV array. Consequently, the value of length($final_data) is inflated
for the second and subsequent URLs specified on the command line and the progress
reported is incorrect.
AUTHOR: This is correct. The script and hack were tested with only
one file at a time, even though the script accepts more than one URL on the command line.
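
The fix described above amounts to resetting the accumulator at the top of each iteration. The following is a minimal sketch of that pattern, not the book's script: only the name $final_data and the loop over a list of URLs come from the erratum. The fake_chunks helper and the example.com URLs are hypothetical stand-ins so the sketch runs without a network connection.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a chunked download: returns the body for
# $url in pieces of up to five bytes, as a progress callback would see them.
sub fake_chunks {
    my ($url) = @_;
    my %body = (
        'http://example.com/a' => 'aaaaaaaaaa',   # 10 bytes
        'http://example.com/b' => 'bbbbb',        #  5 bytes
    );
    return $body{$url} =~ /(.{1,5})/gs;
}

# Returns the number of bytes collected for each URL, in order.
sub download_sizes {
    my @urls = @_;
    my @sizes;
    foreach my $url (@urls) {
        my $final_data = '';    # reset per URL so lengths do not accumulate
        $final_data .= $_ for fake_chunks($url);
        push @sizes, length $final_data;
    }
    return @sizes;
}

print join(', ', download_sizes('http://example.com/a', 'http://example.com/b')), "\n";
```

If $final_data were declared once outside the loop instead, the second URL would report the combined length of both downloads, which is the inflated progress the erratum describes.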

Anonymous   
Printed
Page 53
2nd code fragment

In the example code, 2nd fragment, an extraneous character is indicated in the search
string. It currently reads as follows:

my @links = $p->look_down( _tag => 'a', href => qr{^
Ohttp://www.oreilly.com/catalog/\E \w+ $}x );

and should read:

my @links = $p->look_down( _tag => 'a', href => qr{^
http://www.oreilly.com/catalog/\E \w+ $}x );
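
To check what a pattern of this shape accepts, a small self-contained test can help. The sketch below assumes the URL prefix is meant to be quoted with \Q...\E and followed by \w+, which is the usual Perl idiom; the erratum text as shown does not confirm the \Q. The sample URLs and the is_catalog_link helper are made up for illustration.

```perl
use strict;
use warnings;

# Assumed form of the pattern: \Q...\E quotes the literal prefix (so each
# dot matches only a dot), and /x allows whitespace inside the pattern.
my $catalog_re = qr{^ \Qhttp://www.oreilly.com/catalog/\E \w+ $}x;

# Hypothetical helper: true if $href is a top-level catalog link.
sub is_catalog_link {
    my ($href) = @_;
    return $href =~ $catalog_re ? 1 : 0;
}

for my $href (
    'http://www.oreilly.com/catalog/spiderhks',      # made-up sample
    'http://www.oreilly.com/catalog/spiderhks/toc',  # extra path segment
    'http://wwwXoreilly.com/catalog/spiderhks',      # dot replaced by X
) {
    printf "%-48s %s\n", $href, is_catalog_link($href) ? 'match' : 'no match';
}
```

Only the first sample should match: the second has a path segment that \w+ cannot cross, and the third fails because the quoted dot no longer matches an arbitrary character.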

Anonymous   
Printed
Page 53
Hack #19 Scraping with HTML::TreeBuilder

The last paragraph on this page refers to "O'Reilly's subscription-based Safari online Library" at:
http://safari.online.com
Should be:
http://safari.oreilly.com

Anonymous   
Printed
Page 63
2nd paragraph

Andy Lester's WWW::Mechanize [Hack #22] allows you to go to a URL and explore the sit...
should say:
Andy Lester's WWW::Mechanize [Hack #21] allows you to go to a URL and explore the sit...

Anonymous   
Printed
Page 169
google search code block

The entire doGoogleSearch call appears to be missing.

Anonymous   
Printed
Page 273
3rd paragraph

Kurt Hindenburg's tvlisting no longer works, and according to his website he has ended development of it.

Anonymous   
Printed
Page 277
hack 74 introduction

...idea what you're visitor's weather is like.
should be
...idea what your visitor's weather is like.

Anonymous   
Printed
Page 326
function getBlock (lower half of page)

The variable $pElement is referenced in the getBlock function but is never declared
or assigned there. This appears to cause the function to fail under some
conditions.

if( $_start > strlen($pElement) && $_stop > $start ) }
should read:
if( $_start < strlen($pSource) && $_stop > $start ) }

Anonymous   
Printed
Page 364
6 Hack #94 Using XML::RSS to Repurpose Data

http://www.newsmonster.com
should be
http://www.newsmonster.org

Anonymous