Spidering Hacks

Errata for Spidering Hacks

The errata list contains errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Color Key: Serious Technical Mistake, Minor Technical Mistake, Language or formatting error, Typo, Question, Note, Update



Version | Location | Description | Submitted By | Date Submitted | Date Corrected
Printed
Page 21
last paragraph

create something a little more advanced then "Hello, World" should read: create something a little more advanced than "Hello, World"

Anonymous   
Printed
Page 27
3rd

install libwww-perl returns with from ppm (version 3.0.1). perl -v gives: "This is perl, v5.8.0 built for MSWin32-x86-multi-thread (with 1 registered patch, see perl -V for more detail) Copyright 1987-2002, Larry Wall Binary build 802 provided by ActiveState Corp. http://www.ActiveState.com Built 00:54:02 Nov 8 2002" Solution: go to http://www.cpan.org/modules/INSTALL.html and follow the instructions to install manually. AUTHOR: This is technically a mistake, but only because the user is using ppm3 and not ppm (which is version 2, as indicated in the output of that hack).

Anonymous   
Printed
Page 28
last command example

%perl -MLWP::Simple -e 'print join " ", head "http://cpan.org/RECENT"'; should be: %perl -MLWP::Simple -e "print join ' ', head 'http://cpan.org/RECENT'";

Anonymous   
Printed
Page 32
2nd last paragraph, first word

h1=en should read: hl=en i.e. the letter "l", not the number "1". Ditto in the last paragraph, where h1 is also used.

Anonymous   
Printed
Page 45
Bold code at bottom of page

if (my $encoding = $response->content_encoding) ) { should be: if (my $encoding = $response->content_encoding) { There is an extra ")".

Anonymous   
Printed
Page 50
foreach loop in Progress Bar script

The value of $final_data is not re-initialized upon each iteration of the foreach loop over the @ARGV array. Consequently, the value of length($final_data) is inflated for the second and subsequent URLs specified on the command line, and the progress reported is incorrect. AUTHOR: This is correct. The script and hack were tested with only one file at a time, even though the script supports more than one URL on the command line.

Anonymous   
Printed
Page 53
2nd code fragment

The search string in the second code fragment contains an extraneous character. It currently reads: my @links = $p->look_down( _tag => 'a', href => qr{^ Ohttp://www.oreilly.com/catalog/E w+ $}x ); and should read: my @links = $p->look_down( _tag => 'a', href => qr{^ http://www.oreilly.com/catalog/E w+ $}x );

Anonymous   
Printed
Page 53
Hack #19 Scraping with HTML::TreeBuilder

The last paragraph on this page refers to "O'Reilly's subscription-based Safari online Library" at http://safari.online.com; it should be: http://safari.oreilly.com

Anonymous   
Printed
Page 63
2nd paragraph

Andy Lester's WWW::Mechanize [Hack #22] allows you to go to a URL and explore the sit... should say: Andy Lester's WWW::Mechanize [Hack #21] allows you to go to a URL and explore the sit...

Anonymous   
Printed
Page 169
google search code block

The entire doGoogleSearch call appears to be missing.

Anonymous   
Printed
Page 273
3rd paragraph

Kurt Hindenburg's tvlisting no longer works; according to his website, he has closed development of it.

Anonymous   
Printed
Page 277
Hack #74 introduction

...idea what you're visitor's weather is like. should be ...idea what your visitor's weather is like.

Anonymous   
Printed
Page 326
function getBlock (lower half of page)

The variable $pElement is referenced in the getBlock function but is never declared or assigned there, which appears to cause the function to fail under some conditions. if( $_start > strlen($pElement) && $_stop > $start ) } should read: if( $_start < strlen($pSource) && $_stop > $start ) }

Anonymous   
Printed
Page 364
6 Hack #94 Using XML::RSS to Repurpose Data

http://www.newsmonster.com should be http://www.newsmonster.org

Anonymous
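
The Page 50 erratum (Progress Bar script) describes an accumulator that is not reset between URLs. The following is a minimal sketch of the fix, not the book's script: it assumes the accumulator can simply be declared inside the foreach loop. The helper name bytes_seen_per_url, the example.com URLs, and the sample chunks are hypothetical stand-ins for the data a download callback would receive; no network access takes place.

```perl
use strict;
use warnings;

# Hypothetical helper: feeds each URL's chunks through an accumulator and
# returns a hash reference of bytes counted per URL.
sub bytes_seen_per_url {
    my ($chunks_for, @urls) = @_;
    my %bytes_seen;

    foreach my $url (@urls) {
        # Declared inside the loop, so it starts empty for every URL.
        # Declaring it once, outside the loop, reproduces the reported bug:
        # the length keeps growing across URLs.
        my $final_data = '';

        foreach my $chunk (@{ $chunks_for->{$url} }) {
            $final_data .= $chunk;
        }
        $bytes_seen{$url} = length $final_data;
    }
    return \%bytes_seen;
}

# Made-up sample data: two URLs with 6 and 3 bytes of content.
my %chunks_for = (
    'http://example.com/a' => [ 'aaaa', 'bb' ],
    'http://example.com/b' => [ 'ccc' ],
);
my @urls = ('http://example.com/a', 'http://example.com/b');

my $bytes_seen = bytes_seen_per_url(\%chunks_for, @urls);
printf "%s: %d bytes\n", $_, $bytes_seen->{$_} for @urls;
```

With the declaration moved outside the loop, the second URL would report 9 bytes instead of 3, which is the inflated progress the erratum describes.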