Spidering Hacks
by Morbus Iff, Tara Calishain
October 2003
Beginner to intermediate
428 pages
English
O'Reilly Media, Inc.

Chapter 2. Assembling a Toolbox

Hacks #8-32

The idea behind scraping sites often arises out of pure, immediate, and frantic desire: it’s late at night, you’ve forgotten your son’s soccer game for the twelfth time in a row, and you’re vowing never to let it happen again. Sure, you could place a bookmark to the school calendar in your browser toolbar, but you want something even more insidious, something you couldn’t possibly forget or grow accustomed to seeing.

A bit later, you’ve got a Perl script that automatically emails you every hour of every day that a game is scheduled. You’ve just made your life less forgetful, your computer more useful, and your son more loving. This is where spidering and scraping shine: when you’ve got an itch that can best be scratched by getting your computer involved. And if there’s one programming language that can quickly scratch an itch better than any other, it’s Perl.

Perl is renowned for “making easy things easy and hard things possible,” earning the reputation of “Swiss Army chainsaw,” “Internet duct tape,” or the ultimate “glue language.” Since it’s a scripting language (as opposed to a compiled one, like C), rapid development is its modus operandi; throw together bits and pieces from code here and there, try it out, tweak, hem, haw, and deploy. Along with its immense repository of existing code (see CPAN, the Comprehensive Perl Archive Network, at http://www.cpan.org) and the uncanny ability to “do what you mean,” it’s a perfect language ...

ISBN: 0596005776