Spidering Hacks

by Morbus Iff, Tara Calishain
October 2003
Beginner to intermediate
428 pages
English
O'Reilly Media, Inc.

Chapter 5. Maintaining Your Collections

Hacks #90-93

It’s rare that one script will solve all your data-grubbing needs. Each week, you might want to learn about new movies listed on Amazon.com, grab a summary page from IMDB, find the last five movies that each actor or actress starred in, and then image-search and download pictures of them. Or you might be graphing important information and need to grab the data automatically every hour, day, or week. And what if you’re downloading or mirroring data with wget [Hack #26]?

We have these great tools to automate our information needs, but how do we then automate the running of said tools? Where is our meta-automation?

Hack #90. Using cron to Automate Tasks

Run scripts on a repetitive basis with the cron utility.

There will come a time when you’ve created a script so perfect for your day-to-day life that it becomes absolutely imperative to run on a regular basis. Sure, you could run it manually during your morning routine, but if you can automate the retrieval of data with scraping, why not automate the execution too?

Meet cron, a Unix utility whose life revolves around running things every minute, hour, day, week, month, or year. Give it a command or script and a schedule and let it go. Each user on the system can automate their own tasks with no restrictions: hear the date spoken every minute, have a backup performed every three days at 12:15, or automatically open their email every day at 7:00 A.M. and then again ...
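To make the idea of "a command and a schedule" concrete, here is a minimal Python sketch that holds three crontab-style lines matching the examples above and reports which would be due at a given moment. The script paths are hypothetical placeholders, and the matcher is a deliberate simplification rather than cron's own logic: it handles only `*`, `*/n`, single numbers, ranges, and comma lists, and it requires every field to match, whereas real cron treats the day-of-month and day-of-week fields specially when both are restricted. Note also that `*/3` in the day-of-month field restarts at the first of each month, so it only approximates "every three days".

```python
from datetime import datetime

# Hypothetical crontab lines: minute hour day-of-month month day-of-week command.
# The script paths are placeholders for illustration, not from the book.
CRONTAB = """\
* * * * * /usr/local/bin/say-date.sh
15 12 */3 * * /usr/local/bin/backup.sh
0 7 * * * /usr/local/bin/open-mail.sh
"""


def field_matches(field, value, minimum):
    """Check one schedule field against a value.

    Supports only '*', '*/n', 'n', 'a-b', and comma-separated lists of those.
    'minimum' is the lowest legal value of the field (0 or 1), used for steps.
    """
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = (int(x) for x in part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def is_due(schedule, when):
    """Return True if the five-field schedule matches the datetime 'when'.

    Simplification: all five fields must match (real cron differs when both
    day-of-month and day-of-week are restricted).
    """
    minute, hour, dom, month, dow = schedule.split()
    cron_dow = when.isoweekday() % 7  # 0 = Sunday, 6 = Saturday
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(dom, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(dow, cron_dow, 0)
    )


def due_commands(crontab, when):
    """List the commands in 'crontab' whose schedule matches 'when'."""
    due = []
    for line in crontab.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 5)  # five schedule fields, then the command
        if is_due(" ".join(parts[:5]), when):
            due.append(parts[5])
    return due


if __name__ == "__main__":
    for command in due_commands(CRONTAB, datetime.now()):
        print(command)
```

Run as a script, it prints whichever of the placeholder commands would fire in the current minute. In actual use, lines like those in `CRONTAB` would go into a user's crontab (typically edited with `crontab -e`), and cron itself would do the matching.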

Publisher Resources

ISBN: 0596005776