2.7. Downloading All Files from a Site
Problem
You need to create a backup, mirror, or offline copy of your web site.
Solution
Use the Unix utility wget to mirror the files on the server to another location, either by HTTP with this command:
wget --mirror http://yourwebsite.com
or by FTP:
wget --mirror ftp://username:password@yourwebsite.com
Alternatively, you can use GUI-based utilities on your PC. Some choices are listed in the "See Also" section of this Recipe.
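If the goal is an offline copy you can browse locally, a few extra wget options are often combined with --mirror. The following is a minimal sketch, not part of the original recipe: the function name mirror_site and the destination directory are hypothetical, and you should adjust the options to suit your own site.

```shell
#!/bin/sh
# Sketch: mirror a site into a local directory for offline browsing.
# mirror_site and its two arguments are hypothetical names for this example.
mirror_site() {
    url=$1     # site to copy, e.g. http://yourwebsite.com
    dest=$2    # local directory that will hold the copy

    # --mirror            recursive download with timestamping
    # --convert-links     rewrite links so the local copy works offline
    # --page-requisites   also fetch the images and stylesheets pages need
    # --no-parent         never climb above the starting directory
    # --directory-prefix  save everything under $dest
    wget --mirror --convert-links --page-requisites --no-parent \
         --directory-prefix="$dest" "$url"
}

# Usage: mirror_site http://yourwebsite.com ./backup
```

Wrapping the command in a function keeps the long option list in one place, so the same call can be reused for several sites or dropped into a scheduled job.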
Discussion
With wget, you can perform heroic feats of webmastering, whether it's copying a single file from one site to another or an entire site to another server.
Warning
When spidering a site over HTTP, wget will only copy files it finds links to. Unused images and old web pages still lingering on the server will be skipped. Using FTP, wget will copy everything.
Some scenarios where wget can be indispensable include:
- Keeping frequently updated pages or images in sync on two sites
Say you want to display a real-time webcam image on your site, but don't want to (or can't) use an absolute URL to the site where the camera saves the image in the image tag's src attribute. (Perhaps the other site's server is slower or less reliable than yours, or outside linking to the image has been disabled, as described in Recipe 5.5.) With wget, you can specify the URL of the file, a local directory on your server where it should be copied, and the number of times to retry a flaky HTTP connection. Combined with cron (see Recipe 1.8), wget can perform ...
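The webcam scenario might be sketched as follows. This is one possible approach under stated assumptions, not the recipe's own listing: the function name fetch_image, the camera URL, and the local directory are all hypothetical, and --timestamping is added here on the assumption that you want the local file replaced only when the remote copy is newer.

```shell
#!/bin/sh
# Sketch: copy one remote file into a local directory, retrying on failure.
# fetch_image and its arguments are hypothetical names for this example.
fetch_image() {
    src_url=$1       # URL of the file to copy
    dest_dir=$2      # local directory where the file should be saved
    tries=${3:-3}    # number of attempts on a flaky connection (default 3)

    # --quiet             suppress progress output (useful under cron)
    # --tries             how many times to attempt the download
    # --timestamping      download only if the remote file is newer
    # --directory-prefix  save the file under $dest_dir
    wget --quiet --tries="$tries" --timestamping \
         --directory-prefix="$dest_dir" "$src_url"
}

# Usage:
#   fetch_image http://camera.example.com/webcam.jpg /var/www/images 5
#
# An equivalent crontab entry that runs every five minutes, assuming
# your cron supports step values such as */5:
#   */5 * * * * wget --quiet --tries=5 --timestamping --directory-prefix=/var/www/images http://camera.example.com/webcam.jpg
```

Your page can then point its image tag at the local copy rather than at the camera's server.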