Chapter 14. Scraping Remotely

That this chapter is the last in the book is somewhat appropriate. Up until now we have been running all the Python applications from the command line, within the confines of our home computers. Sure, you might have installed MySQL in an attempt to replicate the environment of a real-life server. But it’s just not the same. As the saying goes: “If you love something, set it free.”

In this chapter, I’ll cover several methods for running scripts from different machines, or even just different IP addresses on your own machine. Although you might be tempted to put this step off as something you don’t need right now, you might be surprised at how easy it is to get started with the tools you already have (such as a personal website on a paid hosting account), and how much easier your life becomes once you stop trying to run Python scrapers from your laptop.

Why Use Remote Servers?

Although using a remote server might seem like an obvious step when launching a web app intended for use by a wide audience, more often than not the tools we build for our own purposes are just left running locally. People who decide to push onto a remote platform usually base their decision on two primary motivations: the need for greater power and flexibility, and the need to use an alternative IP address.

Avoiding IP Address Blocking

When building web scrapers, the rule of thumb is: everything can be faked. You can send emails from addresses you don’t own, automate ...

Get Web Scraping with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.