Writing your first crawler
Let's start with a very basic crawler that will crawl the entire content of a web page. To write the crawlers, we will use Scrapy, one of the best crawling solutions available for Python. We will explore the different features of Scrapy in this chapter. First, we need to install Scrapy for this exercise.
To do this, type in the following command:
$ pip install scrapy
This is the easiest way of installing Scrapy using a package manager. Let's now test whether we got everything right or not. (Ideally, Scrapy should now be part of
>>> import scrapy
If the import raises an error, take a look at the installation guide at http://doc.scrapy.org/en/latest/intro/install.html.
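The same check can also be run as a small script instead of typing it at the interactive prompt. The following is only a minimal sketch: the helper name is_importable is our own invention for illustration and is not part of Scrapy. It relies on the standard library's importlib.util.find_spec, which reports whether a top-level module can be found without actually importing it.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found.

    Hypothetical helper for illustration; it only locates the module,
    it does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("scrapy"):
        import scrapy

        # Scrapy exposes its version string as scrapy.__version__
        print("Scrapy is installed, version", scrapy.__version__)
    else:
        print("Scrapy was not found; try running 'pip install scrapy' again.")
```

If the script reports that Scrapy was not found even though pip finished without errors, a likely cause is that pip installed the package for a different Python interpreter than the one running the script.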
At this point, Scrapy is installed and working. Let's start ...