May 2017
Beginner to intermediate
220 pages
5h 2m
English
To run a spider from the command line, the crawl command is used along with the name of the spider:
$ scrapy crawl country -s LOG_LEVEL=ERROR
$
The script runs to completion with no output. Take note of the -s LOG_LEVEL=ERROR flag: -s sets a Scrapy setting from the command line, so this is equivalent to defining LOG_LEVEL = 'ERROR' in the settings.py file. By default, Scrapy will output all log messages to the terminal, so here the log level was raised to show only error messages. No output therefore means our spider completed without error, which is great!
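For comparison, the persistent form of the same setting might look like the fragment below. This is a minimal sketch that assumes the project's settings.py is the module Scrapy loads for this spider; the comments are illustrative and not taken from the book.

```python
# settings.py (assumed to be the project's Scrapy settings module)

# Only log messages at ERROR level or above are written to the terminal.
# Passing -s LOG_LEVEL=ERROR on the command line has the same effect
# for a single run.
LOG_LEVEL = 'ERROR'
```

Setting the level in settings.py applies to every run of the project, whereas the -s flag applies only to the command it is passed to, which is handy for a quick check like the one above.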
In order to actually scrape some content from the pages, we need to add a few lines to the spider file. To ensure we can start building and extracting our items, we have to first start using our CountryItem ...