May 2017
Beginner to intermediate
220 pages
5h 2m
English
Now we can build the actual crawling and scraping code, known in Scrapy as a spider. An initial template can be generated with the genspider command, which takes the name to give the spider, the domain to crawl, and an optional template:
$ scrapy genspider country example.webscraping.com --template=crawl
We used the built-in crawl template, which is based on Scrapy's CrawlSpider class. Compared with a simple scraping spider, a CrawlSpider has extra attributes and methods that are useful when crawling the web, such as rules that control which links are followed.
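To make the idea of crawl rules more concrete, here is a minimal sketch that uses only the Python standard library. It is not Scrapy's implementation and does not use Scrapy's API; it only illustrates the general idea of matching the links on a page against URL patterns and deciding what to do with each match. The sample HTML, the /index/ and /view/ patterns, and the names LinkCollector, extract_links, and apply_rules are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url, allow):
    """Return absolute links found in html whose URL matches the allow regex."""
    collector = LinkCollector()
    collector.feed(html)
    links = [urljoin(base_url, href) for href in collector.hrefs]
    return [link for link in links if re.search(allow, link)]


def apply_rules(html, base_url, rules):
    """Group the matching links on a page by the action of each rule."""
    return {
        rule["action"]: extract_links(html, base_url, rule["allow"])
        for rule in rules
    }


# Hypothetical page content and URL patterns, for illustration only.
SAMPLE_HTML = """
<a href="/index/1">Next</a>
<a href="/view/Example-1">Example</a>
<a href="/user/login">Log in</a>
"""
BASE_URL = "http://example.webscraping.com/"

# Each rule pairs a URL pattern with what to do with matching links.
RULES = [
    {"allow": r"/index/", "action": "follow"},
    {"allow": r"/view/", "action": "parse"},
]

if __name__ == "__main__":
    for action, links in apply_rules(SAMPLE_HTML, BASE_URL, RULES).items():
        print(action, links)
```

In this sketch the login link matches no rule, so it is ignored. A real CrawlSpider expresses the same kind of decision declaratively through its rules attribute and link extractors, and Scrapy handles the downloading and scheduling.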
After running the genspider command, the following code is generated in example/spiders/country.py:
# -*- coding: utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import ...