Learning Scrapy by Dimitrios Kouzis-Loukas

Extending beyond middlewares

This section is here for the curious reader more than the practitioner. You don't need to know any of this to write basic or intermediate Scrapy extensions.

If you look at scrapy/settings/default_settings.py, you will see quite a few class names among the default settings. Scrapy makes extensive use of a dependency-injection-like mechanism that lets us customize and extend many of its internal objects. For example, one may want to support more URL protocols than file, HTTP, HTTPS, S3, and FTP, which are the ones defined in the DOWNLOAD_HANDLERS_BASE setting. To do so, one only has to create a Download Handler class and add a mapping for it in the DOWNLOAD_HANDLERS setting. The most difficult part is to discover ...
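The idea of settings that map a name to a class path can be illustrated with a small, self-contained sketch. This is not Scrapy's own code: `load_object` and `build_handlers` are hypothetical helpers written for this example, the two dictionaries only mimic the shape of DOWNLOAD_HANDLERS_BASE and DOWNLOAD_HANDLERS, and standard-library `urllib.request` classes stand in for real download handlers. Treating a `None` value as "disable this scheme" is also an assumption of the sketch.

```python
from importlib import import_module


def load_object(path):
    """Import and return the object named by a dotted path like 'pkg.mod.Class'."""
    module_path, _, name = path.rpartition(".")
    return getattr(import_module(module_path), name)


# Stand-in for a *_BASE setting: defaults shipped with the framework.
# (Illustrative values only; these are stdlib classes, not Scrapy handlers.)
HANDLERS_BASE = {
    "file": "urllib.request.FileHandler",
    "http": "urllib.request.HTTPHandler",
    "ftp": "urllib.request.FTPHandler",
}

# Stand-in for the user-facing setting: adds support for one more scheme.
HANDLERS = {
    "data": "urllib.request.DataHandler",
}


def build_handlers(base, overrides):
    """Merge defaults with user overrides and resolve each path to a class.

    Assumption of this sketch: mapping a scheme to None disables it.
    """
    merged = {**base, **overrides}
    return {
        scheme: load_object(path)
        for scheme, path in merged.items()
        if path is not None
    }


handlers = build_handlers(HANDLERS_BASE, HANDLERS)
for scheme in sorted(handlers):
    print(scheme, "->", handlers[scheme].__name__)
```

The point of the design is that the calling code never names a concrete class. It only looks up whatever the settings point at, so adding a protocol means adding one class and one dictionary entry rather than modifying the framework.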
