To understand how Requests/Items flow through the pipes, we can't really measure the flows themselves (although this would be a cool feature). Instead, we can easily measure how much liquid, that is, Requests/Responses/Items, exists in each of Scrapy's processing stages.
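To make the idea concrete, here is a minimal toy sketch in Python. It is not Scrapy code: it only counts how many objects currently sit in each stage instead of timing their flow. The class name StageLevels and the stage names are hypothetical, chosen purely for illustration.

```python
from collections import Counter


class StageLevels:
    """Toy gauge (hypothetical): how many objects sit in each stage right now."""

    def __init__(self):
        self.levels = Counter()

    def enter(self, stage):
        # An object (Request, Response or Item) arrives at a stage.
        self.levels[stage] += 1

    def move(self, src, dst):
        # An object leaves one stage and enters the next.
        if self.levels[src] <= 0:
            raise ValueError(f"nothing to move out of {src!r}")
        self.levels[src] -= 1
        self.levels[dst] += 1

    def snapshot(self):
        # Only the levels are reported, never the rate of flow.
        return dict(self.levels)


levels = StageLevels()
for _ in range(3):
    levels.enter("scheduler")
levels.move("scheduler", "downloader")
print(levels.snapshot())  # {'scheduler': 2, 'downloader': 1}
```

A stage whose level keeps growing while the others stay low is a likely bottleneck, which is the kind of reading the per-stage counts are meant to support.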
Scrapy runs a telnet service through which we can get performance information. We can connect to it by using the telnet command on port 6023. We then get a Python prompt inside Scrapy. Be careful: if you do something blocking there, such as time.sleep(), it will halt the crawler's functionality. Several interesting metrics get printed by the built-in est() function. Some of them are either very specialized or can be deduced from a ...
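If you copy the text that est() prints and want to compare metrics across snapshots, a small helper like the following may be useful. It is only a sketch and assumes that each metric appears on its own line as a label and a value separated by " : ". The helper name parse_engine_status is hypothetical, and the labels and values in the sample are made up for illustration; the exact labels may differ between Scrapy versions.

```python
def parse_engine_status(text):
    """Turn 'label : value' lines into a dict of strings (hypothetical helper)."""
    metrics = {}
    for line in text.splitlines():
        # Lines without the " : " separator (titles, blanks) are skipped.
        label, sep, value = line.partition(" : ")
        if sep:
            metrics[label.strip()] = value.strip()
    return metrics


# Made-up sample, not captured from a real crawl.
sample = """Execution engine status

len(engine.downloader.active)                   : 16
len(engine.slot.scheduler.mqs)                  : 4475
engine.scraper.slot.active_size                 : 114604
"""

status = parse_engine_status(sample)
print(status["len(engine.downloader.active)"])  # 16
```

The values are kept as strings because some metrics are booleans or names rather than numbers; convert only the ones you need.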