
Learning Scrapy by Dimitrios Kouzis-Loukas


Getting component utilization using telnet

To understand how Requests and Items flow through the pipes, we would ideally measure the flows themselves, but we can't (although this would be a cool feature). Instead, we can easily measure how much liquid, that is, how many Requests/Responses/Items, exists in each of Scrapy's processing stages.

Scrapy runs a telnet service through which we can get performance information. We can connect to it by using the telnet command on port 6023. We then get a Python prompt inside Scrapy. Be careful: if you do something blocking there, such as time.sleep(), it will halt the crawler's functionality. Several interesting metrics get printed by the built-in est() function. Some of them are either very specialized or can be deduced from a ...
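As a rough illustration of reading these per-stage counts, the sketch below parses a report of the kind est() prints into a dictionary. It assumes the report consists of `expression : value` lines; the helper name `parse_engine_status` and the sample values are hypothetical and not taken from the book, so treat the output of a real crawl as the authority on the exact format.

```python
def parse_engine_status(report: str) -> dict:
    """Turn 'expression : value' lines into a dict of strings (hypothetical helper)."""
    metrics = {}
    for line in report.splitlines():
        # Split on the last " : " so any colon inside the expression is kept.
        name, sep, value = line.rpartition(" : ")
        if not sep:
            continue  # skip the title line and blank lines
        metrics[name.strip()] = value.strip()
    return metrics


# Assumed sample text; the numbers are invented purely for illustration.
SAMPLE_REPORT = """\
Execution engine status

len(engine.downloader.active)                   : 16
len(engine.slot.scheduler.mqs)                  : 92
len(engine.scraper.slot.active)                 : 0
engine.scraper.slot.itemproc_size               : 0
"""

status = parse_engine_status(SAMPLE_REPORT)
for name, value in status.items():
    print(f"{name} = {value}")
```

Keeping the values as strings avoids guessing at types, since a report of this kind can mix counts, booleans, and names. Each entry describes how much "liquid" currently sits in one stage, which is the quantity the text above says we can measure.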
