O'Reilly logo

Learning Scrapy by Dimitrios Kouzis-Loukas

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

What Scrapy is not

Finally, it's easy to misunderstand what Scrapy can do for you mainly because the terms Data Scraping and all the related terminology is somewhat fuzzy, and many terms are used interchangeably. I will try to clarify some of these areas to prevent confusion and save you some time.

Scrapy is not Apache Nutch, that is, it's not a generic web crawler. If Scrapy visits a website it knows nothing about, it won't be able to make anything meaningful out of it. Scrapy is about extracting structured information, and requires manual effort to set up the appropriate XPath or CSS expressions. Apache Nutch will take a generic page and extract information, such as keywords, from it. It might be more suitable for some applications and less for ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required