May 2017
Beginner to intermediate
220 pages
The following table summarizes the advantages and disadvantages of each approach to scraping:
| Scraping approach | Performance | Ease of use | Ease of installation |
| --- | --- | --- | --- |
| Regular expressions | Fast | Hard | Easy (built-in module) |
| Beautiful Soup | Slow | Easy | Easy (pure Python) |
| lxml | Fast | Easy | Moderately difficult |
If speed is not a concern and you prefer to install libraries only via pip, a slower approach such as Beautiful Soup is perfectly adequate. If you only need to scrape a small amount of data and want to avoid additional dependencies, regular expressions might be an appropriate choice. In general, however, lxml is the best choice for scraping, because it is fast and robust, while regular expressions and Beautiful ...
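To make the comparison concrete, here is a minimal sketch that extracts the same value with each of the three approaches. The HTML snippet, the `area` class name, and the `scrape_*` function names are assumptions made up for illustration and do not come from any particular site. Beautiful Soup and lxml are third-party packages, so the sketch skips whichever one is not installed.

```python
import re

# Hypothetical HTML fragment used only for this illustration.
HTML = (
    '<table><tr><td class="label">Area</td>'
    '<td class="area">244,820 square kilometres</td></tr></table>'
)


def scrape_regex(html):
    """Extract the area cell with a regular expression (standard library only)."""
    match = re.search(r'<td class="area">(.*?)</td>', html)
    return match.group(1) if match else None


def scrape_bs4(html):
    """Extract the area cell with Beautiful Soup (requires the bs4 package)."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find("td", attrs={"class": "area"})
    return cell.text if cell else None


def scrape_lxml(html):
    """Extract the area cell with lxml and an XPath query (requires lxml)."""
    from lxml.html import fromstring

    tree = fromstring(html)
    cells = tree.xpath('//td[@class="area"]')
    return cells[0].text_content() if cells else None


if __name__ == "__main__":
    scrapers = [
        ("Regular expressions", scrape_regex),
        ("Beautiful Soup", scrape_bs4),
        ("lxml", scrape_lxml),
    ]
    for name, scraper in scrapers:
        try:
            print(f"{name}: {scraper(HTML)}")
        except ImportError:
            # The optional third-party library is not installed.
            print(f"{name}: library not installed, skipped")
```

Note how the regular expression is tied to the exact attribute order and quoting of the tag, so a small change in the markup would break it, whereas the two parser-based versions query the document structure instead. Wrapping each function in `timeit` is one way to check the performance column against your own pages.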