Book description
Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once.
Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.
Table of contents
- Preface
- I. Building Scrapers
  - 1. Your First Web Scraper
  - 2. Advanced HTML Parsing
  - 3. Starting to Crawl
  - 4. Using APIs
  - 5. Storing Data
  - 6. Reading Documents
- II. Advanced Scraping
  - 7. Cleaning Your Dirty Data
  - 8. Reading and Writing Natural Languages
  - 9. Crawling Through Forms and Logins
  - 10. Scraping JavaScript
  - 11. Image Processing and Text Recognition
  - 12. Avoiding Scraping Traps
  - 13. Testing Your Website with Scrapers
  - 14. Scraping Remotely
- A. Python at a Glance
- B. The Internet at a Glance
- C. The Legalities and Ethics of Web Scraping
- Index
Product information
- Title: Web Scraping with Python
- Author(s):
- Release date: July 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491910290