Book description
The Internet is bigger and better than what a mere browser allows. Webbots, Spiders, and Screen Scrapers is for programmers and businesspeople who want to take full advantage of the vast resources available on the Web. There's no reason to let browsers limit your online experience, especially when you can easily automate online tasks to suit your individual needs.
Learn how to write webbots and spiders that do all this and more:
- Programmatically download entire websites
- Effectively parse data from web pages
- Manage cookies
- Decode encrypted files
- Automate form submissions
- Send and receive email
- Send SMS alerts to your cell phone
- Unlock password-protected websites
- Automatically bid in online auctions
- Exchange data with FTP and NNTP servers
Sample projects using standard code libraries reinforce these new skills. You'll learn how to create your own webbots and spiders that track online prices, aggregate different data sources into a single web page, and archive the online data you just can't live without. You'll learn inside information from an experienced webbot developer on how and when to write stealthy webbots that mimic human behavior, tips for developing fault-tolerant designs, and various methods for launching and scheduling webbots. You'll also get advice on how to write webbots and spiders that respect website owners' property rights, plus techniques for shielding websites from unwanted robots.
As a bonus, visit the author's website to test your webbots on sample target pages, and to download the scripts and code libraries used in the book.
Some tasks are just too tedious (or too important!) to leave to humans. Once you've automated your online life, you'll never let a browser limit the way you use the Internet again.
Table of contents
- Webbots, Spiders, and Screen Scrapers
- ACKNOWLEDGMENTS
- Introduction
- I. FUNDAMENTAL CONCEPTS AND TECHNIQUES
- 1. WHAT'S IN IT FOR YOU?
- 2. IDEAS FOR WEBBOT PROJECTS
- 3. DOWNLOADING WEB PAGES
- 4. PARSING TECHNIQUES
- 5. AUTOMATING FORM SUBMISSION
- 6. MANAGING LARGE AMOUNTS OF DATA
- II. PROJECTS
- 7. PRICE-MONITORING WEBBOTS
- 8. IMAGE-CAPTURING WEBBOTS
- 9. LINK-VERIFICATION WEBBOTS
- 10. ANONYMOUS BROWSING WEBBOTS
- 11. SEARCH-RANKING WEBBOTS
- 12. AGGREGATION WEBBOTS
- 13. FTP WEBBOTS
- 14. NNTP NEWS WEBBOTS
- 15. WEBBOTS THAT READ EMAIL
- 16. WEBBOTS THAT SEND EMAIL
- 17. CONVERTING A WEBSITE INTO A FUNCTION
- III. ADVANCED TECHNICAL CONSIDERATIONS
- 18. SPIDERS
- 19. PROCUREMENT WEBBOTS AND SNIPERS
- 20. WEBBOTS AND CRYPTOGRAPHY
- 21. AUTHENTICATION
- 22. ADVANCED COOKIE MANAGEMENT
- 23. SCHEDULING WEBBOTS AND SPIDERS
- IV. LARGER CONSIDERATIONS
- 24. DESIGNING STEALTHY WEBBOTS AND SPIDERS
- 25. WRITING FAULT-TOLERANT WEBBOTS
- 26. DESIGNING WEBBOT-FRIENDLY WEBSITES
- 27. KILLING SPIDERS
- 28. KEEPING WEBBOTS OUT OF TROUBLE
- A. PHP/CURL REFERENCE
- Creating a Minimal PHP/CURL Session
- Initiating PHP/CURL Sessions
- Setting PHP/CURL Options
- CURLOPT_URL
- CURLOPT_RETURNTRANSFER
- CURLOPT_REFERER
- CURLOPT_FOLLOWLOCATION and CURLOPT_MAXREDIRS
- CURLOPT_USERAGENT
- CURLOPT_NOBODY and CURLOPT_HEADER
- CURLOPT_TIMEOUT
- CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR
- CURLOPT_HTTPHEADER
- CURLOPT_SSL_VERIFYPEER
- CURLOPT_USERPWD and CURLOPT_UNRESTRICTED_AUTH
- CURLOPT_POST and CURLOPT_POSTFIELDS
- CURLOPT_VERBOSE
- CURLOPT_PORT
- Executing the PHP/CURL Command
- Closing PHP/CURL Sessions
- B. STATUS CODES
- C. SMS EMAIL ADDRESSES
- About the Author
- Colophon
Product information
- Title: Webbots, Spiders, and Screen Scrapers
- Author(s):
- Release date: March 2007
- Publisher(s): No Starch Press
- ISBN: 9781593271206

