Chapter 13. Web Automation
Most of the time, PHP is part of a web server, sending content to browsers. Even when you run it from the command line, it usually performs a task and then prints some output. PHP is also useful, however, in the role of a web client that retrieves URLs and then operates on their content. Whereas Chapter 14 discusses retrieving URLs from within PHP, this chapter explores how to process the received content.
The recipes in this chapter help you manipulate those page contents. Marking Up a Web Page demonstrates how to mark up certain words in a page with blocks of color, a technique that is useful for highlighting search terms, for example. Cleaning up HTML so that it’s easier to parse and is standards compliant is the topic of Cleaning Up Broken or Nonstandard HTML. Extracting Links from an HTML File provides a function to find all the links in a page, an essential building block for a web spider or a link checker. Converting between plain text and HTML is covered in two further recipes. Removing HTML and PHP Tags shows how to remove all HTML and PHP tags from a web page.
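As a rough preview of the kind of processing these recipes cover, the following minimal sketch pulls the links out of a page and then strips its tags. It is not the code from the recipes themselves: the extract_links() helper and the sample $page string are hypothetical names introduced only for illustration, and the sketch assumes PHP's DOM extension is available.

```php
<?php
// Hypothetical helper (not the recipe's own function):
// collect the href value of every <a> element in an HTML string.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often sloppy; collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Sample input, assumed for illustration only.
$page = '<p>See <a href="http://www.example.com/">Example</a> and '
      . '<a href="/about.html">About</a>.</p>';

$links = extract_links($page);  // the two href values, in document order
$text  = strip_tags($page);     // the page text with all tags removed
```

Using a DOM parser rather than a regular expression is one possible design choice here; it tends to cope better with attribute quoting and nesting, at the cost of requiring the DOM extension.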
Two sample programs ...