12.4. Parsing XML with SAX
Problem
You want to parse an XML document and format it on an event basis, such as when the parser encounters a new opening or closing element tag. For instance, you want to turn an RSS feed into HTML.
Solution
Use the parsing functions in PHP’s XML extension:
$xml = xml_parser_create();
$obj = new Parser_Object; // a class to assist with parsing
xml_set_object($xml,$obj);
xml_set_element_handler($xml, 'start_element', 'end_element');
xml_set_character_data_handler($xml, 'character_data');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);

$fp = fopen('data.xml', 'r') or die("Can't read XML data.");
while ($data = fread($fp, 4096)) {
    xml_parse($xml, $data, feof($fp)) or die("Can't parse XML data");
}
fclose($fp);

xml_parser_free($xml);
Discussion
These XML parsing functions require the expat library. However, because Apache 1.3.7 and later is bundled with expat, this library is already installed on most machines. Therefore, PHP enables these functions by default, and you don’t need to explicitly configure PHP to support XML.
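If you want to confirm that the functions really are present in a given PHP build, a quick runtime check is possible. The following is a minimal sketch; the helper name `xml_support_available` is hypothetical and not part of the XML extension, while `extension_loaded()` and `function_exists()` are standard PHP functions.

```php
<?php
// Hypothetical helper (not part of the XML extension): reports whether
// the XML parser functions are usable in this PHP build.
function xml_support_available()
{
    return extension_loaded('xml') && function_exists('xml_parser_create');
}

echo xml_support_available()
    ? "XML parser functions are available.\n"
    : "The XML extension is not enabled in this PHP build.\n";
```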
expat parses XML documents and allows you to configure the parser to call functions when it encounters different parts of the file, such as an opening or closing element tag or character data (the text between tags). Based on the tag name, you can then choose whether to format or ignore the data. This is known as event-based parsing and contrasts with DOM XML, which uses a tree-based parser.
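To make the event flow concrete, here is a minimal sketch of what a handler class such as the Solution's Parser_Object might look like. The class body, the property names, and the inline sample feed are all assumptions made for illustration; only the class name and the handler method names come from the Solution. The sketch binds the handlers with callable arrays instead of xml_set_object(), which is one possible way to attach object methods, and it keeps only the title and link of each item, ignoring everything else.

```php
<?php
// Illustrative handler class: the body is an assumption, not the book's code.
// It collects the <title> and <link> of each RSS <item> and renders an HTML list entry.
class Parser_Object
{
    public $html = '';        // accumulated HTML output
    private $tag = '';        // name of the element currently open
    private $inItem = false;  // true while inside an <item>
    private $item = [];       // text collected for the current item

    public function start_element($parser, $name, $attrs)
    {
        $this->tag = $name;
        if ($name === 'item') {
            $this->inItem = true;
            $this->item = ['title' => '', 'link' => ''];
        }
    }

    public function end_element($parser, $name)
    {
        if ($name === 'item') {
            // Format the data gathered for this item; escape it for HTML.
            $this->html .= sprintf(
                "<li><a href=\"%s\">%s</a></li>\n",
                htmlspecialchars(trim($this->item['link'])),
                htmlspecialchars(trim($this->item['title']))
            );
            $this->inItem = false;
        }
        $this->tag = '';
    }

    public function character_data($parser, $data)
    {
        // Text may arrive in several chunks, so append rather than assign.
        // Tags other than title and link are ignored.
        if ($this->inItem && isset($this->item[$this->tag])) {
            $this->item[$this->tag] .= $data;
        }
    }
}

// Made-up sample feed, used only to exercise the handlers.
$rss = '<rss><channel><title>Feed</title>'
     . '<item><title>PHP &amp; XML</title><link>http://example.com/a</link></item>'
     . '</channel></rss>';

$obj = new Parser_Object;
$xml = xml_parser_create();
// Callable arrays bind each handler to a method of $obj.
xml_set_element_handler($xml, [$obj, 'start_element'], [$obj, 'end_element']);
xml_set_character_data_handler($xml, [$obj, 'character_data']);
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);
xml_parse($xml, $rss, true) or die("Can't parse XML data");

echo "<ul>\n" . $obj->html . "</ul>\n";
```

The channel-level title is skipped because it appears outside any item, which shows the "format or ignore" decision being made per tag. For the sample feed, `$obj->html` ends up holding a single list entry that links to http://example.com/a, with the title escaped as `PHP &amp; XML`.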
A popular API for event-based ...