Parsing XML
Say you have a collection of books written in XML, and you want to build an index showing each document’s title and author. You need to parse the XML files to recognize the title and author elements and their contents. You could do this by hand with regular expressions and string functions such as strtok(), but it’s a lot more complex than it seems. The easiest and quickest solution is to use the XML parser that ships with PHP.
PHP’s XML parser is based on the Expat C library, which lets you parse but not validate XML documents. This means you can find out which XML tags are present and what they surround, but you can’t find out if they’re the right XML tags in the right structure for this type of document. In practice, this isn’t generally a big problem.
PHP’s XML parser is event-based, meaning that as the parser reads the document, it calls various handler functions you provide as certain events occur, such as the beginning or end of an element.
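To make the event-driven model concrete, here is a minimal sketch of the title-and-author index idea from the start of this section. The extractFields() helper, the sample document, and its element names are assumptions made for illustration; only the xml_* functions are part of PHP's XML extension. Case folding is switched off so that element names arrive exactly as written in the document.

```php
<?php
// Hypothetical sample document; the element names are assumed for illustration.
$xml = <<<XML
<book>
  <title>Example Title</title>
  <author>Example Author</author>
</book>
XML;

// Hypothetical helper: collect the text content of the named elements.
function extractFields(string $xml, array $wanted): array
{
    $fields  = [];
    $current = null; // name of the wanted element we are inside, if any

    $parser = xml_parser_create();
    // Keep element names as written instead of uppercasing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Called at each start tag.
        function ($parser, $name, $attrs) use (&$current, &$fields, $wanted) {
            if (in_array($name, $wanted, true)) {
                $current = $name;
                $fields[$name] = '';
            }
        },
        // Called at each end tag.
        function ($parser, $name) use (&$current) {
            if ($name === $current) {
                $current = null;
            }
        }
    );

    // Called for text; it may arrive in several chunks, so append.
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$current, &$fields) {
            if ($current !== null) {
                $fields[$current] .= $data;
            }
        }
    );

    // The third argument marks this as the final chunk of input.
    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    return $fields;
}

$fields = extractFields($xml, ['title', 'author']);
print_r($fields);
```

The parser never hands back a document tree. The handlers accumulate whatever state the application needs, which is why the sketch tracks the current element in a variable shared between the closures.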
In the following sections we discuss the handlers you can provide, the functions to set the handlers, and the events that trigger the calls to those handlers. We also provide sample functions for creating a parser to generate a map of the XML document in memory, tied together in a sample application that pretty-prints XML.
Element Handlers
When the parser encounters the beginning or end of an element, it calls the start and end element handlers. You set the handlers through the xml_set_element_handler() function: ...
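As a rough illustration of how the two handlers are invoked (a sketch, not the book's own listing), the following records every start and end event in order. The traceElements() helper and the sample input are hypothetical. The sketch leaves the parser's case folding at its default, which is enabled, so element and attribute names reach the handlers in uppercase.

```php
<?php
// Hypothetical helper: return the sequence of element events for a document.
function traceElements(string $xml): array
{
    $events = [];
    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Start handler: receives the parser, the element name, and its attributes.
        function ($parser, $name, $attrs) use (&$events) {
            $events[] = ['start', $name, $attrs];
        },
        // End handler: receives the parser and the element name.
        function ($parser, $name) use (&$events) {
            $events[] = ['end', $name];
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    return $events;
}

$events = traceElements('<book id="1"><title>Example</title></book>');
foreach ($events as $event) {
    echo $event[0], ' ', $event[1], "\n";
}
```

The start handler fires before any of the element's content is seen and the end handler fires after all of it, so nested elements produce properly nested start and end events.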