Chapter 7. DOM
In this chapter, we return to standard APIs with the Document Object Model (DOM). In Chapter 5, we talked about the benefits of using standard APIs: increased compatibility with other software components and (if implemented correctly) a guaranteed complete solution. The same concept applies in this chapter: what SAX does for event streams, DOM does for tree processing.
DOM and Perl
DOM is a recommendation by the World Wide Web Consortium (W3C). Designed to be a language-neutral interface to an in-memory representation of an XML document, versions of DOM are available in Java, ECMAScript,[1] Perl, and other languages. Perl alone has several implementations of DOM, including XML::DOM and XML::LibXML.
While SAX defines an interface of handler methods, the DOM specification calls for a number of classes, each with an interface of methods that affect a particular type of XML markup. Thus, every object instance manages a portion of the document tree, providing accessor methods to add, remove, or modify nodes and data. These objects are typically created by a factory object, which makes things a little easier for programmers, who only have to initialize the factory object themselves.
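As a rough illustration of the factory idea, here is a minimal sketch using XML::LibXML, one of the implementations named above. It assumes that module is installed; the element names, attribute, and text are hypothetical sample data. Only the document object is constructed directly, and every other node is requested from it.

```perl
use strict;
use warnings;
use XML::LibXML;

# The document object plays the factory role: it is the only object
# we construct ourselves.
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');

# Ask the factory for an element and make it the root of the tree.
# 'inventory' and 'item' are hypothetical names chosen for this sketch.
my $root = $doc->createElement('inventory');
$doc->setDocumentElement($root);

# Build a child element with an attribute and some text.
my $item = $doc->createElement('item');
$item->setAttribute(id => 'a1');
$item->appendChild($doc->createTextNode('hammer'));

# Attach a comment node and the element to the root.
$root->appendChild($doc->createComment(' sample data '));
$root->appendChild($item);

# Each child is an object of a class specific to its kind of markup.
my @classes = map { ref $_ } $root->childNodes;

# Accessor methods modify the part of the tree that an object manages.
$item->setAttribute(id => 'b2');

print "$_\n" for @classes;
print $root->toString, "\n";
```

The same pattern should carry over to other DOM implementations, although constructor names and convenience methods differ from module to module.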
In DOM, every piece of XML (the element, text, comment, etc.) is a node represented by a Node object. The Node class is extended by more specific classes that represent the types of XML markup, including Element, Attr (attribute), ProcessingInstruction, Comment, EntityReference, Text, CDATASection,