Chapter 7. DOM
In this chapter, we return to standard APIs with the Document Object Model (DOM). In Chapter 5, we talked about the benefits of using standard APIs: increased compatibility with other software components and (if implemented correctly) a guaranteed complete solution. The same concept applies in this chapter: what SAX does for event streams, DOM does for tree processing.
DOM and Perl
DOM is a recommendation by the World Wide Web Consortium (W3C). It is designed to be a language-neutral interface to an in-memory representation of an XML document, and implementations are available for Java, ECMAScript,[1] Perl, and other languages. Perl alone has several implementations of DOM, including XML::DOM and XML::LibXML.
While SAX defines an interface of handler methods, the DOM specification calls for a number of classes, each with an interface of methods that affect a particular type of XML markup. Thus, every object instance manages a portion of the document tree, providing accessor methods to add, remove, or modify nodes and data. These objects are typically created by a factory object, which makes life a little easier for programmers: the factory object is the only one they have to initialize themselves.
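The factory pattern described above can be sketched with XML::LibXML, one of the Perl implementations mentioned earlier. This is a minimal illustration, not a definitive recipe: it assumes XML::LibXML is installed, it treats the Document object as the factory, and the element and attribute names (inventory, item, id) are made up for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# The Document object acts as the factory here: it is the only
# object we construct directly.
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');

# All other nodes are created through factory methods on the document.
my $root = $doc->createElement('inventory');   # hypothetical element name
$doc->setDocumentElement($root);

my $item = $doc->createElement('item');        # hypothetical element name
$item->setAttribute(id => 'a1');
$item->appendChild($doc->createTextNode('widget'));
$root->appendChild($item);

# Modify: each object manages its own part of the tree.
$item->setAttribute(id => 'a2');

# Add a node, then remove it again.
my $note = $doc->createComment(' temporary ');
$root->appendChild($note);
$root->removeChild($note);

# Serialize the subtree rooted at the document element.
print $root->toString, "\n";
```

Only the document is instantiated by hand. Every element, text node, and comment comes from a create* method on it, and changes are made through the accessor methods of the node that owns that part of the tree.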
In DOM, every piece of XML (the element, text, comment, etc.) is a node represented by a Node object. The Node class is extended by more specific classes that represent the types of XML markup, including Element, Attr (attribute), ProcessingInstruction, Comment, EntityReference, Text, CDATASection,
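One way to see the node hierarchy in practice is to parse a small document and ask each node which class it belongs to. The sketch below assumes XML::LibXML is installed and uses an invented sample document. Note that the class names it reports are XML::LibXML's own (all under the XML::LibXML:: namespace) and do not always match the DOM interface names letter for letter.

```perl
use strict;
use warnings;
use XML::LibXML;

# A made-up document containing several kinds of markup.
my $xml = '<doc id="1">text<!-- note --><?target data?>'
        . '<![CDATA[raw]]><child/></doc>';

my $doc  = XML::LibXML->load_xml(string => $xml);
my $root = $doc->documentElement;

# Each child is an object of a specific class derived from the
# generic node class.
my @children = $root->childNodes;
for my $node (@children) {
    printf "%-28s nodeType=%d isa Node: %s\n",
        ref $node,
        $node->nodeType,
        $node->isa('XML::LibXML::Node') ? 'yes' : 'no';
}

# Attributes are nodes too, but they are reached through the
# element rather than through its list of children.
my $attr = $root->getAttributeNode('id');
printf "%-28s nodeType=%d isa Node: %s\n",
    ref $attr,
    $attr->nodeType,
    $attr->isa('XML::LibXML::Node') ? 'yes' : 'no';
```

Because every object inherits from the common node class, generic methods such as nodeType and nodeName work on all of them, while the specialized classes add methods that make sense only for their kind of markup.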