Programming Interfaces for XML: DOM, SAX, and Others
The two most popular APIs used to parse XML documents are the Document Object Model (DOM) and the Simple API for XML (SAX). DOM is an official recommendation of the W3C (available at http://www.w3.org/TR/REC-DOM-Level-1), while SAX is a de facto standard created by David Megginson and others on the XML-DEV mailing list (http://lists.xml.org/archives). We’ll discuss these two APIs briefly here. We won’t use them much in this book, but learning more about them will give you some insight into how most XSLT processors work.
Tip
See http://www.saxproject.org/ for the SAX standard. If you’d like to learn more about the XML-DEV mailing list, send email to xml-dev-subscribe@lists.xml.org. You can also check out http://lists.xml.org/archives/xml-dev/ to see the XML-DEV mailing list archives.
DOM
DOM is designed to build a tree view of your document. Remember that all XML documents must be contained in a single element. That single element then becomes the root of the tree. The DOM specification defines several language-neutral interfaces, described here:
Node
This interface is the base datatype of the DOM. Document, Element, Attr, Text, Comment, and ProcessingInstruction all extend the Node interface.
Document
This object contains the DOM representation of the XML document. Given a Document object, you can get the root of the tree (the Document element); from the root, you can move through the tree to find all elements, attributes, text, comments, ...
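As a rough illustration of the interfaces described above, here is a minimal sketch that uses Python's standard xml.dom.minidom module, which is one implementation of the DOM interfaces. The sample document and the walk() helper are hypothetical and exist only for this example. The sketch parses a small document, gets the root element from the Document object, and then moves through the tree reporting each node's type.

```python
from xml.dom import Node, minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<?xml version="1.0"?>
<!-- a short greeting -->
<greeting lang="en">Hello, <b>world</b>!</greeting>"""


def walk(node, depth=0):
    """Yield (depth, kind, detail) for a node and all of its descendants."""
    if node.nodeType == Node.DOCUMENT_NODE:
        yield depth, "Document", ""
    elif node.nodeType == Node.ELEMENT_NODE:
        yield depth, "Element", node.tagName
        # Attributes are not child nodes; they are reached through
        # the element's attribute map instead.
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            yield depth + 1, "Attr", f"{attr.name}={attr.value}"
    elif node.nodeType == Node.TEXT_NODE:
        yield depth, "Text", node.data
    elif node.nodeType == Node.COMMENT_NODE:
        yield depth, "Comment", node.data
    elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        yield depth, "ProcessingInstruction", node.target

    # Every node exposes childNodes; it is simply empty for leaf nodes.
    for child in node.childNodes:
        yield from walk(child, depth + 1)


doc = minidom.parseString(SAMPLE)

# The Document object gives direct access to the root element.
root = doc.documentElement

for depth, kind, detail in walk(doc):
    print("  " * depth + kind + (": " + repr(detail) if detail else ""))
```

Because every node type shares the same base interface, a single recursive function can visit the whole tree, checking nodeType only where the handling differs.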