Parsing XML with DOM

Problem

You want to examine an XML file in detail.

Solution

Use DOM to parse the document, and process the resulting in-memory tree.

Discussion

The Document Object Model (DOM) is a tree-structured representation of the information in an XML document. It consists of several interfaces, the most important of which is Node. All are in the package org.w3c.dom, reflecting the influence of the World Wide Web Consortium (http://www.w3.org) in creating and promulgating the DOM. The main DOM interfaces are shown in Table 21-1.

Table 21-1. DOM interfaces

Interface    Function
---------    --------
Document     Top-level representation of an XML document
Node         Representation of any node in the XML tree
Element      An XML element
Text         A textual string
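To make the roles of these four interfaces concrete, here is a minimal sketch that walks a small in-memory tree and reports what kind of node it finds at each level. It assumes the JAXP parser bundled with the JDK (javax.xml.parsers); the class name DomInterfacesSketch, its helper methods, and the sample XML are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

/** Illustrative only: shows Document, Node, Element, and Text in use. */
public class DomInterfacesSketch {

    /** Parse XML text into a Document using the JDK's JAXP parser (an assumption). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Append an indented outline of the node and its descendants to out. */
    static void outline(Node node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (node instanceof Document) {
            out.append(indent).append("Document\n");
        } else if (node instanceof Element) {
            out.append(indent).append("Element: ")
               .append(((Element) node).getTagName()).append('\n');
        } else if (node instanceof Text) {
            // Skip text nodes that contain only whitespace.
            String data = ((Text) node).getData().trim();
            if (!data.isEmpty()) {
                out.append(indent).append("Text: ").append(data).append('\n');
            }
        }
        // Every node, whatever its type, exposes its children through Node.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            outline(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        StringBuilder out = new StringBuilder();
        outline(parse("<book><title>Java Cookbook</title></book>"), 0, out);
        System.out.print(out);
    }
}
```

The traversal uses only the generic Node methods, then narrows to Element or Text where it needs type-specific information such as a tag name or character data.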

You don’t have to implement these interfaces; the parser generates objects that implement them. When you create or modify XML documents (Section 21.6), you do create nodes yourself, but even then you use existing implementing classes rather than writing your own. Parsing an XML document with DOM is syntactically similar to processing a file with XSL: you get a reference to a parser and call its methods with objects representing the input files. The difference is that the parser returns a DOM, a tree of objects in memory. Example 21-5 simply parses an XML document.

Example 21-5. XParse.java

import java.io.*;
import org.w3c.dom.*;
import com.sun.xml.tree.*;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Parse an XML file using DOM. */
public class ...
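The listing above is truncated, and it imports com.sun.xml.tree, a parser-specific package that is not part of the standard JDK. As a rough stand-in, the following sketch parses a file with the JAXP classes in javax.xml.parsers instead. This is a substitute technique, not the original XParse code; the class name JaxpParseSketch and the parseFile helper are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Illustrative only: parse an XML file into a DOM tree using JAXP. */
public class JaxpParseSketch {

    /** Get a parser from the factory and have it build the in-memory tree. */
    static Document parseFile(File file)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(file);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java JaxpParseSketch file.xml");
            return;
        }
        try {
            Document doc = parseFile(new File(args[0]));
            System.out.println("Root element: "
                    + doc.getDocumentElement().getTagName());
        } catch (SAXParseException e) {
            // Raised for malformed XML; it carries the location of the problem.
            System.err.println("Parse error at line " + e.getLineNumber()
                    + ": " + e.getMessage());
        } catch (SAXException | ParserConfigurationException | IOException e) {
            System.err.println("Could not parse: " + e);
        }
    }
}
```

Catching SAXParseException separately from the more general SAXException lets the program report the line number of a syntax error, which is presumably why the original listing imports both.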
