Parsing with JAXP and SAX

The first thing you want to do with an XML document is parse it. There are two commonly used approaches to XML parsing: they go by the acronyms SAX and DOM. We’ll begin with SAX parsing; DOM parsing is covered later in the chapter.

SAX is the Simple API for XML. SAX is not a parser, but rather a Java API that describes how a parser operates. When parsing an XML document using the SAX API, you define a class that implements various “event” handling methods. As the parser encounters the elements, text, and other constructs of the XML document, it invokes the corresponding event-handler methods you’ve defined. Your methods take whatever actions are required to accomplish the desired task. In the SAX model, the parser converts an XML document into a sequence of Java method calls. The parser doesn’t build a parse tree of any kind (although your methods can do this, if you want). SAX parsing is typically quite efficient and is therefore often the best choice for simple XML processing tasks. SAX-style XML parsing is known as “push parsing” because the parser “pushes” events to your event-handler methods. This is in contrast to more traditional “pull parsing,” in which your code “pulls” tokens from a parser.
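To make the push model concrete, here is a minimal sketch of a SAX event handler. It assumes the JAXP classes bundled with the JDK (javax.xml.parsers.SAXParserFactory and org.xml.sax.helpers.DefaultHandler); the names SaxEventDemo and EventLogger, and the sample XML string, are hypothetical and chosen only for illustration. The handler simply records each event it receives, so the resulting list shows the sequence of method calls the parser makes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the SAX events produced for a document.
public class SaxEventDemo {

    // The parser "pushes" events by calling these methods as it reads the input.
    static class EventLogger extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one run of text in several calls,
            // so this logs each non-blank chunk it is handed.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    // Parses the given XML string and returns the events in the order received.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        EventLogger handler = new EventLogger();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch.
        String xml = "<servlet><name>hello</name></servlet>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

Note that the code never asks the parser for the next token. It hands over a handler object and the parser drives the calls, which is what distinguishes push parsing from pull parsing. No tree is built; the only state kept is whatever the handler chooses to store.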

The SAX API was created by David Megginson (http://www.megginson.com/) and is now maintained at http://www.saxproject.org. The Java binding of the SAX API consists of the package org.xml.sax and its subpackages. SAX is a de facto standard but has not been standardized ...
