SAX
SAX is a low-level, event-style API for parsing XML documents. SAX originated in Java, but has been implemented in many languages. We’ll begin our discussion of the Java XML APIs here at this lower level, and work our way up to higher-level (and often more convenient) APIs as we go.
The SAX API
To use SAX, we’ll draw on classes from the org.xml.sax package, a de facto standard that grew out of the XML-DEV mailing list rather than a formal standards body such as the W3C. This package holds interfaces common to all implementations of SAX. To perform the actual parsing, we’ll need the javax.xml.parsers package, which is the standard Java package for accessing XML parsers.
The javax.xml.parsers package is part of the Java API for XML Processing (JAXP), which allows different parser implementations to be used with Java in a portable way.
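As a rough illustration of how JAXP hides the parser implementation, the following sketch asks SAXParserFactory for whatever SAX parser is configured and uses it to check a small document for well-formedness. The class name ParserLookup and its helper methods are hypothetical names chosen for this example, and the empty DefaultHandler stands in for a real handler.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of JAXP or SAX.
class ParserLookup {
    // Ask JAXP for the configured SAX parser implementation.
    // No vendor-specific parser class is named anywhere in this code.
    public static SAXParser newParser()
            throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser();
    }

    // Returns true if the text parses as well-formed XML.
    // DefaultHandler ignores content events and rethrows fatal errors.
    public static boolean isWellFormed(String xml)
            throws ParserConfigurationException, IOException {
        try {
            newParser().parse(new InputSource(new StringReader(xml)),
                              new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<greeting>hello</greeting>")); // complete document
        System.out.println(isWellFormed("<greeting>hello"));            // missing end tag
    }
}
```

Because the code depends only on javax.xml.parsers and org.xml.sax, a different parser implementation can be swapped in without changing it.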
To read an XML document with SAX, we first register an implementation of the org.xml.sax.ContentHandler interface with the parser. The ContentHandler has methods that are called in response to parts of the document. For example, the ContentHandler’s startElement() method is called when an opening tag is encountered, and the endElement() method is called when the tag is closed. Attributes are provided with the startElement() call. Text content of elements is passed through a separate method called characters(). The characters() method may be invoked repeatedly to supply more text as it is read, but it often gets the whole string in one bite. The following are the signatures of these methods of the ContentHandler interface.
public void startElement(String namespace ...
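To make the callback flow concrete, here is a minimal sketch of a handler that records each event it receives. It assumes we extend org.xml.sax.helpers.DefaultHandler, a convenience class that implements ContentHandler with empty methods, rather than implementing the interface directly. The class name EventLogger and its log() helper are hypothetical. Because characters() may be called more than once for a single run of text, the sketch buffers the pieces and emits them when the next tag arrives.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example handler; records SAX events as strings.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        flushText();
        // Attributes arrive together with the opening tag.
        StringBuilder sb = new StringBuilder("start " + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            sb.append(' ').append(atts.getQName(i))
              .append('=').append(atts.getValue(i));
        }
        events.add(sb.toString());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text run, so accumulate.
        text.append(ch, start, length);
    }

    // Emit buffered text, skipping whitespace-only runs.
    private void flushText() {
        String s = text.toString().trim();
        if (!s.isEmpty()) events.add("text " + s);
        text.setLength(0);
    }

    // Parse the given XML text and return the recorded events.
    public static List<String> log(String xml) throws Exception {
        EventLogger handler = new EventLogger();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String e : log("<item id=\"7\">tea</item>")) {
            System.out.println(e);
        }
    }
}
```

Buffering in characters() and flushing at tag boundaries is one common way to cope with text arriving in pieces; a handler that assumes a single call per text node can silently truncate content.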