SAX2

by David Brownell
January 2002
Content level: Intermediate to advanced
240 pages
6h 58m
English
O'Reilly Media, Inc.

Character Information Items

Along with element and attribute information items, characters are one of the core types of information used by XML applications. SAX2 reports characters in groups, rather than one at a time.
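Because a parser may deliver one run of text in several callbacks, a handler usually accumulates what it receives instead of treating each call as a complete string. The following is a minimal sketch of that idea, assuming the JAXP parser bundled with the JDK; the class name `TextCollector` and the helper `collect` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: gathers all character data in a document into one string.
class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) belongs to this event;
        // the rest of the array is the parser's private buffer.
        text.append(ch, start, length);
    }

    // Parses the given XML text and returns the concatenated character data.
    static String collect(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TextCollector handler = new TextCollector();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Many parsers split this text around the entity reference,
        // so characters() may be called more than once here.
        System.out.println(collect("<greeting>Hello &amp; welcome</greeting>"));
    }
}
```

How many `characters()` calls the parser makes for a given run of text is up to the parser, so the sketch never assumes that a single call holds a whole text node.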

| Property | Callbacks | Explanation |
| --- | --- | --- |
| [character code] | ContentHandler.characters(), ContentHandler.ignorableWhitespace() | These calls provide one or more characters in the UTF-16 encoding. Normally, each Java char is a single [character code], but surrogate pairs are used to encode characters from the “Astral Planes,” which don’t fit into 16 bits. (No whitespace characters need surrogate pairs.) |
| [element content whitespace] | | When known, this Boolean property is encoded by using the ignorableWhitespace() callback instead of characters(). Most SAX parsers report this property even when they aren’t validating, though that’s not required. (If any external parameter entities are skipped, it is not possible to reliably provide this information.) |
| [parent] | | Applications must keep track of this information item if it is needed. |
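The [parent] and [element content whitespace] entries above can be illustrated with a handler that keeps its own stack of open elements and counts ignorable whitespace separately. This is only a sketch under stated assumptions: the names `ParentTrackingHandler`, `textByParent`, and `ignorableChars` are hypothetical, and whether whitespace arrives through `ignorableWhitespace()` or `characters()` depends on the parser and on whether a DTD is available.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records character data under the name of its parent element.
class ParentTrackingHandler extends DefaultHandler {
    // SAX does not report [parent], so the application tracks open elements itself.
    private final Deque<String> openElements = new ArrayDeque<>();
    final Map<String, StringBuilder> textByParent = new LinkedHashMap<>();
    int ignorableChars = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The innermost open element is the parent of these characters.
        String parent = openElements.peek();
        textByParent.computeIfAbsent(parent, k -> new StringBuilder()).append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Reached only when the parser knows the whitespace is in element content.
        ignorableChars += length;
    }

    static ParentTrackingHandler parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            ParentTrackingHandler handler = new ParentTrackingHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc [<!ELEMENT doc (item)*><!ELEMENT item (#PCDATA)>]>\n"
                + "<doc>\n  <item>a</item>\n  <item>b</item>\n</doc>";
        ParentTrackingHandler h = parse(xml);
        System.out.println("text by parent: " + h.textByParent);
        System.out.println("ignorable whitespace chars: " + h.ignorableChars);
    }
}
```

An application that needs more than the immediate parent (for example, the full path to the root) can read the rest of the same stack.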

SAX2 permits reporting of a character property that the XML Infoset doesn’t address: whether the characters are in a CDATA section. (DOM requires this information.) Such section boundaries are reported through the startCDATA() and endCDATA() methods of the LexicalHandler interface.
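As a rough illustration of CDATA boundary reporting, the sketch below registers a lexical handler through the standard `http://xml.org/sax/properties/lexical-handler` property and separates CDATA text from other character data. The class name `CdataAwareHandler` and its fields are hypothetical, and the sketch assumes a parser that supports the lexical-handler property, as the JDK's bundled parser does.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: separates text inside CDATA sections from other text.
// DefaultHandler2 supplies no-op versions of the LexicalHandler methods.
class CdataAwareHandler extends DefaultHandler2 {
    private boolean inCdata = false;
    final StringBuilder cdataText = new StringBuilder();
    final StringBuilder otherText = new StringBuilder();

    @Override
    public void startCDATA() {
        inCdata = true;
    }

    @Override
    public void endCDATA() {
        inCdata = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content still arrives through characters(); only the
        // boundaries come from the lexical handler.
        (inCdata ? cdataText : otherText).append(ch, start, length);
    }

    static CdataAwareHandler parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            CdataAwareHandler handler = new CdataAwareHandler();
            reader.setContentHandler(handler);
            // Lexical events are optional, so the handler is set as a property.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        CdataAwareHandler h = parse("<a>x<![CDATA[<y>]]>z</a>");
        System.out.println("CDATA text: " + h.cdataText);
        System.out.println("other text: " + h.otherText);
    }
}
```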

ISBN: 0596002378