Getting the Input

Besides the reasons already mentioned for not going into how a processor works, there is an even better reason not to spend time on the issue: the inputs and outputs of the processor are far more interesting! You have seen how to parse a document incrementally with the SAX interfaces and classes. Within that process you can easily decide what to do with the elements encountered, how to handle particular attributes, and how to respond to error conditions. However, that model causes problems in some situations, and providing input to an XSLT processor is one of them.

SAX Is Sequential

The sequential model that SAX provides does not allow for random access to an XML document. In other words, in SAX you get information about the XML document as the parser does, and lose that information when the parser does. When element 2 comes along, it cannot access information in element 4, because element 4 hasn’t been parsed yet. When element 4 comes along, it can’t “look back” on element 2 unless your code saved that information. You can certainly save what you encounter as parsing moves along, but coding all these special cases can be very tricky. The other, more extreme option is to build an in-memory representation of the XML document. We will see in a moment that a Document Object Model parser does exactly that for us, so performing the same task in SAX would be pointless, and probably slower and more difficult.
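The contrast between sequential and in-memory access can be sketched in a few lines. The following is only an illustration, not code from this chapter: it assumes the JAXP factories that ship with the JDK (SAXParserFactory and DocumentBuilderFactory), and the class name, method names, and sample document are all hypothetical. The SAX handler records which item ids it has seen at the moment item 2 is reported; the DOM method reads item 4 directly from the finished tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SequentialVersusTree {

    // Hypothetical sample document, invented for this illustration.
    static final String XML =
        "<items><item id=\"1\"/><item id=\"2\"/><item id=\"3\"/><item id=\"4\"/></items>";

    /** SAX: returns the item ids the handler knows about when item 2 is reported. */
    static List<String> idsKnownAtItemTwo() throws Exception {
        final List<String> seen = new ArrayList<>();
        final List<String> snapshot = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (!"item".equals(qName)) {
                    return;
                }
                String id = atts.getValue("id");
                // Anything we want to remember later must be saved by hand.
                seen.add(id);
                if ("2".equals(id)) {
                    // Items 3 and 4 have not been parsed yet, so they cannot appear here.
                    snapshot.addAll(seen);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(XML)), handler);
        return snapshot;
    }

    /** DOM: the whole tree is in memory, so any item can be reached at any time. */
    static String idOfItemFourViaDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        NodeList items = doc.getElementsByTagName("item");
        // Index 3 is the fourth item in document order.
        return ((Element) items.item(3)).getAttribute("id");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX, ids known at item 2: " + idsKnownAtItemTwo());
        System.out.println("DOM, id of fourth item: " + idOfItemFourViaDom());
    }
}
```

In the SAX half, the only reason item 1 is still available when item 2 arrives is that the handler stored it explicitly; item 4 is simply unavailable at that point. The DOM half pays for its convenience by holding the entire document in memory.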

SAX Siblings

Another difficult ...
