Canonical XML
Although infosets are a good idea, they are only abstract formulations of the information in an XML document. So, without reducing an XML document to its infoset, how can you approach the goal of comparing XML documents byte by byte?
It turns out that there is a way: You can use canonical XML. Canonical XML is a companion standard to XML, and you can read all about it at http://www.w3.org/TR/xml-c14n. Essentially, canonical XML is a strict XML syntax in which the same information is always written the same way, so documents in canonical XML can be compared directly. The information included in the canonical XML version of a document is the same as would appear in its infoset.
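As a rough illustration of the idea, here is a minimal Python sketch that canonicalizes two documents and then compares the results byte by byte. It assumes Python 3.8 or later, whose standard library provides xml.etree.ElementTree.canonicalize(); that function implements the later Canonical XML 2.0 revision rather than the version at the URL above, so treat it as a stand-in for the technique, not as the exact algorithm that specification defines. The two sample documents are invented for this example.

```python
from xml.etree.ElementTree import canonicalize

# Two made-up documents that carry the same information but differ in
# attribute order, quote style, spacing inside the tag, and how the
# empty element is written.
doc_a = '<doc b="2" a="1"><empty/></doc>'
doc_b = "<doc   a='1'   b='2'><empty></empty></doc>"

# canonicalize() returns the canonical form as a string when no output
# file is given. Note: this is C14N 2.0, an assumption of this sketch.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# The raw texts differ, but the canonical forms can be compared
# directly, byte by byte.
raw_equal = doc_a.encode("utf-8") == doc_b.encode("utf-8")
canon_equal = canon_a.encode("utf-8") == canon_b.encode("utf-8")

print("raw documents equal:      ", raw_equal)
print("canonical documents equal:", canon_equal)
```

The point of the sketch is only that differences in surface syntax disappear once both documents are in canonical form, while the information they carry is left alone.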
As you can imagine, two XML documents that actually contain the same information can ...