Canonical XML

Although infosets are a good idea, they are only abstract formulations of the information in an XML document. So, without reducing an XML document to its infoset, how can you approach the goal of comparing XML documents byte by byte?

It turns out that there is a way: You can use canonical XML. Canonical XML is a companion standard to XML, and you can read all about it in its W3C specification. Essentially, canonical XML is a strict XML syntax; documents in canonical XML can be compared directly. The information included in the canonical XML version of a document is the same as would appear in its infoset.
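To make the idea of direct comparison concrete, here is a minimal sketch in Python. It is not taken from the canonical XML standard itself; it assumes Python 3.8 or later, whose standard library function `xml.etree.ElementTree.canonicalize` implements the later Canonical XML 2.0 version of the standard. The two sample documents and the variable names are hypothetical, invented for this illustration.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical documents carrying the same information but written
# differently: one has an XML declaration, the attribute order and quote
# style differ, and the empty element uses two different spellings.
doc_a = '<?xml version="1.0"?><doc b="2" a="1"><empty/></doc>'
doc_b = "<doc a='1' b='2'><empty></empty></doc>"

# The raw text differs, so a naive byte-by-byte comparison fails.
print(doc_a == doc_b)

# canonicalize() returns the canonical form as a string when no output
# file is given.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# The canonical forms can be compared directly.
print(canon_a == canon_b)
print(canon_a)
```

Note that canonicalization keeps whitespace in element content by default, so two documents that differ only in indentation between elements would still compare as different unless that whitespace is handled separately.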

As you can imagine, two XML documents that actually contain the same information can ...
