Chapter 21. XML


The Extensible Markup Language, or XML, is a portable, human-readable format for exchanging text or data between programs. XML derives from its parent standard SGML, as does the HTML language used on web pages worldwide. XML, then, is HTML’s younger but more capable sibling. Since most developers know at least a bit of HTML, parts of this discussion are couched in terms of comparisons with HTML. XML’s lesser-known grandparent is IBM’s GML (General Markup Language), and one of its cousins is Adobe FrameMaker’s Maker Interchange Format (MIF). Figure 21-1 depicts the family tree.

Figure 21-1. XML’s ancestry

One way of thinking about XML is that it’s HTML cleaned up and consolidated, with the added ability to define your own tags. It’s HTML with tags that can and should identify the informational content as opposed to the formatting. Another way of perceiving XML is as a general interchange format for such things as business-to-business communications over the Internet, or as a human-editable[50] description of things as diverse as word-processing files and Java documents. XML is all these things, depending on where you’re coming from as a developer and where you want to go today and tomorrow.
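To make the idea of content-identifying tags concrete, here is a minimal sketch using the JDK's standard DOM parser. The document, its tag names (`person`, `name`, `email`), the class name, and the `firstText` helper are all invented for illustration; they are assumptions, not part of any standard vocabulary or of this chapter's recipes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ContentTagsDemo {

    // Hypothetical document: the tags say what the data IS,
    // not how it should be displayed.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "<person>"
        + "<name>Ada Lovelace</name>"
        + "<email>ada@example.com</email>"
        + "</person>";

    /**
     * Returns the text of the first element named tagName,
     * or null if the document has no such element.
     */
    static String firstText(String xml, String tagName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Node node = doc.getElementsByTagName(tagName).item(0);
            return node == null ? null : node.getTextContent();
        } catch (Exception e) {
            // Parser configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstText(XML, "name"));
        System.out.println(firstText(XML, "email"));
    }
}
```

Because the tags name the content, a program can ask for the `email` element directly instead of guessing which bold or italic run happens to hold an address, which is what an HTML page marked up purely for presentation would require.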

Because of the wide acceptance of XML, it is used as the basis for many other formats, including the Open Office save file format, ...
