WordprocessingML

Beginning with Microsoft Office 2003 for Windows (but not Office 2004 for the Mac), Microsoft gave Word and the other Office components the ability to save all documents in XML, although by default it still picks a binary format. The XML application saved by Microsoft Word is named WordprocessingML. Unlike DocBook, TEI, and OpenOffice, all of which were designed from scratch without any legacy issues, WordprocessingML was designed more as an XML representation of an existing binary file format. This makes it a rather unusual example of a narrative document format. We would not recommend that you emulate its design in your own applications. Nonetheless, it can be educational to compare it to the other three formats.

Example 6-4 shows the same document as in the previous three examples, this time encoded in WordprocessingML. The WordprocessingML version seems the most opaque and cryptic of the four formats discussed in this chapter. This example makes it pretty obvious that XML is not magic pixie dust you can sprinkle on an existing format to create clean, legible, maintainable data.

The root element of a WordprocessingML document is w:wordDocument . Here, the w prefix is mapped to the namespace URI http://schemas.microsoft.com/office/word/2003/wordml. Several other namespaces are declared for different content that can be embedded in a Word file.

This root element can contain several different chunks of metadata. Here I’ve used three: o:DocumentProperties for basic ...

Get XML in a Nutshell, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.