Chapter 10. Word 2003 XML Hacks

Introduction: Hacks #90-100

Starting with Word 2003, you have a whole new way to access and process the information in Word documents. If you select File→Save As, you’ll see a new entry under “Save as type” called “XML Document.” XML (Extensible Markup Language) provides a standard way to encode information (data, documents, and everything in between) in a readable text format. It is an interoperable, OS-independent format, which means you can now process and generate Word documents using applications other than Microsoft Word.


All of the hacks in this chapter require either the standard or professional version of Word 2003 for Windows.

XML lets you define your own “document type” or “vocabulary” suited to your particular application or industry. For example, DocBook is an XML vocabulary used extensively for technical publishing. WordprocessingML is Microsoft’s XML format for Word documents. It is a lossless format, which means it contains the same information that’s stored in the .doc format, but in a plain-text XML format rather than a binary format that only a computer can understand. Most of the hacks in this chapter show you how you can use XML to gain powerful control over your Word documents.
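To make the "plain-text XML" point concrete, here is a minimal sketch in Python (not a language this chapter uses) that reads a tiny WordprocessingML document and pulls out the text of each paragraph. The sample document is hand-written for illustration and is far smaller than what Word actually saves; the helper name paragraph_texts is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 WordprocessingML elements (the "w:" prefix).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hand-written sample for illustration only; a file saved by Word
# contains many more elements (styles, fonts, document properties).
SAMPLE = f"""<?xml version="1.0"?>
<w:wordDocument xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>World!</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second paragraph.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_texts(xml_text):
    """Return the plain text of each w:p (paragraph) element, in order."""
    root = ET.fromstring(xml_text)
    ns = {"w": W_NS}
    texts = []
    for para in root.iterfind(".//w:p", ns):
        # A paragraph holds runs (w:r), and each run holds text (w:t).
        pieces = [t.text or "" for t in para.iterfind(".//w:t", ns)]
        texts.append("".join(pieces))
    return texts


if __name__ == "__main__":
    for line in paragraph_texts(SAMPLE):
        print(line)
```

Nothing here requires Word to be installed: any tool that can parse XML can read the document, which is what makes the format usable from other applications.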

Beyond some suggestive examples, this chapter will not spend a lot of time explaining WordprocessingML or how it works (or XML in general). Instead, it focuses on the kinds of things you can do with it, using XSLT (Extensible Stylesheet Language Transformations), ...
