Chapter 2. The WordprocessingML Vocabulary

Microsoft Office Word 2003 marks the introduction of XML as a native format for Word documents. Any Word document can now be opened in Word and saved as XML, thereby freeing documents from the tyranny of Word’s proprietary .doc format. This new format, called WordprocessingML, opens up a multitude of possibilities for generating and processing Word documents. (Read Chapter 3 first if you want some immediate gratification regarding use cases for WordprocessingML.) This chapter includes a basic introduction to WordprocessingML, along with some general technical observations and guidelines for learning more. It is meant to complement, rather than replace, a detailed investigation of the WordprocessingML schema.


An authoritative and thorough source for learning is the Microsoft-supplied XSD schema for WordprocessingML. The “Microsoft Office 2003 XML Reference Schemas” package has been released under a royalty-free license and includes each of the WordprocessingML schema documents, as well as accompanying documentation. It can be found by starting at

Introduction to WordprocessingML

WordprocessingML is Microsoft’s XML format for Word documents. It’s what you get when you select Save As... and choose “XML Document.” WordprocessingML is a lossless format, which means that it contains all the information that Word needs to re-open a document, just as if it had been saved in the traditional .doc format—all ...

Get Office 2003 XML now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.