What Is Data Binding?

Before starting with the meat of the book, let me give you a basic introduction to data binding and the four concepts that make up a data binding package:

  • Source file/class generation

  • Unmarshalling

  • Marshalling

  • Binding schemas

I’ll focus on each of these over the next several chapters, but I wanted to give you a bit of a preview here. You’ll want to get an idea of the big picture so you can see how these components fit together.

Class Generation

I’ve already mentioned that the basic idea of data binding is to take an XML document and convert it to an instance of a Java object. Furthermore, that Java class is tailored to a business need and generally matches up with the element and attribute naming in the related XML document. Of course, I conveniently skipped over where that class comes from; this is where class generation comes in. In the most common XML data binding scenario, this class is not hand coded (that’s quite a pain, right?). Instead, a data binding tool that will generate this source file (or source files) for you is provided.

In a nutshell, data binding packages allow you to take a set of XML constraints (DTD, XML Schema, etc.) and create a set of Java source files from these constraints. I’ll dive deeper into the specifics of this subject in Chapter 3. In general, it works like this: an element is defined in a DTD called dealer-name, and a Java class called DealerName is generated. An XML Schema defines the servlet element as having an attribute called id and a child element named description, and the resultant Java class (Servlet) has a getId() method as well as a getDescription() method. You get the idea—a mapping is made between the structure laid out by the XML constraint document and a set of Java classes. You can then compile these classes and begin converting between XML and Java.


Once you’ve got your generated classes compiled and on your Java Virtual Machine’s (JVM’s) classpath, you’re ready to convert XML documents to Java classes. This process is called unmarshalling in the data binding world.[2] The process is based on starting with an XML document. This document should conform to the XML constraints used to generate Java classes, referred to in the class generation section. If it doesn’t meet these constraints, you’re going to get errors as elements, attributes, and character data in the XML document won’t match up with the structure of the generated Java classes. Most data binding packages offer an option to validate an XML document before unmarshalling it to ensure you don’t run into this problem. I’ll focus on this and the other details of unmarshalling in Chapter 4.

Lest you think that all of your existing business objects are wasted, it is possible to unmarshal an XML document into an existing Java class (or classes). This is a common scenario when you already have a Java-based application and want to persist some of your objects to XML (like Enterprise JavaBeans or other data-related objects). You can either structure your XML to match your existing Java object hierarchy or use a binding schema (covered later in this chapter). While not all data binding packages support this handy approach to data binding, I’ll spend some time in the later chapters of the book exploring it.


The reverse of the unmarshalling process is marshalling, which converts a Java object into an XML document representation. There’s nothing too revolutionary here that you probably haven’t already guessed. As with unmarshalling, many frameworks offer a validation option on generated Java classes that allows you to validate the data within your Java classes before trying to write them out to XML. That ensures that the resultant XML documents still match up with the constraints used to generate Java classes in the first place. Some extra data carried around by these generated classes—such as the XML names of the related elements, DTD references, and namespace information—also tends to get marshalled to Java. This ensures that the Java classes marshal to XML documents that they are the same as (or as close as possible) the XML documents they came from.

Like unmarshalling, marshalling is a process that is often useful to classes that were not generated by a data binding framework. Like unmarshalling, only some frameworks support marshalling, but those that do can be incredibly useful. Generally, Java classes must follow some rules to be marshalled to XML, such as following the JavaBeans format (each data member has a getXXX() and setXXX() style method). However, if your classes conform to these rules, conversion to XML becomes simple. I’ll focus on the nuts and bolts of marshalling in Chapter 5.

Binding Schemas

The final component of XML data binding is probably the most complex, but also the most powerful. A binding schema specifies details about how classes are generated from XML constraints. In the general case, an element named ejb-jar becomes an object named EjbJar. Some basic rules are applied to ensure legal Java names, but names are otherwise kept as true to the underlying XML as possible. Additionally, constraints such as those found in DTDs don’t have type information applied (everything comes across as PCDATA, which is just character data). However, these basic rules are often not enough to create the Java business objects you want. In these cases, a binding schema can help.

A binding schema allows you to specify type conversions, name transformations, and specification of superclasses for generated objects. It allows the application of a richer set of rules, resulting in objects that more closely model your business needs. I’ll spend all of Chapter 6 talking about this, so don’t get too caught up in the details just yet. However, these binding schemas can allow you to convert XML to your already-coded Java classes, enforce type-checking even when a DTD doesn’t, and a lot more. A binding schema takes data binding tools from trivial utility classes to full-blown persistence packages; all in all, they are the most powerful feature found in data binding packages.

How these schemas actually look and act depends largely (at least at this point in data binding evolution) upon the data binding implementation. Some binding schemas are actual XML Schema-style documents; others look like plain old XML documents. They are almost always represented by a physical XML-style document that is parsed in at the same time as the XML constraint model. It is then up to the data binding package to determine if the binding schema is packaged with generated classes or if the mappings are contained completely within generated source code. All of these details will be covered, for each binding package, in those packages’ respective chapters.

[2] If you forget which way is marshalling and which is unmarshalling, remember that it’s XML data binding. Everything starts and ends with XML, so converting to XML is the “normal” direction, resulting in simple marshalling. Converting from XML is the reverse direction, so you are unmarshalling. For some reason, thinking of it this way keeps me straight.

Get Java & XML Data Binding now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.