Before starting with the meat of the book, let me give you a basic introduction to data binding and the four concepts that make up a data binding package:
Source file/class generation
Unmarshalling
Marshalling
Binding schemas
I’ll focus on each of these over the next several chapters, but I wanted to give you a bit of a preview here. You’ll want to get an idea of the big picture so you can see how these components fit together.
I’ve already mentioned that the basic idea of data binding is to take an XML document and convert it to an instance of a Java class. Furthermore, that Java class is tailored to a business need and generally matches up with the element and attribute naming in the related XML document. Of course, I conveniently skipped over where that class comes from; this is where class generation comes in. In the most common XML data binding scenario, this class is not hand coded (that’s quite a pain, right?). Instead, the data binding package provides a tool that generates this source file (or source files) for you.
In a nutshell, data binding packages allow you to take a set of XML constraints (DTD, XML Schema, etc.) and create a set of Java source files from these constraints. I’ll dive deeper into the specifics of this subject in Chapter 3. In general, it works like this: a DTD defines an element called dealer-name, and a Java class called DealerName is generated. An XML Schema defines the servlet element as having an attribute called id and a child element named description, and the resultant Java class (Servlet) has a getId() method as well as a getDescription() method. You get the idea: a mapping is made between the structure laid out by the XML constraint document and a set of Java classes. You can then compile these classes and begin converting between XML and Java.
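To make the mapping concrete, here is a minimal sketch of what a generated class for the servlet element might look like. It is hand-written for illustration and is not the output of any particular data binding package; the String field types and the setter methods are assumptions, and a real generator may add validation or marshalling hooks.

```java
// Hypothetical result of class generation for a <servlet> element that
// declares an "id" attribute and a <description> child element.
class Servlet {
    private String id;           // mapped from the id attribute
    private String description;  // mapped from the <description> child element

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }
}
```

Notice that both the attribute and the child element end up as ordinary JavaBeans-style properties; the XML distinction between the two is not visible to code that uses the class.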
Once you’ve got your generated classes compiled and on your Java Virtual Machine’s (JVM’s) classpath, you’re ready to convert XML documents to Java objects. This process is called unmarshalling in the data binding world.[2] The process starts with an XML document. This document should conform to the XML constraints used to generate the Java classes, referred to in the class generation section. If it doesn’t meet these constraints, you’re going to get errors, as elements, attributes, and character data in the XML document won’t match up with the structure of the generated Java classes. Most data binding packages offer an option to validate an XML document before unmarshalling it to ensure you don’t run into this problem. I’ll focus on this and the other details of unmarshalling in Chapter 4.
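The following sketch shows the idea behind unmarshalling by doing the work by hand with the DOM parser that ships with the JDK. A data binding package would do this for you; the UnmarshalSketch class, its nested Servlet class, and the unmarshal() method are all names invented for this illustration, and the root-element check is only a stand-in for real validation against a DTD or schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class UnmarshalSketch {

    // Stand-in for a generated class (hypothetical, hand-written here).
    static class Servlet {
        private String id;
        private String description;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
    }

    // Converts a <servlet> document into a Servlet instance.
    static Servlet unmarshal(String xml) {
        Document doc;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        // Crude structural check; a real package would validate against
        // the constraints used for class generation.
        Element root = doc.getDocumentElement();
        if (!"servlet".equals(root.getTagName())) {
            throw new IllegalArgumentException(
                "Expected <servlet>, found <" + root.getTagName() + ">");
        }

        Servlet servlet = new Servlet();
        servlet.setId(root.getAttribute("id"));
        NodeList descriptions = root.getElementsByTagName("description");
        if (descriptions.getLength() > 0) {
            servlet.setDescription(descriptions.item(0).getTextContent());
        }
        return servlet;
    }

    public static void main(String[] args) {
        Servlet s = unmarshal(
            "<servlet id=\"snoop\"><description>Snoop servlet</description></servlet>");
        System.out.println(s.getId() + ": " + s.getDescription());
    }
}
```

A document whose root element is not servlet is rejected with an exception, which mirrors the kind of error you can expect when the XML does not match the structure of the generated classes.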
Lest you think that all of your existing business objects are wasted, it is possible to unmarshal an XML document into an existing Java class (or classes). This is a common scenario when you already have a Java-based application and want to persist some of your objects (like Enterprise JavaBeans or other data-related objects) to XML. You can either structure your XML to match your existing Java object hierarchy or use a binding schema (covered later in this chapter). While not all data binding packages support this handy approach to data binding, I’ll spend some time in the later chapters of the book exploring it.
The reverse of the unmarshalling process is marshalling, which converts a Java object into an XML document representation. There’s nothing too revolutionary here that you probably haven’t already guessed. As with unmarshalling, many frameworks offer a validation option on generated Java classes that allows you to validate the data within your Java objects before trying to write them out to XML. That ensures that the resultant XML documents still match up with the constraints used to generate the Java classes in the first place. Some extra data carried around by these generated classes (such as the XML names of the related elements, DTD references, and namespace information) also tends to get unmarshalled to Java. This ensures that the Java objects marshal to XML documents that are the same as (or as close as possible to) the XML documents they came from.
Like unmarshalling, marshalling is a process that is often useful for classes that were not generated by a data binding framework. As with unmarshalling, only some frameworks support marshalling of such classes, but those that do can be incredibly useful. Generally, Java classes must follow some rules to be marshalled to XML, such as following the JavaBeans format (each data member has a getXXX() and setXXX() style method). However, if your classes conform to these rules, conversion to XML becomes simple. I’ll focus on the nuts and bolts of marshalling in Chapter 5.
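To show why the JavaBeans rules matter, here is a rough sketch of marshalling an arbitrary bean using the JDK’s java.beans.Introspector. It is not how any specific framework works; the MarshalSketch and Dealer names are invented, every property is written as a child element (an assumption, since a framework might use attributes instead), and the properties are sorted by name only to make the output predictable.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;

class MarshalSketch {

    // A hand-coded bean that follows the getXXX()/setXXX() convention.
    public static class Dealer {
        private String name;
        private String city;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getCity() { return city; }
        public void setCity(String city) { this.city = city; }
    }

    // Writes each readable, non-null property as a child element of a root
    // element named after the class.
    static String marshal(Object bean) {
        try {
            // Stop at Object so the "class" property is not included.
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            PropertyDescriptor[] props = info.getPropertyDescriptors().clone();
            Arrays.sort(props, Comparator.comparing(PropertyDescriptor::getName));

            String root = Introspector.decapitalize(bean.getClass().getSimpleName());
            StringBuilder xml = new StringBuilder("<" + root + ">");
            for (PropertyDescriptor prop : props) {
                Method getter = prop.getReadMethod();
                if (getter == null) {
                    continue; // write-only property, nothing to marshal
                }
                Object value = getter.invoke(bean);
                if (value == null) {
                    continue; // omit unset properties
                }
                xml.append('<').append(prop.getName()).append('>')
                   .append(escape(String.valueOf(value)))
                   .append("</").append(prop.getName()).append('>');
            }
            return xml.append("</").append(root).append('>').toString();
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not marshal " + bean.getClass(), e);
        }
    }

    // Escapes the characters that are not allowed in XML character data.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Dealer dealer = new Dealer();
        dealer.setName("Smith & Sons");
        dealer.setCity("Dallas");
        System.out.println(marshal(dealer));
    }
}
```

Because the code discovers properties purely through the getter and setter naming pattern, a class that hides its data behind differently named methods would marshal to an empty element; that is the practical reason frameworks insist on the JavaBeans format.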
The final component of XML data binding is probably the most complex, but also the most powerful. A binding schema specifies details about how classes are generated from XML constraints. In the general case, an element named ejb-jar becomes an object named EjbJar. Some basic rules are applied to ensure legal Java names, but names are otherwise kept as true to the underlying XML as possible. Additionally, constraints such as those found in DTDs don’t have type information applied (everything comes across as PCDATA, which is just character data). However, these basic rules are often not enough to create the Java business objects you want. In these cases, a binding schema can help.
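The default naming rule can be sketched in a few lines of Java. This is only one plausible version of such a rule, written for illustration; the NameMapper class and toClassName() method are invented names, and real data binding packages may treat punctuation, digits, and reserved words differently.

```java
// Sketch of a default XML-name-to-Java-class-name rule (hypothetical).
class NameMapper {

    // Drops characters that are illegal in Java identifiers (such as '-',
    // '.', and ':') and capitalizes the letter that follows each one.
    static String toClassName(String xmlName) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = true; // class names start with a capital letter
        for (char c : xmlName.toCharArray()) {
            if (!Character.isJavaIdentifierPart(c)) {
                upperNext = true; // separator: skip it, capitalize what follows
                continue;
            }
            out.append(upperNext ? Character.toUpperCase(c) : c);
            upperNext = false;
        }
        // Guard against empty names and names starting with a digit.
        if (out.length() == 0 || !Character.isJavaIdentifierStart(out.charAt(0))) {
            out.insert(0, '_');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toClassName("ejb-jar"));     // EjbJar
        System.out.println(toClassName("dealer-name")); // DealerName
    }
}
```

A rule like this is purely mechanical, which is exactly its limitation: it cannot know that you would rather call the class something else or that a value should be a number instead of a string.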
A binding schema allows you to specify type conversions, name transformations, and superclasses for generated objects. It allows the application of a richer set of rules, resulting in objects that more closely model your business needs. I’ll spend all of Chapter 6 talking about this, so don’t get too caught up in the details just yet. However, these binding schemas can allow you to convert XML to your already-coded Java classes, enforce type-checking even when a DTD doesn’t, and a lot more. A binding schema takes data binding tools from trivial utility classes to full-blown persistence packages; all in all, binding schemas are the most powerful feature found in data binding packages.
How these schemas actually look and act depends largely (at least at this point in data binding evolution) upon the data binding implementation. Some binding schemas are actual XML Schema-style documents; others look like plain old XML documents. They are almost always represented by a physical XML-style document that is parsed in at the same time as the XML constraint model. It is then up to the data binding package to determine if the binding schema is packaged with generated classes or if the mappings are contained completely within generated source code. All of these details will be covered, for each binding package, in those packages’ respective chapters.
[2] If you forget which way is marshalling and which is unmarshalling, remember that it’s XML data binding. Everything starts and ends with XML, so converting to XML is the “normal” direction, resulting in simple marshalling. Converting from XML is the reverse direction, so you are unmarshalling. For some reason, thinking of it this way keeps me straight.