XML Schema

XML Schema is a new working draft at the W3C that seeks to remedy many of the problems and limitations of DTDs. In addition to handling more accurate representations of XML structure constraints, XML Schema also seeks to provide an XML styling to the process of constraining data. Schemas are actually XML documents that are both well-formed and valid. This allows parsers and other XML-aware applications to handle XML Schema documents in a fashion similar to other XML documents, as opposed to employing special techniques as are needed for handling DTD documents.

Because XML Schema is a young and still incomplete specification, we will treat it only lightly here. In addition, details of the implementation of XML Schema are subject to change; if you have problems with some of the examples, you may want to consult the latest version of the XML Schema proposal at http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/. You should also be aware that many XML parsers do not support XML Schema, or support only portions of the specification. You should check with your vendor to verify the level of XML Schema support provided by your XML parser.

There is also a difference between a valid document and a schema-valid document. Because XML Schema is not part of the XML 1.0 specification, a document that conforms to a given schema is not said to be valid. Only an XML document conforming to a referenced DTD through a DOCTYPE declaration is considered a valid XML document. This has caused quite a bit of confusion in the XML community as to how to handle schema validation. In addition to the difference in terms of validity, an XML 1.0 parser or application does not have to perform schema validation, again because XML Schema is not in the 1.0 specification of XML. This means that even if your document has a schema reference, the document may not be validated against that schema, regardless of the parser’s level of schema support. For these reasons, you should take care to determine when your parser will and will not validate, and specifically how it handles schema validation. For clarity, we will continue to use validity as the single term, representing either schema or DTD validity. It will be up to you to see whether a DOCTYPE declaration or a schema reference exists; in addition, the meaning of the word will be clear from the context in which it is used. Any possible ambiguities will be expressly defined and handled in the appropriate portion of the text.

The most significant aspect of creating a schema for your XML document is that you will actually be creating another XML document. Unlike DTDs, which use an entirely different format for specification of elements and definition of attributes, a schema is simply an XML document. For this reason, the syntax will be largely the same as we have already discussed in Chapter 2. Interestingly enough, XML Schema itself is constrained by a DTD. If this seems a little strange to you, consider that until XML Schema, DTDs were the only means of creating document constraints. For XML Schema to enforce validity, it must use a mechanism other than itself to define its own constraints. This other mechanism, then, must be a DTD. However, that initial DTD allows the creation of a schema, which in turn allows all other XML documents to disregard DTDs completely. This rather odd flow of logic is not unusual in the world of specifications and evolving versions; new versions must be shaped by old versions.

The Schema Namespace

You should expect XML Schema documents to begin with a standard XML declaration, and then to refer to the XML Schema namespace. This is exactly correct. In addition, there are standards for the naming of the root element. The accepted practice is to always use schema as the root element of XML Schema documents, and we will not deviate from that standard here. When we specify the root element, we also need to make some namespace definitions, much as we did in our original XML document. The first thing needed is the default namespace declaration:

<schema xmlns="http://www.w3.org/1999/XMLSchema" >

We briefly discussed this in Chapter 2; omitting an identifier after the xmlns attribute results in a default namespace being applied to the document. In our original XML document, our namespace definition was specifically for the JavaXML namespace:

<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml" >

This told the XML parser that all elements prefixed with JavaXML belonged to that namespace, associated with the given URL. In our XML document, that was all elements, as all elements had this namespace prefix. However, we could also have had additional elements within the document that were not prefixed with a namespace. Elements without a prefix don’t simply disappear; they too must be assigned to a namespace. These would be considered part of the default namespace, which is not defined in the document. It could be defined with an additional namespace declaration in our root element:

<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml" 
              xmlns="http://www.someOtherUrl.com"
>

This would result in any element not prefixed with JavaXML or another namespace prefix being associated with the default namespace, identified by the URL http://www.someOtherUrl.com. So in the following document fragment, Book, Contents, and Title are associated with the JavaXML namespace, while element1 and element2 are associated with the default namespace:

<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
              xmlns="http://www.someOtherUrl.com"
>
  <JavaXML:Title>My Title</JavaXML:Title>
  <JavaXML:Contents>
    <element1>
      <element2 />
    </element1>
  </JavaXML:Contents>

</JavaXML:Book>

Because our schema will be dealing with another document, all elements specifically related to XML Schema constructs should be part of the default namespace. For this reason, we included the default namespace definition. However, these element constructs are acting upon the namespace within the constrained XML document. In other words, although XML Schema constructs are part of the XML Schema namespace, they are used to constrain elements in other namespaces, namely those of the XML document or documents they operate upon. In our continuing example, that would be the JavaXML namespace. So we need to add this additional namespace definition to our schema element:

<schema xmlns="http://www.w3.org/1999/XMLSchema" 
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
>

Finally, we need to let our schema know that the target of its constraints is on this second namespace. To do that, the targetNamespace attribute is specified, which does exactly what it implies:

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml" 
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
>

So we end up with two namespaces defined (the default and JavaXML), and the target of the constraints set forth in the document being associated with the latter namespace (JavaXML). And with our root element defined, we are ready to begin setting constraints on this namespace. Also keep in mind that it is possible, in the world of HTTP and web servers, that the URL referred to in a namespace might actually be a valid URL; in our example, you could type http://www.oreilly.com/catalog/javaxml into your web browser and get an HTML response. However, the document returned is not being used here; in fact, the URL itself does not have to be accessible, but instead is only used as an association for the declared namespace. This has caused quite a bit of confusion, so don’t get tripped up by what the URI specified is; instead, focus on the namespace being declared and how that namespace is used in the document.
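As a small illustration of that last point, the sketch below shows two root elements that a namespace-aware parser treats as belonging to the same namespace, because only the URI identifies the namespace, not the prefix bound to it. The jx prefix and the samples wrapper element are made up for this illustration and do not appear elsewhere in the chapter.

```xml
<!-- Hypothetical wrapper element, used only to show both forms side by side -->
<samples>

  <!-- The prefix used throughout this chapter -->
  <JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">
    <JavaXML:Title>My Title</JavaXML:Title>
  </JavaXML:Book>

  <!-- A different (made-up) prefix bound to the same URI: same namespace -->
  <jx:Book xmlns:jx="http://www.oreilly.com/catalog/javaxml">
    <jx:Title>My Title</jx:Title>
  </jx:Book>

</samples>
```

Nothing is fetched from the URI in either case; it serves purely as a name.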

A short note is in order before continuing. This may seem a tough section to read through; if so, don’t feel as if you aren’t up to the task. The concepts involved in XML Schema are not trivial, and the specification is continuing to evolve. Although many content authors will use XML Schema, you are now learning to understand it; this subtle but important difference will result in more intelligent design choices and better applications. Of particular complexity is how DTDs and namespaces are used within schemas; happily, many of the constructs for constraining XML are more straightforward. So take heart, read slowly and with caffeine nearby, and continue on! It will be worth the time and effort in the long run.

Specifying Elements

We have come a long way since you first saw this heading in the section on DTDs. In a schema, specifying an element will feel quite a bit more logical. It also closely mirrors the structure, if not the syntax, of a Java declaration, with some additional options that can be specified. The element element is used for these specifications:

<element name="[Name of Element]" 
         type="[Type of Element]"
         [Options...]
>

Here, [Name of Element] is the name of the element in the XML document being constrained. However, unlike DTDs, the namespace of the element should not prefix the element. Remember our discussion of the target namespace? Because we have said that our target namespace is JavaXML, all element specifications, as well as any user-defined types we create, are applied and assigned to that target namespace. This also aids in creating a cleaner schema, as the elements are defined and then the namespace applied. [Type of Element] is either a predefined XML Schema data type or a user-defined data type. Table 4.4 lists the data types supported by the current version of XML Schema.

Table 4-4. XML Schema Data Types

Type | Subtypes | Purpose
string | NMTOKEN, language | Character strings
boolean | N/A | Binary valued logic (true or false)
float | N/A | 32-bit floating point type
double | N/A | 64-bit floating point type
decimal | integer | Standard decimal notation, positive and negative
timeInstant | N/A | A combination of date and time representing one single instant of time
timeDuration | N/A | A duration of time
recurringInstant | date, time | A specific time that recurs over a timeDuration
binary | N/A | Binary data
uri | enumeration | A Uniform Resource Identifier (URI)

Although we will only use a few of these in our examples, you can see that XML Schema provides a much more comprehensive set of data types than DTDs.
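As a rough sketch of how these type names are used, the fragment below declares a few elements with built-in types from Table 4-4, following the draft syntax used in this chapter. The element names (PageCount, InPrint, Price, and Homepage) are invented for illustration and are not part of our running example.

```xml
<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <!-- Hypothetical elements; only the type names come from Table 4-4 -->
  <element name="PageCount" type="integer" />  <!-- subtype of decimal -->
  <element name="InPrint"   type="boolean" />  <!-- true or false -->
  <element name="Price"     type="decimal" />  <!-- standard decimal notation -->
  <element name="Homepage"  type="uri" />      <!-- a URI -->

</schema>
```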

Start at the bottom

Complex data types, defined by the user, are also possible within schemas. These types consist of combinations of elements. For example, we can define a Book type as being made up of a Title element, a Contents element, and a Copyright element (realize that we have stopped using the namespace when referring to elements, as XML Schema sees only the element name, and later applies the namespace). These elements can in turn be user-defined types, made up of more elements. What results is a sort of hierarchical pyramid; at the base of this pyramid are elements with basic XML Schema data types. Built on this base are layers of user-defined types, until the root element is finally defined at the top of the pyramid.

Because of this structure, it is generally wise to start with the elements that comprise the base of the hierarchy; in other words, those elements that can be defined as standard XML Schema data types. This is a bit different than in DTDs, where the order of the elements within the XML document is typically followed, but it does result in an easier schema creation process. Looking at our XML document, we can determine which elements are “primitive” data types, shown in Table 4.5.

Table 4-5. “Primitive” Elements

Element Name | Type
Title | string
Heading | string
Topic | string

With these elements determined, we can add each to our schema (see Example 4.12). For clarity, the example schema we build will omit the XML declaration and DOCTYPE declaration; although these will be a part of the final schema, they are left out to avoid clutter until the end of our schema creation.

Example 4-12. XML Schema with “Primitive” Elements

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
>

  <element name="Title" type="string" />
  <element name="Heading" type="string" />
  <element name="Topic" type="string" />

</schema>

If adding those elements seemed a little too easy to believe, great! It is that easy. By defining these “base” or “primitive” elements, we can now go on to construct our more complex elements.

User-defined data types

Similar to the way we started with our most atomic elements, we want to begin constructing our more complex elements at the bottom of the hierarchical pyramid of our document. This almost always means starting with the most nested level of elements and working outwards until the root element is reached. The most deeply nested elements in our example are Heading and Topic. Since we have already specified these elements as primitives, we can move outward a level, reaching the Chapter element. This element will be our first user-defined element, and it should be specified as being made up of one Heading element and one or more Topic elements. The complexType element within XML Schema allows us to define complex data types:

<complexType name="[Name of Type]" >
  <[Element Specification]>
  <[Element Specification]>
  ...
</complexType>

By defining this named type, we can then assign the new type to our element. For our Chapter element, we can now create a ChapterType data type:

<complexType name="ChapterType" >
  ...
</complexType>

This creates the type, and of course makes that type a part of our target namespace, JavaXML. So to assign the type to our Chapter element, we can use the following element specification:

<element name="Chapter" type="JavaXML:ChapterType" />

Now whatever element structure we specify within the ChapterType element type will determine the constraints on the Chapter element. Also notice that the type of element referred to is JavaXML:ChapterType, not simply ChapterType. When the type was created, it was created within the target namespace, JavaXML. But the elements we have been using within the schema (element, complexType, etc.) are not prefixed with a namespace, as they belong to the default namespace, which is the XML Schema namespace. So if we tried to specify the type as simply ChapterType, the parser would search the default namespace (that of XML Schema) for the type, not find the type, and raise an exception. To tell our parser where to find the type definition, we must give it the correct namespace, which in this case is JavaXML.
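The lookup rule just described can be sketched as follows. The fragment uses only constructs already introduced in this section; the commented-out line shows the form that, per the discussion above, would fail to resolve.

```xml
<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <!-- Unprefixed type name: looked up in the default namespace,
       which is the XML Schema namespace, where string is defined -->
  <element name="Title" type="string" />

  <!-- Prefixed type name: looked up in the target namespace,
       where ChapterType is created below -->
  <element name="Chapter" type="JavaXML:ChapterType" />

  <!-- Would not resolve: there is no ChapterType in the XML Schema namespace
  <element name="Chapter" type="ChapterType" />
  -->

  <complexType name="ChapterType">
    <!-- Element specifications for the type go here -->
  </complexType>

</schema>
```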

With the empty type in place, we now need to fill in the details. For this element, we need to define within the schema the two elements that should be nested within this type. Because we have already specified the two elements that are nested (the Heading and Topic element primitives), we must refer to those element specifications from within our new type:

<complexType name="ChapterType" >
  <element ref="JavaXML:Heading" />
  <element ref="JavaXML:Topic" />
</complexType>

The ref attribute tells the XML parser that the definition for the element named is in another part of the schema. As in the case of specifying a type, we must tell the parser which namespace the elements are specified within, which is usually the target namespace. However, this is a bit redundant and verbose. We define the two elements as primitives, and then refer to them, resulting in four lines within our schema. But these elements are not used anywhere else within our document, so wouldn’t it be clearer if we could define the element within the type? This would avoid having to refer to the element, causing readers of your schema to have to scan through the rest of the schema to find an element that is only used here. In fact, this is exactly what you should do here. Element specifications can be nested within user-defined types, so we can refine our schema to be more self-documenting:

<element name="Title" type="string" />
<element name="Chapter" type="JavaXML:ChapterType" />

<complexType name="ChapterType">
  <element name="Heading" type="string" />
  <element name="Topic" type="string" />
</complexType>

In addition to removing needless lines of XML, we have removed extra references to the JavaXML namespace, which may help reduce confusion for newer XML authors when reading through your schema. With our new knowledge of user-defined types, we can define the rest of our XML documents’ elements, as in Example 4.13.

Example 4-13. XML Schema with All Elements Defined

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
>

  <element name="Book" type="JavaXML:BookType" />

  <complexType name="BookType">
    <element name="Title" type="string" />
    <element name="Contents" type="JavaXML:ContentsType" />
    <element name="Copyright" type="string" />
  </complexType>

  <complexType name="ContentsType">
    <element name="Chapter" type="JavaXML:ChapterType" />
    <element name="SectionBreak" type="string" />
  </complexType>

  <complexType name="ChapterType">
    <element name="Heading" type="string" />
    <element name="Topic" type="string" />
  </complexType>

</schema>

This neatly and cleanly results in every XML element used being defined, as well as having a very readable schema. However, there are still a few problems.
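To make Example 4-13 concrete, here is a sketch of a document that should satisfy it. It assumes, as our running example does, that every element carries the JavaXML prefix; the text values are placeholders invented for illustration. Because no occurrence modifiers have been specified yet, each element appears exactly once, which hints at one of the problems still to be solved.

```xml
<?xml version="1.0"?>
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">
  <JavaXML:Title>Sample Title</JavaXML:Title>
  <JavaXML:Contents>
    <!-- With no occurrence modifiers, exactly one Chapter is allowed -->
    <JavaXML:Chapter>
      <JavaXML:Heading>Sample Heading</JavaXML:Heading>
      <!-- Exactly one Topic as well -->
      <JavaXML:Topic>Sample topic</JavaXML:Topic>
    </JavaXML:Chapter>
    <!-- SectionBreak is still typed as a string, and must appear once -->
    <JavaXML:SectionBreak>Sample break text</JavaXML:SectionBreak>
  </JavaXML:Contents>
  <JavaXML:Copyright>Sample copyright notice</JavaXML:Copyright>
</JavaXML:Book>
```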

Implicit types and empty content

So far we have used only named types, often called explicit types. An explicit type is one in which a name is given to the type, and the element that uses the type is generally in a different section of the file. This is very object-oriented, as the same explicit type could be used as the type for several different elements. However, there may be times when this level of structure is not needed; in other words, a type is so specific to the element it is assigned to that naming the type is not at all useful. In our example, we could consolidate the definition of the Chapter element by defining its type within its element definition. This is done using an implicit type, sometimes called a nameless type:

<complexType name="ContentsType" >
  <element name="Chapter">
    <complexType>
      <element name="Heading" type="string" />
      <element name="Topic" type="string" />
    </complexType>
  </element>
  <element name="SectionBreak" type="string" />
</complexType>

This implicit type allows even more “streamlining” of a schema. However, no other element can be of the same type as defined within an implicit type, unless another implicit type is defined. In other words, only use implicit types when you are positive that the type will never be needed by multiple elements.

Implicit types are useful for more than user-defined data types; they can also be used to specify information about the elements they define. For example, we have so far defined the type of SectionBreak as a string. This isn’t really accurate, as we want the element to be empty. We can define the content of the element as empty by using an implicit type:

<element name="SectionBreak" >
  <complexType content="empty" />
</element>

This may seem a little strange; why can’t we simply assign an “empty” data type to the element? Did the XML Schema authors leave this out? Actually, just the reverse; earlier versions of the XML Schema specification defined an empty data type, but it has since been removed. This is to require the definition of an element type. To see why, consider that most elements that are empty may have attributes that are used to specify data:

<img src="images/myGif.gif" />
<comment text="Here is a comment" />

In these cases, specifying the type as empty would not allow an intuitive way to define what attributes are allowed for the empty element. However, by using a type for the element, this can be defined:

<element name="img" >
  <complexType content="empty">
    <attribute name="src" type="string" />
  </complexType>
</element>

We will talk more about how these attributes are defined in the next section. For now, though, you should see that using implicit types can help us design our schema more intuitively, as well as allow the definition of more element properties, such as an element being empty.
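To see what an empty content model permits in practice, consider the following sketch of document fragments. The samples wrapper element and the text in the rejected example are invented for illustration.

```xml
<!-- Hypothetical wrapper element, used only to group the samples -->
<samples xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <!-- Both forms are empty elements as far as XML is concerned -->
  <JavaXML:SectionBreak />
  <JavaXML:SectionBreak></JavaXML:SectionBreak>

  <!-- Empty content, but still carrying data in an attribute -->
  <img src="images/myGif.gif" />

  <!-- Would be rejected by content="empty": the element has text content
  <JavaXML:SectionBreak>End of Part One</JavaXML:SectionBreak>
  -->

</samples>
```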

How many?

The last item left to specify in our elements is their recurrence (or lack thereof!). A schema behaves similarly to a DTD in that for an element specification with no modifiers, the element must appear exactly one time. This is not always the desired case, as we found out in DTDs. Our book may have many chapters, may have no section break, and might have some chapters with headings and some without. We need to be able to specify these details in our schema. Like DTDs, there is a mechanism to do this, but unlike DTDs, an intuitive set of attributes is provided to specify these details, instead of the more cryptic recurrence operators in DTDs (?, +, *). In XML Schema, the attributes minOccurs and maxOccurs are used within an element specification:

<element name="[Element Name]" 
         type="[Element Type]"
         minOccurs="[Minimum times allowed to occur]"
         maxOccurs="[Maximum times allowed to occur]"
>

Both these attributes, when unspecified, default to the value “1”, resulting in the single required element per definition already discussed. If there is no finite maximum, the wildcard character * (an asterisk) can be used as the value of maxOccurs to allow the element to occur an unlimited number of times. These constructs allow easy additions to our schema setting the recurrence constraints on our defined elements, as shown in Example 4.14.

Example 4-14. XML Schema Complete with Element Specifications

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <element name="Book" type="JavaXML:BookType" />

  <complexType name="BookType">
    <element name="Title" type="string" />
    <element name="Contents" type="JavaXML:ContentsType" />
    <element name="Copyright" type="string" />
  </complexType>

  <complexType name="ContentsType">
    <element name="Chapter" maxOccurs="*">
      <complexType>
        <element name="Heading" type="string" minOccurs="0" />
        <element name="Topic" type="string" maxOccurs="*" />
      </complexType>
    </element>
    <element name="SectionBreak" minOccurs="0" maxOccurs="*">
      <complexType content="empty" />
    </element>
  </complexType>

</schema>

Looking at this, we have defined a single root element, Book, of type BookType. This element has three immediate child elements: Title, Contents, and Copyright. Of these, two are primitive XML strings, and the third (Contents) is another user-defined type, ContentsType. This element type, in turn, has a child element Chapter, which can appear one or more times, and a child element SectionBreak, which doesn’t have to appear at all. The Chapter element has two nested elements, Heading and Topic. Each is a primitive XML string, and while Heading can appear zero or one times, Topic can appear one or more times. The SectionBreak element can appear zero or more times, and is an empty element. Our schema now has all the elements specified and detailed; all that is left is to add the attributes to the schema.
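The following sketch shows a document that should satisfy Example 4-14, with comments tying each occurrence constraint back to the familiar DTD operators. The text values are placeholders, and the sample assumes that child elements appear in the order the schema lists them.

```xml
<?xml version="1.0"?>
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">
  <JavaXML:Title>Sample Title</JavaXML:Title>
  <JavaXML:Contents>

    <!-- Chapter: maxOccurs="*", so one or more (like + in a DTD) -->
    <JavaXML:Chapter>
      <!-- Heading: minOccurs="0", so optional (like ? in a DTD) -->
      <JavaXML:Heading>First Heading</JavaXML:Heading>
      <!-- Topic: maxOccurs="*", so one or more -->
      <JavaXML:Topic>First topic</JavaXML:Topic>
      <JavaXML:Topic>Second topic</JavaXML:Topic>
    </JavaXML:Chapter>

    <!-- A second Chapter, this time with no Heading -->
    <JavaXML:Chapter>
      <JavaXML:Topic>Another topic</JavaXML:Topic>
    </JavaXML:Chapter>

    <!-- SectionBreak: minOccurs="0" maxOccurs="*",
         so zero or more (like * in a DTD) -->
    <JavaXML:SectionBreak />

  </JavaXML:Contents>
  <JavaXML:Copyright>Sample copyright notice</JavaXML:Copyright>
</JavaXML:Book>
```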

Defining Attributes

The process of defining attributes is much simpler than that of specifying elements, primarily because many of the considerations within elements are not present when determining what attributes can be used for an element. By default, an attribute does not have to appear, and nesting concerns are not relevant, as attributes are not nested within other attributes. Although there are many advanced constructs that can be used to handle attribute constraints, we only look at some of the basic ones we need to constrain our XML document. The XML Schema specification should be consulted for the more advanced features that XML Schema offers in regards to attribute definitions.

What’s left out

There are some important omissions when constraining attributes for an XML document; all of these relate to the various namespace definitions in the referring document. An XML document, as discussed, must make several namespace definitions to refer to a schema, plus those definitions that apply to its own content. These are all accomplished through the xmlns:[Namespace] attribute in the root document element. None of these attributes should be defined in a schema. Trying to define every allowed namespace would result in a very confusing schema. Additionally, the location of the namespace declaration does not have to be fixed; as long as the namespace is available to all elements within it, the declaration can be relocated. For these reasons, the XML Schema group allows the omission of all namespace attribute definitions within a schema.

If you remember our section on DTDs, this is quite a change. For our DTD, we had to make an attribute definition as follows to allow the namespace declarations we made in our XML document:

<!ATTLIST JavaXML:Book
      xmlns:JavaXML CDATA #REQUIRED
>

To use a DTD, we didn’t have to do anything but specify the namespace in our XML document, as DTDs don’t have any “knowledge” of XML namespaces. This is a bit more complicated in XML Schema.

If you remember from our introductory discussion, there are actually three different attributes that are used to specify a schema for a document. These are repeated here to refresh your memory:

<addressBook xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
             xmlns="http://www.oreilly.com/catalog/javaxml"
             xsi:schemaLocation="http://www.oreilly.com/catalog/javaxml/
                                 mySchema.xsd"
>

If you were going to write a schema based on your knowledge of DTDs, you would probably get ready to declare that the xmlns:xsi, xmlns, and xsi:schemaLocation attributes are all legal attributes for this root element. However, these declarations can be omitted, as XML Schema is namespace-aware, and is “smart” enough not to require that such attributes be defined in the schema for the XML document being constrained.

The definition

The attribute definition is accomplished through XML Schema’s attribute element (confusing, isn’t it?). In other words, similar to the element element, XML Schema defines an attribute element by which to specify which attributes are allowed for the enclosing element or type definition. The format of these is:

<attribute name="[Name of attribute]"
           type="[Type of Attribute]"
           [Attribute Options]
>

This should look very similar to how elements are defined, and in fact is almost identical. The same data types are available for attributes as for elements. This means we can very easily add the attribute definitions to our schema. For any element with a type defined, we add the needed attributes within the type definition. For elements that do not currently have a type defined, we must add one. This is to let our schema know that the attributes we are declaring “belong” to the enclosing element type. In these new element types, we can specify the content type with the content attribute of the complexType element, preserving the original constraints, and add the attribute definitions. These changes result in the schema shown in Example 4.15.

Example 4-15. XML Schema with Attribute Definitions

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <element name="Book" type="JavaXML:BookType" />

  <complexType name="BookType">
    <element name="Title" type="string" />
    <element name="Contents" type="JavaXML:ContentsType" />
    <element name="Copyright" type="string" />
  </complexType>

  <complexType name="ContentsType">
    <element name="Chapter" maxOccurs="*">
      <complexType>
        <element name="Heading" type="string" minOccurs="0" />
        <element name="Topic" maxOccurs="*">
          <complexType content="string"> 
            <attribute name="subSections" type="integer" />
          </complexType>
        </element>
        <attribute name="focus" type="string" />
      </complexType>
    </element>
    <element name="SectionBreak" minOccurs="0" maxOccurs="*">
      <complexType content="empty" />
    </element>
  </complexType>

</schema>

You can see in the Topic element that we must create a type for the purpose of defining the subSections attribute. Within this type, we use the content attribute to require that the element’s content be of type string, while the subSections attribute itself is declared as an integer. This is the same functionality we used earlier to give SectionBreak a content of empty to ensure it remained an empty element. The other attributes added required less modification, as types already existed for these more complex elements.
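As a sketch of what these attribute definitions allow, the fragment below shows a Chapter that should satisfy Example 4-15. The text and attribute values are placeholders invented for illustration.

```xml
<JavaXML:Chapter xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml"
                 focus="XML">
  <JavaXML:Heading>Sample Heading</JavaXML:Heading>
  <!-- subSections must be an integer; the element content is a string -->
  <JavaXML:Topic subSections="3">Sample topic</JavaXML:Topic>
  <!-- subSections is not required, so it can be left off -->
  <JavaXML:Topic>Another sample topic</JavaXML:Topic>
</JavaXML:Chapter>
```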

Required attributes, default values, and enumerations

All that is left to complete our schema is a set of odds and ends in our attribute definitions. Remember that we used the keywords #IMPLIED, #FIXED, and #REQUIRED to specify if attributes had to appear and whether they were assigned default values if not included in an XML document. As in the case of the recurrence operators on elements, XML Schema has refined how these constraints are notated, making them clearer. For requiring an attribute, the same minOccurs attribute used for element specifications can be used, and assigning a value of “1” effectively makes an attribute mandatory. In our example, if we wanted to ensure that an attribute called section is required for the Chapter element, we could add a line as follows:

<attribute name="section" type="string" minOccurs="1" />

Although we mentioned that the default for elements was for any defined element to occur a single time (minOccurs would default to 1), attributes are not required, and minOccurs defaults to 0 when defining an attribute.
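A brief sketch of the effect, assuming the section attribute definition above has been added to the Chapter element’s type; the attribute and text values are placeholders:

```xml
<JavaXML:Contents xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <!-- Satisfies minOccurs="1": the section attribute is present -->
  <JavaXML:Chapter section="1">
    <JavaXML:Topic>Sample topic</JavaXML:Topic>
  </JavaXML:Chapter>

  <!-- Would be rejected: the required section attribute is missing
  <JavaXML:Chapter>
    <JavaXML:Topic>Sample topic</JavaXML:Topic>
  </JavaXML:Chapter>
  -->

</JavaXML:Contents>
```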

The notion of a fixed value for attributes (#FIXED) is not employed in XML Schema; as we discussed earlier, it is not used commonly and is not an intuitive construct. However, specifying a default value for an attribute is a useful construct, and is handled quite simply by the default attribute of an attribute definition. For example, we determined that the default value for the focus attribute of the Chapter element should be “Java”:

<attribute name="focus" type="string" default="Java" />
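To illustrate, the sketch below shows three Chapter elements as they might appear in a document. It assumes a parser that supports schema attribute defaults; such a parser should treat the first two chapters as having the same focus. The topic text is a placeholder.

```xml
<JavaXML:Contents xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <!-- focus omitted: the default value "Java" applies -->
  <JavaXML:Chapter>
    <JavaXML:Topic>Sample topic</JavaXML:Topic>
  </JavaXML:Chapter>

  <!-- focus written out: equivalent to the chapter above -->
  <JavaXML:Chapter focus="Java">
    <JavaXML:Topic>Sample topic</JavaXML:Topic>
  </JavaXML:Chapter>

  <!-- An explicit value takes the place of the default -->
  <JavaXML:Chapter focus="XML">
    <JavaXML:Topic>Sample topic</JavaXML:Topic>
  </JavaXML:Chapter>

</JavaXML:Contents>
```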

Hopefully, you are starting to love the simplicity and elegance of XML Schema! The intuitive choices of element and attribute names go a long way towards making XML significantly easier to constrain than with the mechanism that DTDs provided. To demonstrate this even further, let’s look at the final option we want to use: enumerations.

For our focus attribute, we had used our DTD to specify that only the values Java and XML were allowed. Using parenthetical notation and the OR operator, we handled this like so:

<!ATTLIST JavaXML:Chapter
      focus (XML|Java) "Java"
>

While this isn’t necessarily difficult, it is also not necessarily intuitive. The values allowed are not even in quotation marks, which is the de facto standard for representing data values. XML Schema, while requiring more lines of schema to achieve the same effect, makes this type of constraint much easier to follow. The attribute definition is opened up, and a simpleType element is used. This element allows an existing data type, such as string, to be narrowed in the values that it can take on. In this case, we want to include the two allowed enumerative values that the attribute can take on. Each of these values is specified with the enumeration element. We specify the base type of the simpleType element with its base attribute. Using all this information in concert, we can change our attribute definition for the focus attribute:

<attribute name="focus" default="Java">
  <simpleType base="string">
    <enumeration value="XML" />
    <enumeration value="Java" />
  </simpleType>
</attribute>

Again, this is quite a bit more verbose than our DTD for the same resulting constraint, but significantly easier to understand and grasp, particularly for newer users of XML. With this change, we have now completed our schema, and set forth all the constraints of our earlier DTD, all in much more readable form (see Example 4.16).

Example 4-16. Completed XML Schema

<?xml version="1.0"?>

<schema targetNamespace="http://www.oreilly.com/catalog/javaxml"
        xmlns="http://www.w3.org/1999/XMLSchema"
        xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">

  <element name="Book" type="JavaXML:BookType" />

  <complexType name="BookType">
    <element name="Title" type="string" />
    <element name="Contents" type="JavaXML:ContentsType" />
    <element name="Copyright" type="string" />
  </complexType>

  <complexType name="ContentsType">
    <element name="Chapter" maxOccurs="*">
      <complexType>
        <element name="Heading" type="string" minOccurs="0" />
        <element name="Topic" maxOccurs="*">
          <complexType content="string"> 
            <attribute name="subSections" type="integer" />
          </complexType>
        </element>
        <attribute name="focus" default="Java">
          <simpleType base="string">
            <enumeration value="XML" />
            <enumeration value="Java" />
          </simpleType>
        </attribute>
      </complexType>
    </element>
    <element name="SectionBreak" minOccurs="0" maxOccurs="*">
      <complexType content="empty" />
    </element>
  </complexType>

</schema>
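To close the loop, here is a sketch of a complete document that should satisfy the finished schema. The text and attribute values are placeholders, and the attributes that would refer the document to the schema are left out for brevity. The commented-out chapter shows a focus value that the enumeration would not allow.

```xml
<?xml version="1.0"?>
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml">
  <JavaXML:Title>Sample Title</JavaXML:Title>
  <JavaXML:Contents>

    <!-- focus="XML" is one of the two enumerated values -->
    <JavaXML:Chapter focus="XML">
      <JavaXML:Heading>First Heading</JavaXML:Heading>
      <JavaXML:Topic subSections="2">First topic</JavaXML:Topic>
      <JavaXML:Topic>Second topic</JavaXML:Topic>
    </JavaXML:Chapter>

    <!-- focus omitted: the default, "Java", applies -->
    <JavaXML:Chapter>
      <JavaXML:Topic subSections="0">Another topic</JavaXML:Topic>
    </JavaXML:Chapter>

    <!-- Would be rejected: "Perl" is not an enumerated value for focus
    <JavaXML:Chapter focus="Perl">
      <JavaXML:Topic>Rejected topic</JavaXML:Topic>
    </JavaXML:Chapter>
    -->

    <JavaXML:SectionBreak />

  </JavaXML:Contents>
  <JavaXML:Copyright>Sample copyright notice</JavaXML:Copyright>
</JavaXML:Book>
```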
