XML Schema

Book Description

If you need to create or use formal descriptions of XML vocabularies, the W3C's XML Schema offers a powerful set of tools for defining acceptable document structures and content. An alternative to DTDs as the way to describe and validate data in an XML environment, XML Schema enables developers to create precise descriptions with a richer set of datatypes (such as booleans, numbers, currencies, dates and times) that are essential for today's applications.

Schemas are powerful, but that power comes with substantial complexity. This concise book explains the ins and outs of XML Schema, including design choices, best practices, and limitations. Particularly valuable are discussions of how the type structures fit with existing database and object-oriented program contexts. With XML Schema, you can define acceptable content models and annotate those models with additional type information, making them more readily bound to programs and objects. Schemas combine the easy interchange of text-based XML with the more stringent requirements of data exchange, and make it easier to validate documents based on namespaces.

You'll find plenty of examples in this book that demonstrate the details necessary for precise vocabulary definitions. Topics include:

  • Foundations of XML Schema syntax
  • Flat, "Russian doll", and other schema approaches
  • Working with simple and complex types in a variety of contexts
  • The built-in datatypes provided by XML Schema
  • Using facets to extend datatypes, including regular expression-based patterns
  • Using keys and uniqueness rules to limit how and where information may appear
  • Creating extensible schemas and managing extensibility
  • Documenting schemas and extending XML Schema capabilities through annotations

In addition to the explanatory content, XML Schema provides a complete reference to all parts of both the XML Schema Structures and XML Schema Datatypes specifications, as well as a glossary. Appendices explore the relationships between XML Schema and other tools for describing document structures, including DTDs, RELAX NG, and Schematron, as well as work in progress at the W3C to more tightly integrate XML Schema with existing specifications.

No matter how you intend to use XML Schema (for data structures or document structures, for standalone documents or part of SOAP transactions, for documentation, validation, or data binding), all the foundations you need are outlined in XML Schema.

Table of Contents

  1. A Note Regarding Supplemental Files
  2. Preface
    1. Who Should Read This Book?
    2. Who Should Not Read This Book?
    3. About the Examples
    4. Organization of This Book
    5. Conventions Used in This Book
    6. How to Contact Us
    7. Acknowledgments
  3. 1. Schema Uses and Development
    1. What Schemas Do for XML
      1. Validation
      2. Documentation
      3. Querying Support
      4. Data Binding
      5. Guided Editing
    2. W3C XML Schema
  4. 2. Our First Schema
    1. The Instance Document
    2. Our First Schema
    3. First Findings
      1. W3C XML Schema Is Modular
      2. W3C XML Schema Is Both About Structure and Datatyping
      3. Flat Design, Global Components
  5. 3. Giving Some Depth to Our First Schema
    1. Working From the Structure of the Instance Document
    2. New Lessons
      1. Depth Versus Modularity?
      2. Russian Doll and Object-Oriented Design
      3. Where Have the Element Types Gone?
  6. 4. Using Predefined Simple Datatypes
    1. Lexical and Value Spaces
    2. Whitespace Processing
    3. String Datatypes
      1. No Whitespace Replacement
      2. Normalized Strings
      3. Collapsed Strings
        1. Tokens
        2. Qualified names
        3. URIs
        4. Notations
        5. Binary string-encoded datatypes
    4. Numeric Datatypes
      1. Decimal Types
      2. Float Datatypes
      3. xs:boolean
    5. Date and Time Datatypes
      1. The Realm of ISO 8601
      2. Datatypes
    6. List Types
    7. What About anySimpleType?
    8. Back to Our Library
  7. 5. Creating Simple Datatypes
    1. Derivation By Restriction
      1. Facets
        1. Whitespace collapsed strings
          1. xs:enumeration
          2. xs:length
          3. xs:maxLength
          4. xs:minLength
          5. xs:pattern
        2. Other strings
          1. xs:whiteSpace
        3. Float datatypes
          1. xs:enumeration
          2. xs:maxExclusive
          3. xs:maxInclusive
          4. xs:minExclusive
          5. xs:minInclusive
          6. xs:pattern
        4. Date and time datatypes
          1. xs:enumeration
          2. xs:maxExclusive
          3. xs:maxInclusive
          4. xs:minExclusive
          5. xs:minInclusive
          6. xs:pattern
        5. Integer and derived datatypes
          1. xs:totalDigits
        6. Decimals
          1. xs:fractionDigits
        7. Booleans
          1. xs:pattern
        8. List datatypes
      2. Multiple Restrictions and Fixed Attribute
        1. Facet that can be changed but needs to be more restrictive
        2. Facet that cannot be changed
        3. Facet that performs the intersection of the lexical spaces
        4. Facet that does its job before the lexical space
        5. Fixed facets
    2. Derivation By List
      1. List Datatypes
    3. Derivation By Union
    4. Some Oddities of Simple Types
      1. Beware of the Order
      2. Using or Abusing Lists to Change the Behavior of Length Constraining Facets
    5. Back to Our Library
  8. 6. Using Regular Expressions to Specify Simple Datatypes
    1. The Swiss Army Knife
    2. The Simplest Possible Patterns
    3. Quantifying
    4. More Atoms
      1. Special Characters
      2. Wildcard
      3. Character Classes
        1. Classical Perl character classes
        2. Unicode character classes
        3. User-defined character classes
      4. Oring and Grouping
    5. Common Patterns
      1. String Datatypes
        1. Unicode blocks
        2. Counting words
        3. URIs
      2. Numeric and Float Types
        1. Leading zeros
        2. Fixed format
      3. Datetimes
        1. Time zones
    6. Back to Our Library
  9. 7. Creating Complex Datatypes
    1. Simple Versus Complex Types
    2. Examining the Landscape
      1. Content Models
      2. Named Versus Anonymous Types
      3. Creation Versus Derivation
    3. Simple Content Models
      1. Creation of Simple Content Models
      2. Derivation from Simple Contents
        1. Derivation by extension
        2. Derivation by restriction
        3. Comparison of these two methods
    4. Complex Content Models
      1. Creation of Complex Content
        1. Compositors and particles
        2. Element and attribute groups
        3. Unique Particle Attribution Rule
        4. Consistent Declaration Rule
        5. Limitations on unordered content models
          1. Limitations of xs:all
          2. Adapting the structure of your document
          3. Using xs:choice instead of xs:all
      2. Derivation of Complex Content
        1. Derivation by extension
        2. Derivation by restriction
        3. Asymmetry of these two methods
    5. Mixed Content Models
      1. Creating Mixed Content Models
      2. Derivation of Mixed Content Models
        1. Derivation by extension
        2. Derivation by restriction
        3. Derivation between complex and mixed content models
    6. Empty Content Models
      1. Creation of Empty Content Models
        1. As simple content models
        2. As complex content models
      2. Derivation of Empty Content Models
        1. Derivation by extension
        2. Derivation by restriction
      3. Simple or Complex Content Models for Empty Content Models?
    7. Back to Our Library
    8. Derivation or Groups
  10. 8. Creating Building Blocks
    1. Schema Inclusion
    2. Schema Inclusion with Redefinition
      1. Redefining of Simple and Complex Types
      2. Redefinition of Element and Attribute Groups
        1. Extension
        2. Restriction
    3. Other Alternatives
      1. External Parsed Entities
      2. XInclude
    4. Simplifying the Library
  11. 9. Defining Uniqueness, Keys, and Key References
    1. xs:ID and xs:IDREF
    2. XPath-Based Identity Checks
      1. Uniqueness
      2. Composite Fields
      3. Keys
      4. Key References
      5. Permitted XPath Expressions
    3. ID/IDREF Versus xs:key/xs:keyref
    4. Using xs:key and xs:unique As Co-occurrence Constraints
  12. 10. Controlling Namespaces
    1. Namespaces Present Two Challenges to Schema Languages
    2. Namespace Declarations
    3. To Qualify Or Not to Qualify?
    4. Disruptive Attributes
    5. Namespaces and XPath Expressions
    6. Referencing Other Namespaces
    7. Schemas for XML, XML Base and XLink
      1. XML Attributes
      2. XLink Attributes
    8. Namespace Behavior of Imported Components
    9. Importing Schemas with No Namespaces
    10. Chameleon Design
    11. Allowing Any Elements or Attributes from a Particular Namespace
  13. 11. Referencing Schemas and Schema Datatypes in XML Documents
    1. Associating Schemas with Instance Documents
    2. Defining Element Types
      1. Defining Simple Types
      2. Defining Complex Types
    3. Defining Nil (Null) Values
    4. Beware the Intrusive Nature of These Features...
  14. 12. Creating More Building Blocks Using Object-Oriented Features
    1. Substitution Groups
      1. Using a “Traditional” Group
      2. Substitution Groups
        1. Using substitution groups
        2. Abstract elements
        3. Trees of substitution groups
      3. Traditional Declarations or Substitution Groups?
      4. Fuzzy Recommendation
        1. Extension of xs:choice through group redefinitions
        2. Restricting substitution groups
    2. Controlling Derivations
      1. Attributes
      2. Elements
        1. Block attribute
        2. Final elements
        3. Abstract elements
      3. Complex Types
        1. Blocking complex types
        2. Final complex types
        3. Abstract complex types
      4. Simple Types
      5. Other Components and Redefinitions
  15. 13. Creating Extensible Schemas
    1. Extensible Schemas
      1. Global Components
        1. Elements
        2. Attributes
      2. final and fixed Attributes
      3. Splitting Schema Components
    2. The Need for Open Schemas
      1. xsi:type
      2. Wildcards
      3. And Substitution Groups?
  16. 14. Documenting Schemas
    1. Style Matters
      1. Keep It Simple
      2. Think Globally
      3. When It’s Similar, Show It
    2. The W3C XML Schema Annotation Element
    3. Foreign Attributes
    4. XML 1.0 Comments
    5. Which One and What For?
  17. 15. Elements Reference Guide
    1. xs:all(outside a group) — Compositor describing an unordered group of elements.
    2. xs:all(within a group) — Compositor describing an unordered group of elements. The number of occurrences cannot be defined when xs:all is used within a group.
    3. xs:annotation — Informative data for human or electronic agents.
    4. xs:any — Wildcard to replace any element.
    5. xs:anyAttribute — Wildcard to replace any attribute.
    6. xs:appinfo — Information for applications.
    7. xs:attribute(global definition) — Global attribute definition that can be referenced within the same schema or by other schemas.
    8. xs:attribute(reference or local definition) — Reference to a global attribute definition or local definition (local definitions cannot be referenced).
    9. xs:attributeGroup(global definition) — Global attributes group declaration that can be referenced within the same schema or by other schemas.
    10. xs:attributeGroup(reference) — Reference to a global attributes group declaration.
    11. xs:choice(outside a group) — Compositor to define group of mutually exclusive elements or compositors.
    12. xs:choice(within a group) — Compositor to define group of mutually exclusive elements or compositors. The number of occurrences cannot be defined when xs:choice is used within a group.
    13. xs:complexContent — Definition of a complex content by derivation of a complex type.
    14. xs:complexType(global definition) — Global definition of a complex type that can be referenced within the same schema or by other schemas.
    15. xs:complexType(local definition) — Complex type local definition (local definitions cannot be referenced).
    16. xs:documentation — Human-targeted documentation.
    17. xs:element(global definition) — Global element definition that can be referenced within the same schema or by other schemas.
    18. xs:element(within xs:all) — Reference to a global element declaration or local definition (local definitions cannot be referenced). The number of occurrences can only be zero or one when xs:element is used within xs:all.
    19. xs:element(reference or local definition) — Reference to a global element declaration or local definition (local definitions cannot be referenced).
    20. xs:enumeration — Facet to restrict a datatype to a finite set of values.
    21. xs:extension(simple content) — Extension of a simple content model.
    22. xs:extension(complex content) — Extension of a complex content model.
    23. xs:field — Definition of the field to use for a uniqueness constraint.
    24. xs:fractionDigits — Facet to define the number of fractional digits of a numerical datatype.
    25. xs:group(definition) — Global elements group declaration that can be referenced within the same schema or by other schemas.
    26. xs:group(reference) — Reference to a global elements group declaration or local definition (local definitions cannot be referenced).
    27. xs:import — Import of a W3C XML Schema for another namespace.
    28. xs:include — Inclusion of a W3C XML Schema for the same target namespace.
    29. xs:key — Definition of a key.
    30. xs:keyref — Definition of a key reference.
    31. xs:length — Facet to define the length of a value.
    32. xs:list — Derivation by list.
    33. xs:maxExclusive — Facet to define a maximum (exclusive) value.
    34. xs:maxInclusive — Facet to define a maximum (inclusive) value.
    35. xs:maxLength — Facet to define a maximum length.
    36. xs:minExclusive — Facet to define a minimum (exclusive) value.
    37. xs:minInclusive — Facet to define a minimum (inclusive) value.
    38. xs:minLength — Facet to define a minimum length.
    39. xs:notation — Declaration of a notation.
    40. xs:pattern — Facet to define a regular expression pattern constraint.
    41. xs:redefine — Inclusion of a W3C XML Schema for the same namespace with possible override.
    42. xs:restriction(simple type) — Derivation of a simple datatype by restriction.
    43. xs:restriction(simple content) — Derivation of a simple content model by restriction.
    44. xs:restriction(complex content) — Derivation of a complex content model by restriction.
    45. xs:schema — Document element of a W3C XML Schema.
    46. xs:selector — Definition of the path selecting an element for a uniqueness constraint.
    47. xs:sequence(outside a group) — Compositor to define an ordered group of elements.
    48. xs:sequence(within a group) — Compositor to define an ordered group of elements. The number of occurrences cannot be defined when xs:sequence is used within a group.
    49. xs:simpleContent — Simple content model declaration.
    50. xs:simpleType(global definition) — Global simple type declaration that can be referenced within the same schema or by other schemas.
    51. xs:simpleType(local definition) — Local simple type definition (local definitions cannot be referenced).
    52. xs:totalDigits — Facet to define the total number of digits of a numeric datatype.
    53. xs:union — Derivation of simple datatypes by union.
    54. xs:unique — Definition of a uniqueness constraint.
    55. xs:whiteSpace — Facet to define whitespace behavior.
  18. 16. Datatype Reference Guide
    1. xs:anyURI — URI (Uniform Resource Identifier).
    2. xs:base64Binary — Binary content coded as “base64”.
    3. xs:boolean — Boolean (true or false).
    4. xs:byte — Signed value of 8 bits.
    5. xs:date — Gregorian calendar date.
    6. xs:dateTime — Instant of time (Gregorian calendar).
    7. xs:decimal — Decimal numbers.
    8. xs:double — IEEE 64 bit floating point.
    9. xs:duration — Time durations.
    10. xs:ENTITIES — Whitespace separated list of unparsed entity references.
    11. xs:ENTITY — Reference to an unparsed entity.
    12. xs:float — IEEE 32 bit floating point.
    13. xs:gDay — Recurring period of time: monthly day.
    14. xs:gMonth — Recurring period of time: yearly month.
    15. xs:gMonthDay — Recurring period of time: yearly day.
    16. xs:gYear — Period of one year.
    17. xs:gYearMonth — Period of one month.
    18. xs:hexBinary — Binary contents coded in hexadecimal.
    19. xs:ID — Definition of unique identifiers.
    20. xs:IDREF — Definition of references to unique identifiers.
    21. xs:IDREFS — Definition of lists of references to unique identifiers.
    22. xs:int — 32 bit signed integers.
    23. xs:integer — Signed integers of arbitrary length.
    24. xs:language — RFC 1766 language codes.
    25. xs:long — 64 bit signed integers.
    26. xs:Name — XML 1.0 names.
    27. xs:NCName — Unqualified names.
    28. xs:negativeInteger — Strictly negative integers of arbitrary length.
    29. xs:NMTOKEN — XML 1.0 name token (NMTOKEN).
    30. xs:NMTOKENS — List of XML 1.0 name token (NMTOKEN).
    31. xs:nonNegativeInteger — Integers of arbitrary length positive or equal to zero.
    32. xs:nonPositiveInteger — Integers of arbitrary length negative or equal to zero.
    33. xs:normalizedString — Whitespace-replaced strings.
    34. xs:NOTATION — Emulation of the XML 1.0 feature.
    35. xs:positiveInteger — Strictly positive integers of arbitrary length.
    36. xs:QName — Namespaces in XML qualified names.
    37. xs:short — 16 bit signed integers.
    38. xs:string — Any string.
    39. xs:time — Point in time recurring each day.
    40. xs:token — Whitespace-replaced and collapsed strings.
    41. xs:unsignedByte — Unsigned value of 8 bits.
    42. xs:unsignedInt — Unsigned integer of 32 bits.
    43. xs:unsignedLong — Unsigned integer of 64 bits.
    44. xs:unsignedShort — Unsigned integer of 16 bits.
  19. A. XML Schema Languages
    1. What Is a XML Schema Language?
      1. XML Schema Languages Are Not Schemas
      2. Firewalls Against Diversity
      3. Intrusive Modeling Tools
      4. Early Binding Tools
    2. Classification of XML Schema Languages
      1. Rule-Based XML Schema Languages
      2. Grammar-Based XML Schema Languages
      3. Object-Oriented XML Schema Languages
    3. A Short History of XML Schema Languages
      1. The DTD Family
      2. The W3C XML Schema Family
      3. The RELAX NG Family
      4. Schematron
      5. Examplotron
    4. Sample Application
    5. XML DTDs
      1. Example
    6. W3C XML Schema
      1. Example
    7. RELAX NG
      1. Example
    8. Schematron
      1. Example
    9. Examplotron
      1. Example
    10. Decisions
  20. B. Work in Progress
    1. W3C Projects
      1. XPath, XSLT, and XQuery
      2. DOM
      3. RDF
    2. ISO: DSDL
    3. Other
      1. PSVI Serialization
      2. APIs
      3. Schema Extensions: Error Messages
  21. Glossary
  22. Index
  23. About the Author
  24. Colophon
  25. Copyright