RELAX NG

Book description

As developers know, the beauty of XML is that it is extensible, even to the point that you can invent new elements and attributes as you write XML documents. Then, however, you need to define your changes so that applications can make sense of them, and this is where XML schema languages come into play. RELAX NG (pronounced "relaxing"), the Regular Language Description for XML Core--New Generation, is quickly gaining momentum as an alternative to other schema languages. Designed to solve a variety of common problems raised in the creation and sharing of XML vocabularies, RELAX NG is less complex than the W3C's XML Schema Recommendation and much more powerful and flexible than DTDs.

RELAX NG is a grammar-based schema language that's both easy to learn for schema creators and easy to implement for software developers. In the book RELAX NG, developers are introduced to this unique language and learn a no-nonsense method for creating XML schemas. The book offers a clear-cut explanation of RELAX NG that enables intermediate and advanced XML developers to focus on XML document structures and content rather than battle the intricacies of yet another convoluted standard.

The book covers the following topics in depth:

  • Introduction to RELAX NG
  • Building RELAX NG schemas using XML syntax
  • Building RELAX NG schemas using compact syntax, an alternative non-XML syntax
  • Flattening schemas to limit depth and provide reusability
  • Using external datatype libraries with RELAX NG
  • W3C XML Schema regular expressions
  • Writing extensible schemas
  • Annotating schemas
  • Generating schemas from different sources
  • Determinism and datatype assignment
and much more.

If you're looking for a schema language that's easy to use and won't leave you in a labyrinth of obscure limitations, RELAX NG is the language you should be using. And only O'Reilly's RELAX NG gives you the straightforward information and everything else you'll need to take advantage of this powerful and intelligible language.

Table of contents

  1. A Note Regarding Supplemental Files
  2. Foreword by James Clark
  3. Foreword by Murata Makoto
  4. Preface
    1. Who Should Read This Book?
    2. Who Shouldn’t Read This Book?
    3. Organization of This Book
    4. Conventions Used in This Book
    5. Comments and Questions
    6. Powered by WikiML
    7. Acknowledgments
  5. I. Tutorial
    1. 1. What RELAX NG Offers
      1. Diversity
      2. Keeping Documents Independent of Applications
      3. Validation Has Many Aspects
      4. The Best Way to Validate XML Document Structures
      5. RELAX NG’s Diverse Applications
      6. RELAX NG as a Pivot Format
      7. Why Use Other Schema Languages?
    2. 2. Simple Foundations Are Beautiful
      1. Documents and Infosets
      2. Different Types of Schema Languages
      3. A Simple Example
      4. A Strong Mathematical Background
      5. Patterns, and Only Patterns
    3. 3. First Schema
      1. Getting Started
      2. First Patterns
        1. The text Pattern
        2. The attribute Pattern
        3. The element Pattern
        4. The optional Pattern
        5. The oneOrMore Pattern
        6. The zeroOrMore Pattern
      3. Complete Schema
        1. Constraining Number of Occurrences
        2. Creating “Russian Doll” Schemas
    4. 4. Introducing the Compact Syntax
      1. First Compact Patterns
        1. The text Pattern
        2. The attribute Pattern
        3. Element
        4. The optional Pattern
        5. The oneOrMore Pattern
        6. The zeroOrMore Pattern
      2. Full Schema
      3. XML or Compact?
    5. 5. Flattening the First Schema
      1. Defining Named Patterns
      2. Referencing Named Patterns
      3. The grammar and start Elements
      4. Assembling the Parts
      5. Problems That Never Arise
      6. Recursive Models
      7. Escaping Named Pattern Identifiers in the Compact Syntax
    6. 6. More Complex Patterns
      1. The group Pattern
      2. The interleave Pattern
      3. The choice Pattern
      4. Pattern Compositions
      5. Order Variation as a Source of Information
      6. Text and Empty Patterns, Whitespace, and Mixed Content
      7. Why Is It Called interleave?
      8. Mixed Content Models with Order
      9. A Restriction Related to interleave
      10. A Missing Pattern: Unordered Group
    7. 7. Constraining Text Values
      1. Fixed Values
      2. Co-Occurrence Constraints
      3. Enumerations
      4. Whitespace and RELAX NG Native Datatypes
      5. Using String Datatypes in Attribute Values
      6. When to Use String Datatypes
      7. Using Different Types in Each Value
      8. Exclusions
      9. Lists
      10. Data Versus Text
    8. 8. Datatype Libraries
      1. W3C XML Schema Type Library
        1. The Datatypes
          1. String datatypes
          2. URIs
          3. Qualified names
          4. Binary string-encoded datatypes
          5. Numeric datatypes
          6. Date and time formats
          7. Examples
        2. Facets
      2. DTD Compatibility Datatypes
      3. Which Library Should Be Used?
        1. Native Types Versus W3C XML Schema Datatypes
        2. DTD Versus W3C XML Schema Datatypes
    9. 9. Using Regular Expressions to Specify Simple Datatypes
      1. A Swiss Army Knife
      2. The Simplest Possible Pattern Facets
      3. Quantifying
      4. More Atoms
        1. Special Characters
        2. Wildcard
        3. Character Classes
          1. Classical Perl character classes
          2. Unicode character classes
          3. User-defined character classes
        4. Or-ing and Grouping
      5. Common Patterns
        1. String Datatypes
          1. Unicode blocks
          2. Counting words
          3. URIs
        2. Numeric and Float Types
          1. Leading zeros
          2. Fixed format
        3. Datetimes
          1. Time zones
    10. 10. Creating Building Blocks
      1. Using External References
        1. With Russian Doll Schemas
        2. With Flat Schemas
        3. Embedding Grammars
        4. Referencing Patterns in Parent Grammars
      2. Merging Grammars
        1. Merging Without Redefinition
        2. Merging and Replacing Definitions
        3. Combining Definitions
          1. Combining by choice
          2. Combining by interleave
        4. Why Can’t Definitions Be Defined by Group?
      3. A Real-World Example: XHTML 2.0
      4. Other Options
        1. A Possible Use Case
        2. XML Tools
        3. Text Tools
    11. 11. Namespaces
      1. A Ten-Minute Guide to XML Namespaces
      2. The Two Challenges of Namespaces
      3. Declaring Namespaces in Schemas
        1. Using the Default Namespace
        2. Using Prefixes
      4. Accepting Foreign Namespaces
        1. Constructing a Wildcard
        2. Using Wildcards
        3. Where Should Foreign Nodes Be Allowed?
        4. Traps to Avoid
        5. Adding Foreign Nodes Through Combination
      5. Namespaces, Building Blocks, and Chameleon Design
        1. Reexamining XHTML 2.0
        2. Putting a Chameleon in the Library
        3. Good Chameleon or Evil Chameleon?
    12. 12. Writing Extensible Schemas
      1. Extensible Schemas
        1. Working from a Fixed Result
          1. Providing a grammar and a start element
          2. Maximize granularity
          3. Defining named patterns for content rather than for elements
        2. Free Formats
          1. Be cautious with attributes
          2. Use order sparingly
          3. Use containers
        3. Restricting Existing Schemas
      2. The Case for Open Schemas
        1. More Name Classes
      3. Extensible and Open?
    13. 13. Annotating Schemas
      1. Common Principles for Annotating RELAX NG Schemas
        1. Annotation Using the XML Syntax
        2. Annotations Using the Compact Syntax
          1. Grammar annotations
          2. Initial annotations
          3. Following annotations
          4. Assembling the annotation syntax
          5. When initial annotations turn into following annotations
        3. Annotating Groups of Definitions
        4. Alternatives and Workarounds
          1. Why reinvent XML 1.0 comments and PIs?
          2. Annotation of value and param patterns
      2. Documentation
        1. Comments
        2. RELAX NG DTD Compatibility Comments
        3. XHTML Annotations
        4. DocBook Annotations
        5. Dublin Core Annotations
        6. SVG Annotations
        7. RDDL Annotations
      3. Annotation for Applications
        1. Annotations for Preprocessing
        2. Annotations for Conversion
          1. Annotations to generate DTDs
          2. Annotations to generate W3C XML Schema
          3. Schema Adjunct Framework
        3. Annotations for Extension
          1. Embedded Schematron rules
          2. XVIF
    14. 14. Generating RELAX NG Schemas
      1. Examplotron: Instance Documents as Schemas
        1. Ten-Minute Guide to Examplotron
        2. Use Cases
      2. Literate Programming
        1. Out of the Box
        2. Adding Bells and Whistles for RDDL
      3. UML
      4. Spreadsheets
    15. 15. Simplification and Restrictions
      1. Simplification
        1. Annotation Removal, Whitespace and Attribute Normalization, and Inheritance
        2. Retrieval of External Schemas
        3. Name Class Normalization
        4. Pattern Normalization
        5. First Set of Constraints
        6. Grammar Merge
        7. Schema Flattening
        8. Final Cleanup
      2. Restrictions
        1. Constraints on Attributes
          1. Bad example: attribute content model
          2. Bad example: attribute duplication
          3. Bad example: name class overlap
        2. Constraints on Lists
          1. Bad example: list and interleave
        3. Constraints on Except Patterns
        4. Constraints on Start Patterns
        5. Constraints on Content Models
        6. Limitations on interleave
          1. Bad example: more than one text pattern in interleave
    16. 16. Determinism and Datatype Assignment
      1. What Is Ambiguity?
        1. Ambiguity Versus Determinism
        2. Different Kinds of Ambiguity
          1. Regular expression ambiguities
          2. Ambiguous regular hedge grammars
          3. Name class ambiguity
          4. Ambiguous datatypes
      2. The Downsides of Ambiguous and Nondeterministic Content Models
        1. Instance Annotations
        2. Compatibility with W3C XML Schema
      3. Some Ideas to Make Disambiguation Easier
        1. Generalizing the Except Pattern
        2. Making Disambiguation Rules Explicit
        3. Accepting Ambiguity
  6. II. Reference
    1. 17. Element Reference
      1. Elements
    2. 18. Compact Syntax Reference
      1. EBNF Production Reference
    3. 19. Datatype Reference
      1. xsd:anyURI — URI (Uniform Resource Identifier)
      2. xsd:base64Binary — Binary content coded as “base64”
      3. xsd:boolean — Boolean (true or false)
      4. xsd:byte — Signed value of 8 bits
      5. xsd:date — Gregorian calendar date
      6. xsd:dateTime — Instant of time (Gregorian calendar)
      7. xsd:decimal — Decimal numbers
      8. xsd:double — IEEE 64-bit floating-point
      9. xsd:duration — Time durations
      10. xsd:ENTITIES — Whitespace-separated list of unparsed entity references
      11. xsd:ENTITY — Reference to an unparsed entity
      12. xsd:float — IEEE 32-bit floating-point
      13. xsd:gDay — Recurring period of time: monthly day
      14. xsd:gMonth — Recurring period of time: yearly month
      15. xsd:gMonthDay — Recurring period of time: yearly day
      16. xsd:gYear — Period of one year
      17. xsd:gYearMonth — Period of one month
      18. xsd:hexBinary — Binary contents coded in hexadecimal
      19. xsd:ID — Definition of unique identifiers
      20. xsd:IDREF — Definition of references to unique identifiers
      21. xsd:IDREFS — Definition of lists of references to unique identifiers
      22. xsd:int — 32-bit signed integers
      23. xsd:integer — Signed integers of arbitrary length
      24. xsd:language — RFC 1766 language codes
      25. xsd:long — 64-bit signed integers
      26. xsd:Name — XML 1.0 name
      27. xsd:NCName — Unqualified names
      28. xsd:negativeInteger — Strictly negative integers of arbitrary length
      29. xsd:NMTOKEN — XML 1.0 name token (NMTOKEN)
      30. xsd:NMTOKENS — List of XML 1.0 name tokens (NMTOKEN)
      31. xsd:nonNegativeInteger — Integers of arbitrary length positive or equal to zero
      32. xsd:nonPositiveInteger — Integers of arbitrary length negative or equal to zero
      33. xsd:normalizedString — Whitespace-replaced strings
      34. xsd:NOTATION — Emulation of the XML 1.0 feature
      35. xsd:positiveInteger — Strictly positive integers of arbitrary length
      36. xsd:QName — Namespaces in XML-qualified names
      37. xsd:short — 16-bit signed integers
      38. xsd:string — Any string
      39. xsd:time — Point in time recurring each day
      40. xsd:token — Whitespace-replaced and collapsed strings
      41. xsd:unsignedByte — Unsigned value of 8 bits
      42. xsd:unsignedInt — Unsigned integer of 32 bits
      43. xsd:unsignedLong — Unsigned integer of 64 bits
      44. xsd:unsignedShort — Unsigned integer of 16 bits
  7. III. Appendixes
    1. A. DSDL
      1. A Multipart Standard
        1. Part 1: Overview
        2. Part 2: Regular Grammar-Based Validation
        3. Part 3: Rule-Based Validation
        4. Part 4: Selection of Validation Candidates
        5. Part 5: Datatypes
        6. Part 6: Path-Based Integrity Constraints
        7. Part 7: Character Repertoire Validation
        8. Part 8: Declarative Document Architectures
        9. Part 9: Namespace- and Datatype-Aware DTDs
        10. Part 10: Validation Management
      2. What DSDL Should Bring You
    2. B. The GNU Free Documentation License
      1. GNU Free Documentation License
      2. 0. Preamble
      3. 1. APPLICABILITY AND DEFINITIONS
      4. 2. VERBATIM COPYING
      5. 3. COPYING IN QUANTITY
      6. 4. MODIFICATIONS
      7. 5. COMBINING DOCUMENTS
      8. 6. COLLECTIONS OF DOCUMENTS
      9. 7. AGGREGATION WITH INDEPENDENT WORKS
      10. 8. TRANSLATION
      11. 9. TERMINATION
      12. 10. FUTURE REVISIONS OF THIS LICENSE
      13. Addendum: How to use this License for your documents
  8. Glossary
  9. Index
  10. About the Author
  11. Colophon
  12. Copyright

Product information

  • Title: RELAX NG
  • Author(s): Eric van der Vlist
  • Release date: December 2003
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9780596004217