XForms Essentials
By Micah Dubinko
August 2003
Pages: 232



Table of Contents

Chapter 1: Introduction to Web Forms
"In the beginning...the earth was without form, and void."
—Genesis 1:1, 2
How common are forms on the Web? Well, on a recent visit to the news site http://www.cnn.com, I counted six separate forms:
  1. A navigation list
  2. A search tool
  3. A stock quote tool
  4. A language picker
  5. A community poll
  6. A weather forecast tool
As a general rule, the more interactive a web site is, the more heavily the site's designers rely on web forms, a general term for all different kinds of technologies used to gather information from users. It is easy to see why this is the case—without forms, web sites are far less interesting. Form-less web sites were the norm in the early days of the Web and provided a one-way deluge of static information, similar to the Sunday newspaper, which requires lots of navigation to get to any specific part and contains countless pages that get printed but never read.
The addition of forms to Hypertext Markup Language (HTML), the primary language used in web pages, launched an entirely new way of surfing the Web. In this book, I use the term HTML forms to refer to the form element and related markup from either HTML or XHTML. Using HTML forms, searching for information became possible on a worldwide scale. Sites such as Yahoo! quickly became the most popular "portals" of entry on the Web. Later, as developers pushed the limits of forms technology further, web sites became even more interactive and customizable. In return for a small piece of information, such as a postal code, the browsing experience could be reshaped to include the specific information visitors were looking for, leaving out the rest. HTML forms have proven so successful in this regard that newer web technologies, such as PDF forms and Flash, have been unable to make a significant dent in their popularity.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Past, Present, and Future of Web Forms
Scientists and science fiction writers have long predicted many of the things now being made possible by web forms. For example, in a 1945 article in The Atlantic Monthly, Vannevar Bush wrote about a hypertext network he dubbed a "Memex." Even at this conceptual stage, the thought of using forms to access data came naturally, particularly in terms of drilling down through vast stores of information: "One might, for example, speak to a microphone, in the manner described in connection with the speech-controlled typewriter, and thus make his selections." How did such a technology come to be in real life?
Shortly after the initial tempering of HTML, various individuals began considering the usefulness of forms alongside hypertext. HTML Version 2.0, as presented in a document called Request for Comments (RFC) 1866, was the first time that web forms were seriously considered for standardization. That RFC captured HTML as found in common use prior to June 1994. At this point, HTML already included forms, thanks to a 1993 proposal called HTML+.
Care and maintenance of the HTML family of specifications have since been handed over to the World Wide Web Consortium, or W3C. The last non-XML-based version of HTML was version 4.01, which didn't change forms processing much. New development of the standard is taking place on a closely related technology called XHTML, where the X indicates an XML foundation. XHTML 1.0 and 1.1 were largely concerned with details of the transition to XML and ways to combine vocabularies, not with major changes to the language.
XHTML 2.0, in contrast, is making some improvements that aren't compatible with earlier flavors of HTML. The largest such change is the adoption of XForms as a replacement for the older HTML forms technology. As of August 2003, XHTML 2.0 is still under development, though it's clear that XForms will play a major role in the future of XHTML. Before we discuss XForms, however, a review of the older HTML forms technology will be helpful.
Additional content appearing in this section has been removed.
A Brief Review of HTML Forms
The introduction of the forms chapter in HTML 4.01 reads: "An HTML form is a section of a document containing normal content, markup, special elements called controls (checkboxes, radio buttons, menus, etc.), and labels on those controls. Users generally 'complete' a form by modifying its controls (entering text, selecting menu items, etc.), before submitting the form to an agent for processing (e.g., to a web server, to a mail server, etc.)."
The defining element for HTML forms is named, not too surprisingly, form. This element describes some important aspects of the form, including where and how to submit data. The content of this element consists of regular HTML markup, as well as controls.
Forms represent a structured exchange of data. In HTML forms, the structure of the collected data, called a form data set, is a set of name/value pairs. The names and values that are included in this set are solely determined by the controls present within the form, so that adding a new control element, as well as adding to the user interface, also adds a new name/value pair to the data set. Many authors take for granted this basic violation of the separation between the data layer and the user interface layer—a problem that XForms has gone to considerable lengths to alleviate.
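As a rough illustration of this coupling, consider the following XHTML sketch. The action URL and the control names are assumptions made up for the example. Each successful control contributes one name/value pair, so the form data set here is determined entirely by the controls that happen to be present.

```xml
<!-- Hypothetical form: the action URL and control names are assumptions. -->
<form xmlns="http://www.w3.org/1999/xhtml"
      action="http://example.info/form-submit" method="post">
  <p>
    <!-- Contributes the pair name=Dubinko, Micah -->
    <input type="text" name="name" value="Dubinko, Micah"/>
    <!-- Only the checked radio button contributes a pair: vote=b -->
    <input type="radio" name="vote" value="b" checked="checked"/> To Be
    <input type="radio" name="vote" value="n"/> Not To Be
    <!-- A submit button without a name contributes nothing -->
    <input type="submit" value="Send"/>
  </p>
</form>
<!-- Submitted body with the default URL encoding: name=Dubinko%2C+Micah&vote=b -->
```

Deleting the text control from this markup would also delete the name pair from the submitted data, which is the coupling between user interface and data described above.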
Which control types are available in HTML forms? The following sections will answer this question.
The workhorse of HTML forms, the text input control permits the entry of any character data. Text input controls accept a string value and contribute it to the form data set. Example 1-1 shows the XHTML code needed to produce a basic single-line text control, and Figure 1-1 shows the result.
Example 1-1. XHTML code for a single-line text control
<input type="text" name="name" value="Dubinko, Micah"/>
Additional content appearing in this section has been removed.
Limitations of HTML Forms, Advantages of XForms
According to developers, the most commonly cited problem with HTML forms is their dependency on scripting languages. Real-world HTML forms are reliant on script to accomplish many common tasks such as marking controls as required, performing validations and calculations, displaying error messages, and managing dynamic layout. This dependency results in complex documents, which are expensive and time-consuming to maintain, since a full-time programmer is practically necessary when dealing with that much script.
XForms helps reduce the need for script in several ways: by defining a framework for simple, XPath-based calculations and validations, by providing better user feedback on the status of the form, through dynamic features such as repeating tables and optional sections, and through a system of XForms Actions—elements that cause commonly needed actions such as setting focus or changing a data value.
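To make these points more concrete, here is a minimal sketch of an XForms Model that declares a required field, a datatype check, a calculation, and an action, all without script. The instance element names, the control id email-field, and the submission URL are assumptions chosen for illustration.

```xml
<!-- Sketch only: element names, the id "email-field", and the URL are assumptions. -->
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:ev="http://www.w3.org/2001/xml-events">
  <xforms:instance>
    <order xmlns="">
      <quantity>1</quantity>
      <price>0.00</price>
      <total/>
      <email/>
    </order>
  </xforms:instance>
  <!-- Validation: email must be filled in before the data can be submitted -->
  <xforms:bind nodeset="email" required="true()"/>
  <!-- Validation: quantity must be a positive integer -->
  <xforms:bind nodeset="quantity" type="xsd:positiveInteger"/>
  <!-- Calculation: total is recomputed whenever quantity or price changes -->
  <xforms:bind nodeset="total" calculate="../quantity * ../price"/>
  <!-- Action: once the form is ready, move focus to a control with this id -->
  <xforms:setfocus ev:event="xforms-ready" control="email-field"/>
  <xforms:submission id="send" action="http://example.info/xml-submit" method="post"/>
</xforms:model>
```

In an HTML form, each of these behaviors would typically need its own script function plus event wiring on the individual controls.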
A second limitation of HTML forms is the difficulty of initializing form data, as commonly happens when web sites "remember" past users and provide them the courtesy of not having to repeatedly enter information. As shown earlier, each form control has its own unique way of defining initial data, with small bits of initialization data spread all across the document. This means that in order to process a blank form into a filled form, either a new document needs to be constructed piece by piece, or an existing document needs to be patched in numerous places—one of the reasons why template-replacement facilities are commonly found in application servers. Constructing such forms is CPU-intensive and leads to bottlenecks on high-volume servers.
In XForms, form data is provided from an initial XML file, which can be external to the form definition. Since XForms is also flexible enough to deal directly with most XML data formats, piping initial data into a form is typically a simple matter of pointing a src attribute to an existing XML data source.
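A minimal sketch of this arrangement follows. The two URLs and the id values are assumptions; the point is only that the form definition itself stays unchanged while the initial data comes from elsewhere.

```xml
<!-- Sketch only: both URLs and the id values are assumptions. -->
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <!-- Initial data is fetched from an existing XML source when the form loads -->
  <xforms:instance id="profile" src="http://example.info/users/current.xml"/>
  <xforms:submission id="save" action="http://example.info/xml-submit" method="post"/>
</xforms:model>
```

Because nothing in the form markup is patched per user, the same form document can be served to every visitor, with only the XML behind the src URL varying.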
Additional content appearing in this section has been removed.
The History of XForms
After a number of internal and published requirements documents, the first XForms draft specification was published on April 6, 2000. The title of this document, "Datamodeling Proposal for XForms," gave a strong hint about how undeveloped this initial effort was. In fact, the final versions of the XForms specification bear no resemblance at all to this first attempt.
Why was this? At the time the initial XForms Working Draft was under development, another W3C specification called "XML Schema" was gradually progressing through the W3C channels. In what later proved to be a costly diversion, the XForms Working Group initially decided to make the XForms data types differ from the ones in XML Schema, "due to different usage scenarios and target audiences." As an alternative, the specification spelled out a "simple syntax," which consisted of a number of XML tags such as string, money, and group, where the tags needed to be nested in a structure that mirrored the desired shape of the final XML that would be submitted. For example, to submit XML that looked like:
<poll>
  <vote>Vanilla</vote>
</poll>
You would have needed an XForms "data model" such as this:
<xform>
  <group name="poll">
    <string name="vote"/>
  </group>
</xform>
Additional content appearing in this section has been removed.
The Revenge of the Simple Syntax
The XForms "simple syntax" mentioned earlier served a worthy purpose: to make authors of existing HTML forms comfortable enough to consider making the jump to XForms. So, when the "simple syntax" went away, what replaced it? Literally nothing. Instead of trying to simplify form authoring by adding an additional layer of markup, the designers made XForms remain useful when removing a layer of markup. This extra layer is what needs to be written for the XForms Model, which can be safely omitted in forms of roughly the same complexity as an HTML form with no script. Unofficially, this became known as "lazy author" processing, in deference to the time-honored concept in software engineering of "constructive laziness," or the ability to recognize and actively bypass unnecessary work.
Example 1-14 shows a form that accomplishes the same goals as the earlier HTML form: a poll.
Example 1-14. A poll form implemented in XForms, "lazy author" style
<select1 ref="mainsel" appearance="radio" selection="open">
  <label>Poll: to be or not to be?</label>
  <item>
    <label>To Be</label>
    <value>b</value>
  </item>
  <item>
    <label>Not To Be</label>
    <value>n</value>
  </item>
</select1>
Note that no specific choice is needed for an "Other, please specify" selection, since XForms supports the concept of "open selection" lists, where the user is allowed to freely enter additional list values.
Additionally, to make the form submittable, a small bit of markup is required in the head section of the document, as seen in the following code.
<model>
  <submission action="http://example.info/xml-submit"/>
</model>
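For orientation, the following sketch shows the kind of XML such a form might submit. It assumes an XForms 1.0 processor, which, when no instance element is present, is expected to generate instance data with a root element named instanceData and one child element per bound control, named after the ref value.

```xml
<!-- Sketch only: assumes implicitly generated instance data, and that the
     user picked "To Be". -->
<instanceData>
  <mainsel>b</mainsel>
</instanceData>
```

With an open selection, a value the user typed in freely would presumably appear as the content of mainsel in place of b.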
Unlike the proposed simple syntax, only a minimal amount of keyboard typing is needed for the XForms Model; in fact, little more than a URL to accept submitted data. The main part of the form is specified as user interface form controls: here, select1. Note, too, that the "Other, please specify" choice isn't needed, since XForms supports open selection lists natively. If the user manually entered
Additional content appearing in this section has been removed.
Chapter 2: XForms Building Blocks
"What the world really needs is more love and less paperwork."
—Pearl Bailey
"XML lets organizations benefit from structured, predictable documents. Thus, XML breeds forms. QED."
—David Weinberger
The previous chapter ended with a look at the simple syntax of XForms. This chapter goes into greater detail on the concepts underlying the design of XForms, as well as practical issues that come into play, including a complete, annotated real-world example.
A key concept is the relationship between forms and documents, which will be addressed first. After that, this chapter elaborates on the important issue of host languages and how XForms integrates them.
More Than Forms
Despite the name, XForms is being used for many applications beyond simple forms. In particular, creating and editing XML-based documents is a good fit for the technology.
A key advantage of XML-based documents over, say, paper or word processor templates, is that an entirely electronic process eliminates much uncertainty from form processing. Give average "information workers" a paper form, and they'll write illegibly, scribble in the margins, doodle, write in new choices, and just generally do things that aren't expected. All of these behaviors are manually intensive to patch up, in order to clean the data to a point where it can be placed into a database. With XForms, it is possible to restrict the parts of the document that a user is able to modify, which means that submitted data needs only a relatively light double-check before it can be sent to a database. (One pitfall to avoid, however, is a system that is excessively restrictive, so that the person filling the form is unable to accurately provide the needed data. When that happens, users typically give bad information or avoid the electronic system altogether.)
Various efforts are underway to define XML vocabularies for all sorts of documents. Perhaps one of the most ambitious is UBL, the Universal Business Language, currently being standardized through OASIS (the Organization for the Advancement of Structured Information Standards). The goal of UBL is to represent all different sorts of business documents (purchase orders, invoices, order confirmations, and so on) using a family of XML vocabularies. XForms is a great tool with which to create and edit UBL documents.
Additional content appearing in this section has been removed.
A Real-World Example
As an example, this section will develop an XForms solution for creating and editing a UBL purchase order. The first step is to define the initial instance data, which is a skeleton XML document that contains the complete structure of the desired final document, but with only initial data. This document serves as a template for newly created purchase orders, and provides a framework on which to hang the rest of the form.
This complete example form is available online at http://dubinko.info/writing/xforms/ubl/.
Example 2-1 shows what a UBL purchase order document looks like. Figure 2-1 shows, in the X-Smiles browser, an XForms document capable of creating such a document.
Figure 2-1: An XML purchase order being created with XForms
Example 2-1. An XML purchase order using UBL
<Order xmlns="urn:oasis:names:tc:ubl:Order:1.0:0.70" 
xmlns:cat="urn:oasis:names:tc:ubl:CommonAggregateTypes:1.0:0.70">
  <cat:ID/>
  <cat:IssueDate/>
  <cat:LineExtensionTotalAmount currencyID="USD"/>
  <cat:BuyerParty>
    <cat:ID/>
    <cat:PartyName>
      <cat:Name/>
    </cat:PartyName>
    <cat:Address>
      <cat:ID/>
      <cat:Street/>
      <cat:CityName/>
      <cat:PostalZone/>
      <cat:CountrySub-Entity/>
    </cat:Address>
    <cat:BuyerContact>
      <cat:ID/>
      <cat:Name/>
    </cat:BuyerContact>
  </cat:BuyerParty>
  <cat:SellerParty>
    <cat:ID/>
    <cat:PartyName>
      <cat:Name/>
    </cat:PartyName>
    <cat:Address>
      <cat:ID/>
      <cat:Street/>
      <cat:CityName/>
      <cat:CountrySub-Entity/>
    </cat:Address>
  </cat:SellerParty>
  <cat:DeliveryTerms>
    <cat:ID/>
    <cat:SpecialTerms/>
  </cat:DeliveryTerms>
  <cat:OrderLine>
    <cat:BuyersID/>
    <cat:SellersID/>
    <cat:LineExtensionAmount currencyID=""/>
    <cat:Quantity unitCode="">1</cat:Quantity>
    <cat:Item>
      <cat:ID/>
      <cat:Description>Enter description here</cat:Description>
      <cat:SellersItemIdentification>
        <cat:ID>Enter part number here</cat:ID>
      </cat:SellersItemIdentification>
      <cat:BasePrice>
        <cat:PriceAmount currencyID="">0.00</cat:PriceAmount>
      </cat:BasePrice>
    </cat:Item>
  </cat:OrderLine>
</Order>
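As a rough sketch of how a form might hang off this skeleton, the markup below loads the purchase order as instance data and binds a few controls to it. This is not the book's form: the file name po-skeleton.xml, the submission URL, the id value, and the selection of controls are all assumptions.

```xml
<!-- Sketch only: file name, URL, id value, and choice of controls are assumptions. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xforms="http://www.w3.org/2002/xforms"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:cat="urn:oasis:names:tc:ubl:CommonAggregateTypes:1.0:0.70">
  <head>
    <title>Purchase Order</title>
    <xforms:model>
      <!-- The skeleton from Example 2-1, stored in a separate file -->
      <xforms:instance src="po-skeleton.xml"/>
      <!-- Paths are relative to the root Order element -->
      <xforms:bind nodeset="cat:IssueDate" type="xsd:date" required="true()"/>
      <xforms:submission id="send" action="http://example.info/xml-submit" method="post"/>
    </xforms:model>
  </head>
  <body>
    <xforms:input ref="cat:IssueDate">
      <xforms:label>Issue date</xforms:label>
    </xforms:input>
    <xforms:input ref="cat:BuyerParty/cat:PartyName/cat:Name">
      <xforms:label>Buyer name</xforms:label>
    </xforms:input>
    <!-- One set of controls per OrderLine element in the instance data -->
    <xforms:repeat nodeset="cat:OrderLine">
      <xforms:input ref="cat:Quantity">
        <xforms:label>Quantity</xforms:label>
      </xforms:input>
    </xforms:repeat>
    <xforms:submit submission="send">
      <xforms:label>Send order</xforms:label>
    </xforms:submit>
  </body>
</html>
```

Only the nodes that have a control bound to them can be edited by the user; the rest of the skeleton passes through to the submitted document unchanged.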
Additional content appearing in this section has been removed.
Host Language Issues
The philosophy of the XForms specification can be summed up in a single line, found in the Abstract of the official W3C XForms document:
XForms is not a free-standing document type, but is intended to be integrated into other markup languages, such as XHTML or SVG.
This approach has benefits as well as drawbacks. The benefits are that the XForms specification was completed more quickly, and without host language dependencies that otherwise might exist. The primary disadvantage is that more work needs to be done to actually integrate XForms with XHTML, SVG, or any other language.
Another W3C specification, Modularization of XHTML, provides a framework in which XHTML, or any other combination of XML-based languages, can be mixed and matched in order to provide a combined document type. Such combinations can take advantage of specific language features; for example, in XHTML a non-rendered head section can contain the XForms Model, and in SVG, a foreignObject element can enclose individual form controls.
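The SVG case might look something like the following sketch. Placing the XForms Model inside defs, the coordinates, and the instance element names are all assumptions; an actual SVG and XForms integration may prescribe something different.

```xml
<!-- Sketch only: model placement, coordinates, and element names are assumptions. -->
<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xforms="http://www.w3.org/2002/xforms"
     width="300" height="100">
  <defs>
    <!-- Non-rendered part of the host document holds the XForms Model -->
    <xforms:model>
      <xforms:instance>
        <search xmlns="">
          <query/>
        </search>
      </xforms:instance>
    </xforms:model>
  </defs>
  <text x="10" y="20">Search the catalog</text>
  <!-- foreignObject reserves a region in which the form control is rendered -->
  <foreignObject x="10" y="30" width="280" height="40">
    <xforms:input ref="query">
      <xforms:label>Query</xforms:label>
    </xforms:input>
  </foreignObject>
</svg>
```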
Any document that uses XForms will necessarily be a combined document type, involving multiple XML namespaces. Such compound documents are still largely uncharted territory in the realm of W3C specifications, which leads to several headaches. For one thing, XML has the concept of attributes of type ID, specifying a document-unique value. Unfortunately, the id-ness of the attribute needs to be declared in a DTD or some kind of schema, which can only occur at the top of the overall document, not at the point where a subdocument starts. DTDs in general are poorly suited to validation, so until further work is done within the W3C, some XForms documents will have to suffice with being simply well-formed.
Although often scorned by developers, XML namespaces are a fact of life, particularly for W3C specifications. XForms elements conforming to the final W3C Recommendation are defined in a namespace of
Additional content appearing in this section has been removed.
Linking Attributes
Another attribute, src, has caused nearly as much controversy as its big brother in XHTML, href. The problem stems from tension with XLink 1.0, a W3C Recommendation, which asserts itself as the preferred technique to define any "explicit relationship between resources or portions of resources." Originally, this standard was envisioned by some as a solution that could apply to any XML, but the final solution worked only with an attribute named xlink:href (complete with a separate namespace).
The inflexibility of XLink causes problems in modularized documents, including XForms, since there are different kinds of links but only one allowed attribute name. As an example, an element might both serve as a launching point for a hyperlink, and at the same time link to external inline content, as in the following fragment that might result from a combination of XForms and SVG (which uses xlink:href):
<xforms:label src="label2.svg" xlink:href="homepage.html"/>
In this example, the src attribute from XForms points to an SVG file to be used as the label, and the xlink:href attribute from SVG makes the label a clickable hyperlink to homepage.html. It's a good thing that the XForms attribute is named src and not xlink:href, because a conflict would have resulted when trying to combine the languages, since an element can't have two attributes with the same name.
As an alternative to XLink, the HTML Working Group proposed another standard, called HLink, to annotate any XML with link descriptions. The proposal met with almost as little enthusiasm as XLink. The Technical Architecture Group (TAG) of the W3C is looking into the issue; the long term resolution remains to be seen. Controversies aside, in XForms, src consistently means one thing: that the URI in the attribute value is to be fetched as part of loading the document, and the contents rendered in place of whatever element contains the attribute (much like the img element in earlier versions of XHTML).
Additional content appearing in this section has been removed.
Chapter 3: XPath in XForms
"Nobody trips over mountains. It is the small pebble that causes you to stumble. Pass all the pebbles in your path and you will find you have crossed the mountain."
—Traditional proverb
The most obvious difference between XForms and earlier technologies is the representation of form data as XML instead of flat name/value pairs. While a richer data representation was a welcome change, it also called for a more sophisticated language to reference structured data. The W3C had already defined just such a language, called XPath (http://www.w3.org/TR/xpath), a component of XSLT (http://www.w3.org/TR/xslt), an XML vocabulary used for transforming one flavor of XML into another. The XPath specification was built with the intention that later specifications could use it as a foundation, which is exactly what XForms does. This chapter first lays out the foundation of XPath, and then shows how XForms builds on that foundation.
What exactly is XPath? The "path" portion of the name comes from the similar appearance of many XPath expressions to directory paths in a filesystem, as shown in Example 3-1. XPath also includes some lightweight calculation functionality, such as basic mathematics, rounding, and string manipulation, which the calculation engine in XForms takes advantage of instead of defining a new (and incompatible) language.
Example 3-1. Some XPath expressions
/html/head/title
html:head/xforms:model/@xml:id
../items
purchaseOrder/items/item[3]
purchaseOrder/items/item[@price = 12.34]
string-length('hello world')
purchaseOrder/subtotal * instance('taxtable')/tax
total * instance('taxtable')/rate
Each of these examples demonstrates a particular aspect of how XPath is used for addressing parts of an XML document. But must the XML always exist as a distinct document? No. The data structure addressed by XPath is carefully defined—by the XPath Data Model. Detailed knowledge of the data model isn't required to start using XPath, though. A few basic concepts are all that is needed to begin.
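To see a few of these expressions at work, it can help to evaluate them against a small document. The purchase order below is hypothetical: the element names follow Example 3-1, but the structure and values are invented for illustration.

```xml
<!-- Hypothetical data; names follow Example 3-1, values are invented. -->
<purchaseOrder>
  <items>
    <item price="9.99">Pencils</item>
    <item price="12.34">Paper</item>
    <item price="12.34">Ink</item>
  </items>
  <subtotal>34.67</subtotal>
</purchaseOrder>
<!--
  With the root node as the context node:
    purchaseOrder/items/item[3]               selects the "Ink" item
    purchaseOrder/items/item[@price = 12.34]  selects the "Paper" and "Ink" items
  With the subtotal element as the context node:
    ../items                                  selects the items element
  With any context node:
    string-length('hello world')              returns the number 11
-->
```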
Additional content appearing in this section has been removed.
Getting Up to Speed with XPath
After this section, the remainder of this chapter serves as a detailed XPath reference. In many cases, however, only a basic level of XPath is needed in XForms. (Chapter 10 shows one common design pattern for forms that requires virtually no special XPath knowledge.) If you are new to XPath, this section will provide the necessary background that will enable you to read and write simple XPath expressions with confidence.
Simple XPath expressions resemble file system paths, except that instead of navigating across directories and files, XPath expressions navigate across XML nodes—the XPath term for any individual piece of XML such as an element, attribute, or piece of text. For example, the expression:
/html/head/title
represents an absolute path through XML, starting at a special root node, then progressing through child elements html, head, and title. The XML referenced by this path might look something like this:
<html>
  <head>
    <title>Push Button Paradise</title>
...
Since XML names can be qualified with a namespace, it's also possible to use colonized names at any step. Relative paths are also possible, in which case it's important to know what the context node (similar in concept to the current directory) is. Additionally, attributes can be addressed with a leading @ character, leading to XPath expressions like this:
html:head/xforms:model/@id
Note that when the leading slash is omitted, the path expression is relative.
Path expressions can be said to return a node-set. Both of the above examples conveniently returned a node-set consisting of a single node, but in the general case, node-sets can have zero, one, or a multitude of nodes. XForms includes a first node rule that, in certain circumstances, will reduce a larger node-set down to a single node, namely, the first one according to the order the elements appear in the document. Also, node-sets can be filtered manually using a
Additional content appearing in this section has been removed.
Going Deep: The XPath Data Model
In the XPath view of things, elements, attributes, text, comments, processing instructions, and even namespaces are represented internally as nodes connected in a tree shape. Some nodes, such as elements, may have child nodes, while others, such as attributes, have no children, as restricted by XML rules. A special node, called the root node, serves as the ultimate ancestor node.
Example 3-2 shows a short XML document, and Figure 3-1 shows how that document would be represented by a tree of nodes.
In this example, note that neither the XML declaration nor the DOCTYPE declaration produces any nodes. Thus, these XML data structures are effectively invisible to XPath and, by extension, XForms. On the other hand, notice how each element node has two namespace nodes attached: one for the xmlns declaration on the root element, which applies throughout, and one for the built-in declaration of the xml prefix, as seen in the attribute xml:lang. Even a short document like this generates a huge number of nodes!
Example 3-2. A basic XML document, represented as text
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<?xml-stylesheet href="screen.css" type="text/css" media="screen"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://vlib.example.org/">vlib.example.org</a>.</p>
  </body>
</html>
Additional content appearing in this section has been removed.
Location Paths
A key requirement for dissecting nearly any XPath expression is an understanding of Location Paths, which select one or more nodes based on their location or other properties. A Location Path consists of a number of individual Location Steps, each separated by a slash (/). Each individual step builds upon the previous steps to traverse the document, and can be a test against the name of a node, or one of the following special tests:
node( )
Matches any node whatsoever.
text( )
Matches any text node.
comment( )
Matches any comment node.
processing-instruction( )
Matches any processing instruction node, and may have a parameter to match against a specific processing instruction.
Another special test is *, which will match any element node (or attribute node within the attribute axis, or namespace node within the namespace axis). Similarly, another special test prefix:* will match any node identified with the namespace mapped to prefix.
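The node tests can be compared side by side by counting what each one matches. The sketch below does this with XForms output controls; the memo instance data is an assumption, and the counts in the comments are what a conforming XPath 1.0 evaluation should produce with the memo element as the context node.

```xml
<!-- Sketch only: the memo instance data and the labels are assumptions. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xforms="http://www.w3.org/2002/xforms">
  <head>
    <title>Node tests</title>
    <xforms:model>
      <xforms:instance>
        <memo xmlns="">
          <!-- draft -->
          <p>Moved to <a href="http://vlib.example.org/">vlib.example.org</a>.</p>
        </memo>
      </xforms:instance>
    </xforms:model>
  </head>
  <body>
    <!-- node() matches both text nodes and the a element inside p: 3 -->
    <p><xforms:output value="count(p/node())"><xforms:label>Nodes: </xforms:label></xforms:output></p>
    <!-- text() matches only "Moved to " and ".": 2 -->
    <p><xforms:output value="count(p/text())"><xforms:label>Text nodes: </xforms:label></xforms:output></p>
    <!-- * matches only element children, here the a element: 1 -->
    <p><xforms:output value="count(p/*)"><xforms:label>Elements: </xforms:label></xforms:output></p>
    <!-- comment() matches the "draft" comment child of memo: 1 -->
    <p><xforms:output value="count(comment())"><xforms:label>Comments: </xforms:label></xforms:output></p>
    <!-- On the attribute axis, * matches attribute nodes, here href: 1 -->
    <p><xforms:output value="count(p/a/@*)"><xforms:label>Attributes: </xforms:label></xforms:output></p>
  </body>
</html>
```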
Figure 3-2 illustrates how a path is traversed in steps, from left to right.
Figure 3-2: Location paths and steps
Extra care is needed when traversing a document that contains XML namespaces, especially with defaulted namespaces. Any namespace prefixes in scope can be used in Location Steps; however,
Additional content appearing in this section has been removed.
Computed Expressions
The preceding section discussed how addressing works in XPath. The language is also capable of performing simple computations that return strings, booleans, or numbers. XForms defines the term Computed Expression to represent this special usage of XPath.
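A minimal sketch of the three result types follows, using bind elements in an XForms Model. The instance element names and the 0.07 rate are assumptions for illustration only.

```xml
<!-- Sketch only: the element names and the 0.07 rate are assumptions. -->
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:instance>
    <order xmlns="">
      <first>Pat</first>
      <last>Doe</last>
      <greeting/>
      <subtotal>100</subtotal>
      <tax/>
      <coupon/>
      <discount/>
    </order>
  </xforms:instance>
  <!-- Returns a string: the two names joined into a greeting -->
  <xforms:bind nodeset="greeting" calculate="concat('Hello, ', ../first, ' ', ../last)"/>
  <!-- Returns a number: tax rounded to the nearest whole unit -->
  <xforms:bind nodeset="tax" calculate="round(../subtotal * 0.07)"/>
  <!-- Returns a boolean: discount is relevant only once a coupon is entered -->
  <xforms:bind nodeset="discount" relevant="string-length(../coupon) &gt; 0"/>
</xforms:model>
```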
Additional content appearing in this section has been removed.
How XPath is Used in XForms
Given a good understanding of what XPath is and how it works, it's pretty simple to see how it fits into the XForms architecture.
Every use of XPath in XForms involves a context node, usually with the effect of shortening the number of steps needed in the path expression. For example, if the instance data is a simple XHTML document:
<html:html xmlns:html="http://www.w3.org/1999/xhtml">
  <html:head>
    <html:title>Mutant Registration Guidelines</html:title>
  </html:head>
  <html:body>
    <html:p>The White House announced today...</html:p>
  </html:body>
</html:html>
the default context node is the element node named html:html. Thus, to bind a form control to the document title, instead of the longer absolute path /html:html/html:head/html:title, a shorter relative path html:head/html:title could be used. The key difference between the two is that absolute paths contain a leading slash and the name of the root element. Since there can be only one root element, including its name in every path expression isn't necessary; path expressions don't become ambiguous by leaving out what is really a redundant step along the path.
The default context node can be changed. Any element containing a binding expression resets the context node for any child elements. Binding expressions include the ref attribute, possibly with the model attribute, or alternatively the bind attribute. Either way, the expression selects a node-set from the instance data, and the first node, in document order, is used as the context node for child elements. Chapter 10 shows a way to take advantage of this behavior to greatly simplify the use of XPath in XForms.
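As a sketch of this resetting behavior, the fragment below binds a group to html:head, so that the control inside it needs only a one-step path. The labels are assumptions, and the html prefix is assumed to be declared on an ancestor element with the same namespace URI as the instance data shown earlier.

```xml
<!-- Sketch only: labels are assumptions; the html prefix is assumed to be
     declared on an ancestor, matching the instance data's namespace. -->
<xforms:group ref="html:head" xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:label>Document metadata</xforms:label>
  <!-- Evaluated with html:head as the context node, so this reaches the
       same node as /html:html/html:head/html:title -->
  <xforms:input ref="html:title">
    <xforms:label>Title</xforms:label>
  </xforms:input>
</xforms:group>
```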
Within the markup for the XForms Model, there are a few things to be aware of regarding context nodes used for XPath expressions on the <bind> element. The nodeset attribute on this element selects an XPath node-set, applying certain properties such as calculate to each node. This means that the expression used will get evaluated multiple times, once for each node in the node-set. Upon each evaluation, the node being processed is the context node. This is most useful when a calculation appears in a repeating section. In the following example, each individual line item has a calculation that runs within the current line item only:
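A calculation of this kind might be sketched as follows. The order instance, its element names, and the numbers in it are hypothetical; the point is only that the calculate expression is evaluated once per total node, each time with that node as the context node.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:instance>
    <!-- Hypothetical repeating line items -->
    <order xmlns="">
      <item><price>10.00</price><quantity>2</quantity><total/></item>
      <item><price>4.50</price><quantity>3</quantity><total/></item>
    </order>
  </xforms:instance>
  <!-- nodeset selects every total node; for each one, ../price and
       ../quantity refer to the siblings inside the same item -->
  <xforms:bind nodeset="item/total" calculate="../price * ../quantity"/>
</xforms:model>
```

Because the paths in the calculation step up to the parent item and back down, each line item sees only its own price and quantity.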
Chapter 4: XML Schema in XForms
"Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information on it."
—Samuel Johnson
Forms and datatypes always seem to be mentioned together. It's natural to think of data entry in terms of specific types, such as date or phone number. Despite a feint in the opposite direction taken by earlier drafts, XForms incorporates the datatypes defined in W3C XML Schema. This chapter discusses these datatypes, and describes the general framework for describing and defining custom datatypes.
Wide Open (Value) Spaces
In describing a datatype, XML Schema distinguishes between a lexical space, or the data as it appears in XML, and a value space, or the data as it exists on an abstract level. In practice, many datatypes have a one-to-one mapping between the lexical space and the value space, so the distinction can seem a little academic. It is important, however, when there are equivalent representations for some value. For instance, the boolean datatype can represent true as either 1 or true (and false as either 0 or false). Even though there are multiple possible representations, each one maps to the underlying concept of either trueness or falseness. This is important when comparing values; the value space is used as the basis for comparison.
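The difference shows up as soon as a form tests a boolean node. XPath 1.0 compares strings, so a test such as ../newsletter = 'true' would miss the equally valid spelling 1. The sketch below, built on a hypothetical signup instance, uses the XForms function boolean-from-string() so that both spellings are treated alike.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:instance>
    <!-- Hypothetical instance: newsletter holds a boolean written as 1 -->
    <signup xmlns="">
      <newsletter>1</newsletter>
      <email/>
    </signup>
  </xforms:instance>
  <!-- boolean-from-string() accepts "1" and "true" alike, so the test
       works on the value rather than on one particular spelling -->
  <xforms:bind nodeset="email"
               required="boolean-from-string(../newsletter)"/>
</xforms:model>
```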
Many observers have pointed out that the lexical representations of some XML Schema datatypes aren't very user friendly. As an example, the duration of a day and an hour is P1DT1H. From the perspective of the person filling out a form, this is complete gibberish. To work around this, XForms gives responsibility to individual form controls to present data to the user in a manner that's convenient to the intended audience. Thus, XForms introduces (but doesn't specifically name) a third space, the user space. For the benefit of users, this might not be a straightforward mapping—the form control can have great latitude in rearranging things, such as a graphical calendar control to enter durations and dates.
XML Schema uses a divide-and-conquer technique to define datatypes. Each datatype can be broken down into a number of facets, each of which constrains some particular part of the allowed value space for that datatype. (One important exception is the pattern facet, which works on the lexical space.)
It's possible to take an existing datatype and trim it down to exactly meet your needs. This is called derivation by restriction, and entails changing one or more facets in the datatype. For example, the following XML Schema fragment limits the length of a string to 50 characters:
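A restriction of this kind might look like the following sketch, in which the type name shortString is a hypothetical choice. Only the maxLength facet is changed; every other facet is inherited from xs:string.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Derivation by restriction: xs:string narrowed to 50 characters -->
  <xs:simpleType name="shortString">
    <xs:restriction base="xs:string">
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```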
Useful Datatypes
The following useful datatypes are either part of the XForms specification, or included from XML Schema. For each datatype, this section describes where it is defined, what conformance level of XForms it applies to, how the datatype is useful, caveats, and one or more examples.
xs:string
Defined in: XML Schema part 2
As the least-restricted datatype, xs:string is the default datatype that XForms will use, unless the author specifies otherwise.
Caveats
xs:string punts on all whitespace processing, so all tab characters and newline characters pass through unchanged. If this is undesired, it is better to use a more restricted datatype such as xs:normalizedString or xforms:listItem.
Example
  • Hello, World
xs:normalizedString
Defined in: XML Schema part 2
The only difference between this datatype and xs:string is that all whitespace characters are converted into space (0x20) characters.
Caveats
Whitespace is normalized, but not collapsed. Thus, it is still possible for multiple consecutive whitespace characters to exist.
Example
  • Hello, World
xs:language
Defined in: XML Schema part 2
If a form collects the name of a human language, this is the datatype to use.
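As a sketch of how these datatypes might be assigned, the model below uses the type model item property on hypothetical feedback data. The instance and its element names are assumptions made for this example.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms"
              xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xforms:instance>
    <!-- Hypothetical instance data -->
    <feedback xmlns="">
      <comment/>
      <subject/>
      <lang>en</lang>
    </feedback>
  </xforms:instance>
  <!-- comment needs no bind: xs:string is the default datatype -->
  <xforms:bind nodeset="subject" type="xs:normalizedString"/>
  <xforms:bind nodeset="lang" type="xs:language"/>
</xforms:model>
```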
Other Datatypes
The datatypes in the following list are less useful in forms, except perhaps in unusual circumstances. Nevertheless, they are a part of XForms, and are included here for completeness. Besides, someone might discover new ways to use these datatypes.
xs:float
Defined in: XML Schema part 2
The datatype xs:double can do anything xs:float can do and more. If you need to capture floating point values, use xs:double.
xs:duration
Defined in: XML Schema part 2
As specified, many duration comparisons are indeterminate. For example, is a month equal to 30 days? The answer varies from month to month. Because of this, XForms recommends against using xs:duration, except as an abstract base type for xforms:dayTimeDuration and xforms:yearMonthDuration. These derived types should always be used instead of xs:duration.
"gHorribleKluge"
xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gDay, xs:gMonth
Defined in: XML Schema part 2
These datatypes, thanks to their generally awkward natures, have collectively been christened "gHorribleKluge" by folks on the xml-dev mailing list. Very few XML documents are defined using these datatypes, which use truncated forms of the ISO 8601 representation embodied in xs:date.
xs:hexBinary
An Email Datatype for XForms
One of the great disappointments in the XForms specification is the lack of a defined datatype for email—the one datatype common to nearly every Web form. Even if the specification doesn't define an email datatype, form designers still can. Getting all the details right is a little tricky, though. Since regular expressions aren't a programming language, there's no way to define a common recurring segment, and the regular expression tends to get a little repetitive. Taken one step at a time, however, it makes perfect sense. The datatype definition conforming to RFC 2822 is:
<xs:simpleType name="email">
  <xs:restriction base="xs:string">
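    <!-- The pattern must stay on a single line: any whitespace inside
         the value would be treated as a literal character to match. -->
    <xs:pattern value="[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*@[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*"/>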
  </xs:restriction>
</xs:simpleType>
The main achievement in this lengthy statement is the definition of what the email address specification calls atext, which is defined as alphabetic characters, digits, or one of the following characters:
"!" "#" "$" "%" "&" "'" "*" "+" "-" "/" "=" "?" "^" "_" "`" "{" "|" "}" "~"
In regular expression syntax, the definition for a single character of atext looks like this:
[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]
Note that the character ranges in this expression prevent it from being even bulkier, and that a number of the characters used need to be escaped. If you compare this with the entire regular expression given earlier, you will see that this definition repeats four times overall. If regular expressions had a way to define a commonly-recurring string, the regular expression might look like this (with spaces added for readability):
atext+ (\. atext+)* @ atext+ (\. atext+)*
But alas, the actual regular expression needs to repeat the full definition of atext four times, yielding the full definition of the email datatype.
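One way a form might put this datatype to work is sketched below. It assumes the definition is placed in a schema written inline in the XForms Model, under a hypothetical target namespace http://example.org/types bound to the prefix my; the contact instance is likewise invented for the example.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms"
              xmlns:xs="http://www.w3.org/2001/XMLSchema"
              xmlns:my="http://example.org/types">
  <!-- Inline schema; the target namespace is a hypothetical choice -->
  <xs:schema targetNamespace="http://example.org/types">
    <xs:simpleType name="email">
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*@[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:schema>
  <xforms:instance>
    <contact xmlns=""><email/></contact>
  </xforms:instance>
  <!-- Attach the custom datatype to the email node -->
  <xforms:bind nodeset="email" type="my:email"/>
</xforms:model>
```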
This datatype definition is available online, so any XForms definition that includes the XML Schema at
Complex Types
In XML Schema, a complex type is a datatype definition that can include element structure and attributes, which makes possible a number of additional (and, yes, complex) features including substitution groups, redefinition, and complex derivation. XForms includes the whole of XML Schema, though an easier-to-process profile that will leave out the complicated parts is still under development.
In many cases, the XML that will be processed by XForms already has a pre-existing XML Schema. In such cases it makes good sense to re-use the Schema by referencing it from the XForms Model. By doing so, additional datatype information will be made available to XForms.
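Referencing an existing Schema can be as small as one attribute. In this sketch, the file names purchase-order.xsd and purchase-order.xml are hypothetical placeholders for a Schema and the instance data it describes.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms"
              schema="purchase-order.xsd">
  <!-- The instance data is loaded from a separate (hypothetical) file;
       the datatypes declared in the Schema then apply to its nodes -->
  <xforms:instance src="purchase-order.xml"/>
</xforms:model>
```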
If a Schema doesn't already exist, however, it's generally better to define only the minimum needed datatypes and leave it at that. The main reason for this recommendation is that many devices (those that adhere to a simpler XForms profile) will ignore all these more complicated features, so any form that relies on them will produce different answers depending on whether an XForms Full or XForms Basic device is accessing it.
For example, a Schema might define a coarse complexType that gets redefined into one with stricter validations. These stricter validations won't be seen by XForms Basic, and thus the unsuspecting person filling out the form might enter wrong values and have no idea they're even making a mistake.
xsi:type
One contested feature of XML Schema is an attribute named xsi:type, which can be placed directly on XML instance data elements, even if a pre-existing Schema doesn't permit the attribute. For existing XML that uses this with simpleTypes, such as those described earlier in this chapter, this is a reasonable course. If the xsi:type identifies a complexType, however, all the problems in the previous section apply. For new development, the less intrusive XForms type model item property should be used, as described in Chapter 5.
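The two approaches can be contrasted in a single sketch. The order instance and its element names are hypothetical; price carries its type in the data itself, while quantity is typed from the outside through a bind element.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms"
              xmlns:xs="http://www.w3.org/2001/XMLSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <xforms:instance>
    <order xmlns="">
      <!-- xsi:type: the datatype is asserted inside the instance data -->
      <price xsi:type="xs:decimal">19.95</price>
      <!-- No extra attribute here: the data stays untouched -->
      <quantity>3</quantity>
    </order>
  </xforms:instance>
  <!-- type model item property: the datatype is attached from outside -->
  <xforms:bind nodeset="quantity" type="xs:integer"/>
</xforms:model>
```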
Chapter 5: The XForms Model
ARTHUR: Camelot!
GALAHAD: Camelot!
LANCELOT: Camelot!
PATSY: It's only a model.
ARTHUR: Shhh!
—Monty Python and the Holy Grail
The term data model is probably one of the most terrifying and confusing terms to ever get written in a Web specification. That's why the XForms specification goes to great lengths to avoid that term. Instead, XForms Model is the name given to the form description. That name was chosen mainly because it wasn't "data model," but also to evoke thoughts of the Model-View-Controller (MVC) design pattern in programming. In MVC, a model contains all the essential data, and one or more views provide a viewpoint to examine or interact with the data. The XForms Model is analogous to an MVC model, and form controls, covered in Chapter 6, serve the function of views. (There's nothing that directly maps to a controller in XForms, though portions of the processing model and XForms Events play a similar role.)
XForms is based on a foundation data model, but you won't find it defined anywhere in the XForms specification. Instead, the XForms data model subsumes the XPath data model, which maps nodes to various structures in XML: elements, attributes, text, comments, processing instructions, namespaces, and a special node representing the document root. Chapter 3 describes this data model in great detail. This data model, resulting from parsed XML, is the source of nodes used in XForms.
A later section of this chapter describes the instance element, which can either point to or directly contain XML. Either way, this XML is parsed to create nodes in the instance data. (Another possibility is during "lazy author" processing, where the instance data nodes are built from scratch, without need of any author-provided XML.) The distinction between instance and instance data is subtle; a good comparison might be between the hard markup in a web page, as seen with the View Source command, and the in-memory representation accessible from the DOM. In nearly every case, XForms works from the internal instance data, ignoring the document markup. As a consequence, selecting View Source in the browser will always show the document as it was when it initially loaded, and any changes made because of XForms activity won't be visible.
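Both ways of supplying XML to the instance element are shown in the sketch below. The id values, the element names, and the file name order.xml are hypothetical.

```xml
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <!-- Inline: the XML is written directly inside the instance element -->
  <xforms:instance id="inline-data">
    <order xmlns=""><customer/><total>0</total></order>
  </xforms:instance>
  <!-- Linked: the XML is fetched from a (hypothetical) URL, then parsed -->
  <xforms:instance id="linked-data" src="order.xml"/>
</xforms:model>
```

In both cases the markup is only the starting point; once it is parsed, the form works against the resulting instance data.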
Will the Real Data Model Step Forward?