PYX

PYX is an early XML stream solution that converts XML into line-oriented text that tools like grep, awk, and sed can process. Its name reflects the fact that it was the first XML solution in the programming language Python. Each XML event is written on its own line, which fits nicely into the line-oriented paradigm of many Unix programs. Table 10-2 summarizes the notation of PYX.

Table 10-2. PYX notation

Symbol    Represents
(         An element start tag
)         An element end tag
-         Character data
A         An attribute
?         A processing instruction

For every event coming through the stream, PYX starts a new line, beginning with one of the five event symbols. The symbol is followed by the element name or whatever other data is pertinent to the event. Special characters, such as a newline inside character data, are escaped with a backslash, as you would see in Unix shell or Perl code.
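The one-line-per-event layout can be sketched with Python's standard xml.sax module. This is only an illustration, not the actual PYX tool: the names pyx_escape, PyxHandler, and xml_to_pyx are hypothetical, and the exact set of escapes (\n, \t, and \\) is an assumption, since the text says only that special characters are escaped with a backslash.

```python
import xml.sax


def pyx_escape(text):
    """Backslash-escape characters that would break the one-event-per-line layout.

    The escape set here is an assumption made for this sketch.
    """
    return (text.replace("\\", "\\\\")
                .replace("\n", "\\n")
                .replace("\t", "\\t"))


class PyxHandler(xml.sax.ContentHandler):
    """Hypothetical handler that records one PYX line per SAX event."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._text = []  # adjacent character chunks, merged before output

    def _flush_text(self):
        # The parser may deliver one run of text in several chunks,
        # so buffer them and emit a single "-" line.
        if self._text:
            self.lines.append("-" + pyx_escape("".join(self._text)))
            self._text = []

    def startElement(self, name, attrs):
        self._flush_text()
        self.lines.append("(" + name)
        for attr_name in attrs.getNames():
            value = pyx_escape(attrs.getValue(attr_name))
            self.lines.append("A" + attr_name + " " + value)

    def endElement(self, name):
        self._flush_text()
        self.lines.append(")" + name)

    def characters(self, content):
        self._text.append(content)

    def processingInstruction(self, target, data):
        self._flush_text()
        self.lines.append("?" + target + " " + pyx_escape(data))

    def endDocument(self):
        self._flush_text()


def xml_to_pyx(xml_text):
    """Return the PYX lines for an XML document given as a string."""
    handler = PyxHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines
```

Calling print("\n".join(xml_to_pyx(some_xml))) writes one event per line. Whitespace between elements is passed through verbatim as character data, and comments produce no line at all, because SAX does not report them to a ContentHandler.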

Here’s what the output of a parser that converts an XML document into PYX notation looks like. The following XML is the input to the parser:

<shoppinglist>
  <!-- brand is not important -->
  <item>toothpaste</item>
  <item>rocket engine</item>
  <item optional="yes">caviar</item>
</shoppinglist>

As PYX, it would look like this:

(shoppinglist
-\n
(item
-toothpaste
)item
-\n
(item
-rocket engine
)item
-\n
(item
Aoptional yes
-caviar
)item
-\n
)shoppinglist
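Because every event sits on its own line, output like this can be consumed with very simple line-based code. The following Python sketch reads the PYX listing above and collects the text of each item element. The function names (pyx_unescape, parse_pyx_line, item_texts) are hypothetical, and the unescaping rules are an assumption based on the backslash convention described earlier.

```python
# The PYX listing from above; a raw string keeps each "\n" as backslash + n.
PYX_TEXT = r"""(shoppinglist
-\n
(item
-toothpaste
)item
-\n
(item
-rocket engine
)item
-\n
(item
Aoptional yes
-caviar
)item
-\n
)shoppinglist"""

# Map each leading symbol to an event name (names chosen for this sketch).
PYX_EVENTS = {"(": "start", ")": "end", "-": "text", "A": "attribute", "?": "pi"}


def pyx_unescape(data):
    """Undo backslash escapes (assumed here to be \\n, \\t, and \\\\)."""
    replacements = {"n": "\n", "t": "\t", "\\": "\\"}
    out = []
    i = 0
    while i < len(data):
        if data[i] == "\\" and i + 1 < len(data):
            nxt = data[i + 1]
            out.append(replacements.get(nxt, nxt))
            i += 2
        else:
            out.append(data[i])
            i += 1
    return "".join(out)


def parse_pyx_line(line):
    """Split one PYX line into (event kind, unescaped data)."""
    return PYX_EVENTS[line[0]], pyx_unescape(line[1:])


def item_texts(pyx_lines):
    """Collect the character data found inside each item element."""
    items = []
    inside_item = False
    for line in pyx_lines:
        kind, data = parse_pyx_line(line)
        if kind == "start" and data == "item":
            inside_item = True
            items.append("")
        elif kind == "end" and data == "item":
            inside_item = False
        elif kind == "text" and inside_item:
            items[-1] += data
    return items


print(item_texts(PYX_TEXT.splitlines()))
# prints ['toothpaste', 'rocket engine', 'caviar']
```

The same filtering could be done with grep or awk on the raw lines; the Python version only makes the state tracking (inside or outside an item) explicit.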

Notice that the comment didn’t come through in the PYX translation. PYX is a little simplistic in some ways, omitting some details in the markup. It will not alert you to CDATA markup sections, although it will let the content pass through. Perhaps ...

