http://www.oreilly.com/catalog/xmlhks/. These
hacks assume that you have extracted this archive into a working
directory where you can exercise the examples.
<?xml version="1.0" encoding="UTF-8"?>

<!-- a time instant -->
<time timezone="PST">
<hour>11</hour>
<minute>59</minute>
<second>59</second>
<meridiem>p.m.</meridiem>
<atomic signal="true" symbol="◑"/>
</time>
version="1.0"). Currently,
XML Version 1.0 is the most widely used,
but Version 1.1 is now available (http://www.w3.org/TR/xml11/), so
1.1 is also a possible value for version.
http://www.microsoft.com/windows/ie/)
http://www.mozilla.org)
http://channels.netscape.com/ns/browsers/download.jsp)
http://www.opera.com)
http://www.apple.com/safari/)
http://msdn.microsoft.com/vstudio/), you can
use the Resource Editor to edit and save this stylesheet back in the
DLL (http://netcrucible.com/xslt/msxml-faq.htm#Q19).
http://www.w3.org/Style/CSS/). CSS Level 1 or
CSS/1 (http://www.w3.org/TR/CSS1)
came out of the W3C in 1996 and was later revised in 1999. CSS Level
2 or CSS/2 (http://www.w3.org/TR/CSS2/) became a W3C
recommendation in 1998. CSS/3 is under construction (http://www.w3.org/Style/CSS/current-work).
Understandably, CSS/1 enjoys the widest support.
http://www.w3.org/TR/xml-stylesheet). The XML
stylesheet processing instruction is optional unless you are using a
stylesheet that you want to associate with an XML document in a
standard way.
http://www.w3.org/TR/REC-xml#sec-pi).
Generally, PIs can appear anywhere that an element can appear,
although the XML stylesheet PI must appear at the beginning of an XML
document (though after the XML declaration, if one is present). The
beginning part of an XML document, before the document element
begins, is called a prolog.
<?xml-stylesheet href="time.css" type="text/css"?>
<? and ?>. The term immediately following <? is called the target.
The target identifies the purpose or name of the PI. Other than the
XML stylesheet PI, you can find PIs used in DocBook files [Hack #62]
and in XML-format files used by Microsoft Office 2003 applications,
such as Word [Hack #14]
and Excel [Hack #15].
& and ; (for example, &#169;
is a decimal character reference and &copy; is
an entity reference). This hack shows you how to use both.
http://www.w3.org/TR/REC-xml/), XML
processors must accept over 1,000,000 Unicode characters
(http://www.w3.org/TR/REC-xml/#charsets).
It's possible that you won't be
able to find all those characters on your keyboard!
Don't worry. You can use character references
instead.
http://www.unicode.org/charts/.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="Namen.css" type="text/css"?>

<Namen xml:lang="de">
 <Name>
  <Vorname>Marie</Vorname>
  <Nachname>M&#252;ller</Nachname>
  <Geschlecht>&#9792;</Geschlecht>
 </Name>
 <Name>
  <Vorname>Klaus</Vorname>
  <Nachname>M&#xFC;ller</Nachname>
  <Geschlecht>&#x2642;</Geschlecht>
 </Name>
</Namen>
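As a rough illustration (not taken from the book's file archive), the following Python sketch derives the decimal and hexadecimal character references for a character from its Unicode code point, then confirms that an XML parser resolves a reference back to the same character. The helper names `decimal_ref` and `hex_ref` are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET


def decimal_ref(ch):
    """Return the decimal character reference for a single character."""
    return "&#%d;" % ord(ch)


def hex_ref(ch):
    """Return the hexadecimal character reference for a single character."""
    return "&#x%X;" % ord(ch)


# u with umlaut, female sign, male sign
for ch in "\u00fc\u2640\u2642":
    print(repr(ch), decimal_ref(ch), hex_ref(ch))

# A parser replaces the reference with the character it stands for.
doc = "<Nachname>M%sller</Nachname>" % decimal_ref("\u00fc")
resolved = ET.fromstring(doc).text
print(resolved)
```

Both forms name the same code point, so which one you use is a matter of convenience; the Unicode charts list code points in hexadecimal, which makes the hexadecimal form easier to look up.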
&#252; and &#9792;,
respectively. The first one refers to the letter u with an umlaut
(ü) and the second one is a female sign. Lines 12 and 13
use the hexadecimal character references
&#xFC; (ü) and
&#x2642; (male sign), respectively. You can
see how these character references are rendered in Opera in Figure 1-6.
http://tucows.com/htmltext95_default.html for
examples of other text editors).
http://www.vim.org) is a derivative of the
Unix screen editor, Vi. It is currently at Version 6.3 and is
developed under the leadership of Bram Moolenaar. You can get flavors
of Vim that run on Unix (such as Red Hat, Sun Solaris, or Debian),
Windows, MS-DOS, the Mac, OS/2, and even Amiga (downloads available
at http://www.vim.org/download.php). If you are
running recent versions of Red Hat (http://www.redhat.com) or Cygwin for Windows
(http://www.cygwin.com), you
likely already have Vim installed on your system.
http://www.cs.pdx.edu/~kirkenda/joy84.html).
Vi was the first screen editor I ever used—back in
1983—and I still use Vim almost every day. Vim is powerful, and
without elaborating on all the reasons why I like to use Vim, I will
mention just one: syntax highlighting.
http://www.xmlsoftware.com/editors.html for a
comprehensive though not exhaustive list), but I'll
mention only a few safe bets here.
http://www.xmlspy.com) is a feature-rich,
graphical editor for XML for the Windows environment. xmlspy has also
been tested on Red Hat Linux running Wine, and Mac OS X running
Microsoft Virtual PC for Mac. The Home Edition of this popular editor
is available for free, but you must pay for licenses for the
Professional and Enterprise editions. I'll give you
a quick feature fly-over of xmlspy, though there are a number of
features I won't get around to mentioning.
http://www.w3.org/TR/wsdl) and SOAP [Hack #63]. You can also use xmlspy to
generate Java, C++, or C# code [Hack #99] from DTDs or XML Schema
documents.
http://www.w3.org/TR/REC-xml/). This syntax
mandates such things as matching case in tag names, matching quotes
around attribute values, restrictions on what Unicode characters may
be used, and so on.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE time SYSTEM "time.dtd">
<!-- a time instant -->
<time timezone="PST">
<hour>11</hour>
<minute>59</minute>
<second>59</second>
<meridiem>p.m.</meridiem>
<atomic signal="true"/>
</time>
http://www.xml.com/pub/a/tools/ruwf/check.html), which
is implemented in Perl using XML::Parser (http://www.perl.com/pub/a/1998/11/xml.html).
RUWF accepts a URL for an XML document or allows you to paste an XML
document into a text box.
http://www.wyeast.net/time.xml,
by entering the URL into the "Your
URL" text box.)
http://www.cogsci.ed.ac.uk/~richard/xml-check.html)
or from the command line [Hack #8].
http://www.wyeast.net/time.xml. Figure 1-16 shows you how to check this document for
well-formedness using the online version of RXP. Enter the URL in the
text box, and then click the button labeled "check
it."
http://www.w3.org/TR/xml-c14n) in Figure 1-17. Canonical XML defines a method for outputting
XML in a consistent, reliable way, leaving some things behind in
output, such as the XML declaration and, optionally, comments.
http://www.cogsci.ed.ac.uk/~richard/rxp.html.
For Windows and other platforms, you can download the C source and
compile it yourself (ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.tar.gz)
or, if you are on Windows, you can simply download the executable
rxp.exe (ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.exe).
rxp time.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
<hour>11</hour>
<minute>59</minute>
<second>59</second>
<meridiem>p.m.</meridiem>
<atomic signal="true"/>
</time>
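If RXP is not at hand, a well-formedness check can be approximated with any conforming parser. The following is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`; it is not RXP and does not reproduce RXP's output, and the function name `is_well_formed` is hypothetical. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


good = """<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
<hour>11</hour>
<minute>59</minute>
<second>59</second>
<meridiem>p.m.</meridiem>
<atomic signal="true"/>
</time>"""

# Tag names are case-sensitive, so a mismatched end tag is an error.
bad = good.replace("</hour>", "</Hour>")

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second call reports a parse error because the start tag and end tag no longer match in case, one of the syntax rules that well-formedness requires.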
-V option, provided it has an accompanying DTD (as
valid.xml does):
rxp -V valid.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE time SYSTEM "time.dtd">
<!-- a time instant -->
<time timezone="PST">
<hour>11</hour>
<minute>59</minute>
<second>59</second>
<meridiem>p.m.</meridiem>
<atomic signal="true"/>
</time>
http://java.sun.com) has been a popular
object-oriented language since it was unveiled by Sun in the
mid-1990s. One key idea behind Java was that it made it possible to
write and compile a program once, and then run it on any machine that
supports a Java interpreter ("write once, run
anywhere"). Note that it's not a
perfect programming language—I've heard Ted
Ts'o (http://thunk.org/tytso/) say of Java,
"Write once, run screaming."
http://java.sun.com/learning/new2java/ will
also help you get up to speed quickly.
http://java.sun.com and find the link for the
Java VM download. (There are alternatives to Sun's
VM, such as one offered on http://www.kaffe.org/, but
I'm only going to talk about the Sun VM here.) In a
few clicks, the new VM will be downloaded to your machine. You should
then be able to go to a command prompt and type:
java -version
java version "1.4.2_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)
http://java.sun.com/j2se/1.4.2/install-windows.html
http://www.oreilly.com/catalog/xmlhks/.
http://www.oxygenxml.com/). I have chosen
<oXygen/> because it runs on multiple platforms, is inexpensive
(it has a free trial and its license is less than $100 USD), and
offers many useful features.
hour
element is highlighted in both the outline and document panes).
Beneath the document pane is a tabbed pane that shows the result of a
transformation of valid.xml with XSLT.
http://www.thaiopensource.com/download/.)
nXML was developed by James Clark, who brought us groff,
expat, sgmls, SP, and Jade, and who was a driving force behind
the development of XPath, XSLT (and before that, DSSSL), and, along
with Murata Makoto, RELAX NG (http://www.relaxng.org/).
http://www.topologi.com/products/tme/): it
provides real-time, automatic visual identification of validity
errors.
M-x load-file
M-x nxml-mode
C-h m
" $HOME/.vimrc
" Don't pretend to be vi
set nocompatible
" Turn on syntax highlighting
syntax on
" Indicate that we want to detect filetypes and want to run filetype
" plugins.
filetype plugin on
filetype plug-in. Vim will source this file
when it detects that you are editing an XML file (i.e., when the file
ends with the .xml suffix or if it has a proper
XML declaration). Example 2-2 is a good starter
ftplugin. Save it to your home directory as
.vim/after/ftplugin/xml.vim. (The file
xml.vim is in the book's file
archive.) The after segment of the path means that
it will be sourced after all the normal scripts, plug-ins, and so on
are sourced, which allows you to override defaults and other plug-ins
without changing the original scripts. That makes upgrading those
scripts easier.
" $VIMRUNTIME/after/ftplugin/xml.vim
" Turn on auto-indentation
set autoindent
" Let's use a 2-character indent
set shiftwidth=2
" With smarttab set, we can press tab at the beginning
" of a line and get shiftwidth indent even though
" tabstop is something else (e.g. the default 8)
set smarttab
" A lot of XML looks really bad and gets really confusing if
" screen-wrapped. I prefer to turn off wrapping.
set nowrap
hour element is highlighted; it is associated
through a mapping with cell B1, which is also highlighted. If you
were to select cell C1, the minute element in the
XML Source pane would be highlighted.
http://www.openoffice.org/), the
free, open source, multiplatform office application suite that
provides an alternative to Microsoft Office, uses a documented XML
format as its native file format. Put this together with OpenOffice
1.1's ability to read Word, Excel, and PowerPoint
files from Office 97, 2000, and XP, plus Word 6.0 files, Word 95
files, and Excel 4.0, 5.0, and 95 files, and you've
got a simple way to convert these files to XML.
timezone;hour;minute;second;meridiem;atomic
PST;11;59;59;p.m.; 
;) delimits each of the fields. The second line
ends with a field containing a single space, which of course you can't
see.
timezone field
name in the first row so that it becomes an equals sign. This
specifies that the timezone field will be
interpreted as an attribute in the output. Then click OK.
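The same kind of conversion can be sketched outside xmlspy. The following Python example is an illustration under stated assumptions, not xmlspy's actual algorithm: it reads semicolon-delimited text with a header row, emits one element per record, and turns the fields named in `attribute_fields` into attributes. The function name `delimited_to_xml` and its parameters are hypothetical, and the `import` and `row` element names are assumed here only to mirror the names the text mentions.

```python
import csv
import io
import xml.etree.ElementTree as ET


def delimited_to_xml(text, delimiter=";", attribute_fields=("timezone",)):
    """Convert delimited text with a header row into an XML element tree.

    Assumption: the root element is named 'import' and each record 'row'.
    Fields listed in attribute_fields become attributes of 'row'; all
    other fields become child elements named after the header.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader)
    root = ET.Element("import")
    for record in reader:
        if not record:  # skip blank lines
            continue
        row = ET.SubElement(root, "row")
        for name, value in zip(header, record):
            if name in attribute_fields:
                row.set(name, value)
            else:
                ET.SubElement(row, name).text = value
    return root


time_txt = "timezone;hour;minute;second;meridiem;atomic\nPST;11;59;59;p.m.; \n"
root = delimited_to_xml(time_txt)
print(ET.tostring(root, encoding="unicode"))
```

Note that the single space in the last field is preserved as the text of the `atomic` element, so a later transformation step can decide what to do with it.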
import and row elements were
inserted by xmlspy; the remaining elements were derived from
time.txt. You could change the new document by
hand to match time.xml (from Chapter 1), or you could apply an XSLT stylesheet to
it. XSLT hacks begin in earnest in Chapter 3,
but I'll use an XSLT stylesheet here (without going
into detail about the stylesheet itself) to show you how to shape
this document up.
http://www.dpawson.co.uk/java/uphill/), a
Java program for converting plain text into XML.
http://www.python.org/) because Python has
dictionaries that can be preloaded. I had a list of acronyms that I
quickly converted into a Python structure to initialize a dictionary.
The match I used was:
if acrs.has_key(str[i:i+4]):
USA:<acr>USA</acr>
USA is marked up with the
acr tag. I realized that some acronyms could be
generalized: if the first two letters can be captured, any remaining
uppercase letters are probably part of the acronym. I came up with
this as an entry:
BD:*
BD, I can keep on
looking for more uppercase letters, up until a terminal.
java -jar uphill.jar
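The dictionary-driven acronym matching described above can be sketched in modern Python as follows. This is not the author's original script and not part of Uphill; it is a minimal illustration that assumes exact acronyms are held in one set, that wildcard entries in the style of BD:* are held as two-letter prefixes in another, and that a "terminal" means any non-alphanumeric character or the end of the text. The function name `mark_acronyms` is hypothetical.

```python
def mark_acronyms(text, exact, prefixes):
    """Wrap recognized acronyms in <acr> tags.

    exact    -- set of complete acronyms, e.g. {"USA"}
    prefixes -- set of two-letter prefixes that may be followed by more
                uppercase letters, e.g. {"BD"} for an entry like BD:*
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        # Only consider a run of uppercase letters that starts a word.
        if text[i].isupper() and (i == 0 or not text[i - 1].isalnum()):
            j = i
            while j < n and text[j].isupper():
                j += 1
            run = text[i:j]
            # The run must stop at a terminal, not in the middle of a word.
            at_terminal = j == n or not text[j].isalnum()
            if at_terminal and (run in exact or run[:2] in prefixes):
                out.append("<acr>%s</acr>" % run)
                i = j
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


sample = "Made in the USA by BDXY."
print(mark_acronyms(sample, {"USA"}, {"BD"}))
```

Collecting the whole uppercase run before looking it up avoids the fixed four-character slice, at the cost of not matching acronyms embedded in longer uppercase words.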
http://www.jclark.com/sp/. The examples in
this hack assume that SP has been installed in the working directory
for the book's files.
line,desc,quan,date
1,Oak chairs,6,31-Dec-04
2,Dining tables,1,31-Dec-04
3,Folding chairs,4,29-Dec-04
4,Couch,1,31-Dec-04
5,Overstuffed chair,1,30-Dec-04
6,Ottoman,1,31-Dec-04
7,Floor lamp,1,20-Dec-04
8,Oak bookshelves,1,31-Dec-04
9,Computer desk,1,31-Dec-04
10,Folding tables,3,31-Dec-04
11,Oak writing desk,1,28-Dec-04
12,Table lamps,5,26-Dec-04
13,Pine night tables,3,26-Dec-04
14,Oak dresser,1,30-Dec-04
15,Pine dressers,1,31-Dec-04
16,Pine armoire,1,31-Dec-04
http://www.dpawson.co.uk/java/index.html and
extract the JAR file CSVToXML.jar from the ZIP
archive, then place it in the working directory. Enter this command:
java -jar CSVToXML.jar
No property File available; Quitting
CSVToXML 1.0 from Dave Pawson
Usage: java CSVToXML [options] {param=value}...
Options:
-p filename Take properties from named file
-o filename Send output to named file
-i filename Take CSV input from named file
-t Display version and timing information
-? Display this message
http://www.w3.org/People/Raggett/#tidy).
Essentially, it's an open source HTML parser with
the stated purpose of cleaning up and pretty-printing HTML, XHTML,
and even XML. It is now hosted on Sourceforge (http://tidy.sourceforge.net). You can
download versions of Tidy for a variety of platforms there.
<HTML>
<HEAD><TITLE>Time</TITLE></HEAD>
<BODY style="font-family:sans-serif">
<H1>Time</H1>
<TABLE style="font-size:14pt" cellpadding="10">
<TR>
<TH>Timezone</TH>
<TH>Hour</TH>
<TH>Minute</TH>
<TH>Second</TH>
<TH>Meridiem</TH>
<TH>Atomic</TH>
</TR>
<TR>
<TD>PST</TD>
<TD>11</TD>
<TD>59</TD>
<TD>59</TD>
<TD>p.m.</TD>
<TD>true</TD>
</TR>
</TABLE>
</BODY>
</HTML>
-asxhtml
switch:<