Use Character and Entity References
Not all characters are available on the keyboard! This hack shows you how to represent such characters in an XML document by using decimal and hexadecimal character references, and how to represent entities by using entity references.
In XML, a character reference is formed by surrounding a numeric value with &# and ;, and an entity reference is formed by surrounding a name with & and ;. For example, &#169; is a decimal character reference and &copy; is an entity reference. This hack shows you how to use both.
Character References
According to the third edition of the XML 1.0 specification (http://www.w3.org/TR/REC-xml/), XML processors must accept over 1,000,000 characters, each identified by a hexadecimal code point (http://www.w3.org/TR/REC-xml/#charsets). You won't find all of those characters on your keyboard, but you can use character references instead.
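The figure of over 1,000,000 characters can be checked against the Char production in the specification's character section. The sketch below is only an illustration: the names XML_CHAR_RANGES, is_xml_char, and count_xml_chars are invented here, and the ranges are transcribed from the XML 1.0 Char production.

```python
# Ranges transcribed from the XML 1.0 Char production:
# Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
#        | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML_CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml_char(code_point: int) -> bool:
    """Return True if the code point is a legal XML 1.0 character."""
    return any(lo <= code_point <= hi for lo, hi in XML_CHAR_RANGES)


def count_xml_chars() -> int:
    """Count the code points covered by the Char production."""
    return sum(hi - lo + 1 for lo, hi in XML_CHAR_RANGES)


print(count_xml_chars())    # 1112033
print(is_xml_char(0x2640))  # True: U+2640 is the female sign
print(is_xml_char(0x0))     # False: NUL is not allowed in XML 1.0
```

A character reference to a code point outside these ranges, such as &#0;, is not well-formed, so a check like this can be useful before generating references programmatically.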
Tip
You can look up the semantics of individual Unicode characters at http://www.unicode.org/charts/.
You can reference characters using either decimal or hexadecimal numbers. Which one you use is a matter of style. The document Namen.xml uses both (Example 1-5); it contains some German names enclosed in German language tags.
Example 1-5. Namen.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="Namen.css" type="text/css"?>
<Namen xml:lang="de">
 <Name>
  <Vorname>Marie</Vorname>
  <Nachname>Müller</Nachname>
  <Geschlecht>♀</Geschlecht>
 </Name>
 <Name>
  <Vorname>Klaus</Vorname>
  <Nachname>Müller</Nachname>
 ...
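To make the decimal and hexadecimal forms concrete, here is a small Python sketch, again assuming xml.etree.ElementTree. The helpers decimal_ref and hex_ref and the fragment are hypothetical; the fragment is only modeled on the element names in Namen.xml and is not the listing itself. It writes ü (U+00FC) once as a decimal reference and once as a hexadecimal one, and writes ♀ (U+2640) as a hexadecimal reference.

```python
import xml.etree.ElementTree as ET


def decimal_ref(ch: str) -> str:
    """Build a decimal character reference, e.g. &#252; for U+00FC."""
    return f"&#{ord(ch)};"


def hex_ref(ch: str) -> str:
    """Build a hexadecimal character reference, e.g. &#xFC; for U+00FC."""
    return f"&#x{ord(ch):X};"


U_UMLAUT = "\u00fc"     # ü
FEMALE_SIGN = "\u2640"  # ♀

# Hypothetical fragment modeled on Namen.xml, not the original listing.
fragment = (
    "<Name>"
    f"<Nachname>M{decimal_ref(U_UMLAUT)}ller</Nachname>"
    f"<Nachname>M{hex_ref(U_UMLAUT)}ller</Nachname>"
    f"<Geschlecht>{hex_ref(FEMALE_SIGN)}</Geschlecht>"
    "</Name>"
)
print(fragment)

name = ET.fromstring(fragment)
surnames = [element.text for element in name.findall("Nachname")]

# Both notations resolve to the same character after parsing.
print(surnames[0] == surnames[1])
```

Because the parser resolves both notations to the same character, the choice between them is indeed a matter of style. Hexadecimal has the small advantage of matching the U+XXXX notation used in the Unicode charts.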