HTML4 Entity Sets
HTML 4.0 predefines several hundred named entities, many of which
are quite useful. For instance, the nonbreaking space is
&nbsp;. XML, however, defines only five
named entities:
- &amp;
The ampersand (&)
- &lt;
The less-than sign (<)
- &gt;
The greater-than sign (>)
- &quot;
The straight double quote (")
- &apos;
The straight single quote (')
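As a minimal sketch of where these five references are typically needed, consider the following well-formed document. The element and attribute names (examples, expr, quote, text) are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document: element and attribute names are illustrative only -->
<examples>
  <!-- A literal ampersand or less-than sign must be escaped in character data -->
  <expr>x &lt; y &amp;&amp; y &gt; z</expr>
  <!-- &quot; lets a double quote appear inside a double-quoted attribute value -->
  <quote text="She said &quot;hello&quot;"/>
  <!-- &apos; does the same inside a single-quoted attribute value -->
  <quote text='It&apos;s here'/>
</examples>
```

Escaping the greater-than sign is usually optional in content, but writing &gt; consistently does no harm.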
Other needed characters can be inserted with character
references in decimal or hexadecimal format. For instance, the
nonbreaking space is Unicode character 160 (decimal). Therefore, you
can insert it in your document as either &#160; or &#xA0;. If you really want to type it as &nbsp;, you can define this
entity reference in your DTD. Doing so requires you to use a character
reference:
<!ENTITY nbsp "&#160;">
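To make the three spellings concrete, here is a small sketch that places the declaration in a document's internal DTD subset. The para element is a hypothetical name, and the document is meant to be well-formed rather than valid, since it declares no elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE para [
  <!-- Define nbsp by way of a decimal character reference -->
  <!ENTITY nbsp "&#160;">
]>
<!-- Hypothetical document: all three forms below produce U+00A0 -->
<para>10&#160;km, 10&#xA0;km, 10&nbsp;km</para>
```

A parser that reads the internal subset should report the same nonbreaking space character for each of the three forms. If the declaration were missing, the &nbsp; reference would be a well-formedness error in a document like this one.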
The XHTML 1.0 specification includes three DTD fragments that define the familiar HTML character entity references:
- Latin-1 characters (http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent)
The non-ASCII, graphic characters included in ISO-8859-1 from code points 160 through 255, shown in Table 27-3
- Special characters (http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent)
A few useful letters and punctuation marks not included in Latin-1
- Symbols (http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent)
The Greek alphabet, plus various arrows, mathematical operators, and other symbols used in mathematics
Feel free to borrow these entity sets for your own use. They should be included in your document’s DTD with these parameter entity references and PUBLIC identifiers: ...