Entity References
Isolated markup characters (such as <, &, and >) are not permitted in the flow of text in an XML document and must be escaped using either a Numeric Character Reference
or a predefined character entity. This is to avoid having the XML parser interpret any < symbol as the beginning of a new tag. In addition to using entity references
in the content of the document, you must use them in attribute values.
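As a rough illustration of this rule, the following Python sketch escapes markup characters in both element content and an attribute value, then parses the result back. Python, the `note` element, and the sample strings are assumptions chosen for the example; `escape` and `quoteattr` come from the standard library's `xml.sax.saxutils` module.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Hypothetical sample data containing markup characters.
text = "if (a < b && b > c) return;"
title = 'He said "hi" & left'

# escape() replaces &, <, and > in element content.
# quoteattr() escapes the value and wraps it in suitable quotes.
xml_doc = "<note title=%s>%s</note>" % (quoteattr(title), escape(text))

# The parser turns the references back into the original characters.
root = ET.fromstring(xml_doc)
print(root.text)
print(root.get("title"))

# An unescaped < in content makes the document not well-formed.
try:
    ET.fromstring("<note>a < b</note>")
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)
```

Escaping on output and letting the parser resolve the references on input keeps the stored text identical to what the author wrote.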
XML defines five character entities for use in all XML languages, listed in Table 7-1. Other entities may be defined in a DTD.
Table 7-1. Predefined character entities in XML
| Entity | Char | Notes |
|---|---|---|
| &amp; | & | Must not be used inside processing instructions |
| &lt; | < | Use inside attribute values |
| &gt; | > | Use after ]] in character data |
| &quot; | " | Use inside attribute values quoted with " |
| &apos; | ' | Use inside attribute values quoted with ' |
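To see how a parser resolves the five predefined entities in Table 7-1, along with Numeric Character References, a short Python sketch can help. The element name `p`, the attribute name `q`, and the sample document are hypothetical; the parser is the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# A small document using all five predefined entities, plus decimal
# (&#60;) and hexadecimal (&#x3C;) character references for "<".
sample = "<p q='&quot;&apos;'>&amp; &lt; &gt; &#60; &#x3C;</p>"

elem = ET.fromstring(sample)

# Each reference is replaced by the character it stands for.
content = elem.text
attr = elem.get("q")
print(content)
print(attr)
```

Here `content` holds the literal characters `& < > < <`, and `attr` holds a double quote followed by an apostrophe.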
If you have a document that uses a lot of special characters, such as an example of source code, you can tell the XML parser that the text is simple character data (CDATA) and should not be parsed. To protect content from parsing, enclose it in a CDATA section, indicated by <![CDATA[ ... ]]>. This XHTML example uses a CDATA section to display sample markup on a web page without requiring every < and > character to be escaped:
<p>This is sample SMIL markup:</p>
<![CDATA[
<audio src="audio_file.mp3" begin="0s" />
<seq>
<img src="image_1.jpg" begin="0s" />
<img src="image_2.jpg" ...