CDATA Sections
The golden rule of handling CDATA sections is
this: ignore them. When writing code to process XML, pretend
CDATA sections do not exist, and
everything will work just fine. The content of a CDATA section is plain text. It will be
reported to your application as plain text, just like any other
text, whether enclosed in a CDATA
section, escaped with character references, or typed out literally
when escaping is not necessary. For example, these two example elements are exactly the same as
far as anything in your code should know or care:
<example><![CDATA[<?xml version="1.0"?> <root> Hello! </root>]]></example>
<example>&lt;?xml version="1.0"?&gt; &lt;root&gt; Hello! &lt;/root&gt;</example>
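One way to check this equivalence is to parse both forms and compare the text the parser reports. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The variable names are illustrative only, and the second string is assumed to use &lt; and &gt; escapes in place of the CDATA section.

```python
import xml.etree.ElementTree as ET

# The same content written two ways: once inside a CDATA section,
# once with the markup characters escaped as entity references.
cdata_form = (
    '<example><![CDATA[<?xml version="1.0"?> <root> Hello! </root>]]></example>'
)
escaped_form = (
    '<example>&lt;?xml version="1.0"?&gt; '
    '&lt;root&gt; Hello! &lt;/root&gt;</example>'
)

# The parser hands back plain text in both cases; nothing in the
# resulting element records which syntax the author used.
text_from_cdata = ET.fromstring(cdata_form).text
text_from_escaped = ET.fromstring(escaped_form).text

print(text_from_cdata == text_from_escaped)
```

Code written against this API has no way to tell the two documents apart, which is the behavior a program should rely on.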
Do not write programs or XML documents that depend on knowing
the difference between the two. Parsers rarely (and never reliably)
inform you of the difference. Furthermore, passing such documents
through a processing chain often removes the CDATA sections completely, leaving
the content intact but represented differently, for instance with
numeric character references standing in for characters that cannot be
serialized literally. CDATA sections are a
minor convenience for human authors, nothing more. Do not treat them
as markup.
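The loss of CDATA sections in a processing chain can be illustrated with a simple parse-and-reserialize round trip. This sketch again assumes Python's standard xml.etree.ElementTree module. The sample document is made up for illustration, and other toolkits may choose a different but equivalent serialization.

```python
import xml.etree.ElementTree as ET

# A hypothetical input document whose author used a CDATA section
# to avoid escaping the markup characters by hand.
source = '<example><![CDATA[1 < 2 & 3 > 2]]></example>'

element = ET.fromstring(source)

# Serialize the parsed tree back to a string. The serializer only
# has the text content to work with, so it escapes the special
# characters instead of recreating the CDATA section.
round_tripped = ET.tostring(element, encoding="unicode")

print(round_tripped)

# The content survives the trip even though the CDATA section does not.
reparsed_text = ET.fromstring(round_tripped).text
print(reparsed_text == element.text)
```

Any program that looked for the CDATA section in the output would fail here, while a program that reads only the text content keeps working.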
This also means you should not attempt to nest one XML (or
HTML) document inside another using CDATA sections. XML documents are not designed to nest inside one another. The correct solution to this problem is to use namespaces to sort out which markup is which, rather than trying ...