XML Hacks by Michael Fitzgerald

Use Elements Instead of Entities to Avoid the “amp Explosion Problem”

Use replaceable elements as a solution to the “amp explosion problem.”

Search for the string &amp;amp; using your favorite search engine. Then search for the string &amp;amp;amp; and then &amp;amp;amp;amp; and so on. You will get lots of hits and see lots of interesting text. Here are some examples I found:

  • Why Choose Auto &amp;amp; Home Insurance

  • po&amp;eacute;sie, nouvelles, th&amp;eacute;&amp;acirc;tre

These strange incantations can be traced back to the entity structure of XML (and SGML before it). Simply put, XML provides a number of ways in which textual units, known as entities [Hack #25], can be spliced into other textual units by an XML parser. The mechanism involves referring to these entities by name: the name is preceded by an ampersand character and followed by a semicolon.
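The splicing mechanism can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's xml.etree.ElementTree parser; the entity name pub, its replacement text, and the helper expanded_text are hypothetical and chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares an entity
# named "pub", and the body refers to it as &pub; (ampersand, name, semicolon).
DOC = """<!DOCTYPE note [
  <!ENTITY pub "Example Press">
]>
<note>Published by &pub;.</note>"""


def expanded_text(doc: str) -> str:
    """Parse the document and return the root element's text, in which
    the parser has replaced each entity reference with its declared text."""
    return ET.fromstring(doc).text


print(expanded_text(DOC))
```

The parser does the splicing: the application never sees the reference &pub;, only the replacement text.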

Some of these entities are built into XML itself and thus are built into every XML parser. The five built-in entities (see Table 7-1) provide ways of encoding characters that would otherwise have special meaning to an XML parser because of their roles in markup.

Table 7-1. XML predefined entities

Entity reference    Description
&lt;                Less-than sign (<)
&gt;                Greater-than sign (>)
&apos;              Apostrophe (')
&quot;              Quotation mark (")
&amp;               Ampersand (&)
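As a rough illustration of the five predefined entities in use, the following Python sketch escapes and unescapes a string with the standard library's xml.sax.saxutils module. By default, escape and unescape handle only &, <, and >, so the two quote characters are passed in explicitly. The names QUOTES, UNQUOTES, escape_all, unescape_all, and SAMPLE are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; the two quote characters
# are supplied as extra mappings so that all five entities are used.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}


def escape_all(text: str) -> str:
    """Replace the five markup-significant characters with entity references."""
    return escape(text, QUOTES)


def unescape_all(text: str) -> str:
    """Reverse escape_all, turning entity references back into characters."""
    return unescape(text, UNQUOTES)


SAMPLE = "Tom & Jerry's \"<b>\""
print(escape_all(SAMPLE))
print(unescape_all(escape_all(SAMPLE)))
```

Escaping once and unescaping once returns the original string; trouble starts only when the two steps are not applied the same number of times.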

The troublesome entity here is the ampersand. Note that the escaped version of it features an ampersand character—the very character we are ...
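One way to see why the ampersand is troublesome is to escape text that has already been escaped: every pass turns each & into &amp;, including the ampersands introduced by the previous pass. The sketch below is a minimal demonstration, assuming Python's xml.sax.saxutils.escape stands in for whatever tool does the re-escaping; the helper escape_n_times is hypothetical.

```python
from xml.sax.saxutils import escape


def escape_n_times(text: str, n: int) -> str:
    """Apply XML escaping n times in a row, as happens when text passes
    through several layers that each escape it again."""
    for _ in range(n):
        text = escape(text)
    return text


for n in range(4):
    print(n, escape_n_times("Auto & Home", n))
# 0 Auto & Home
# 1 Auto &amp; Home
# 2 Auto &amp;amp; Home
# 3 Auto &amp;amp;amp; Home
```

Each extra pass adds one more amp; to the string, which is the kind of text the searches at the start of this hack turn up.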
