Entities

The final bit of WML syntax you need before starting on the range of different elements is the entity. You may recognize entities if you’ve ever had to put certain special symbols (quotes, greater-than and less-than signs, and several others) into an HTML page. Their purpose is to represent symbols that either can’t easily be typed in (you may not have a British pound sign on your keyboard) or that have a special meaning in WML. (For example, if you put a < character into your text normally, the browser thinks it’s the start of a tag; the browser then complains when it can’t find the matching > character to end the tag.)

Table 1-1 displays the three forms of entities in WML. Named entities are something you may be familiar with from HTML: they look like &amp; or &lt;, and each one represents a single character via a mnemonic name. Entities can also be entered in one of two numeric forms (decimal or hexadecimal), allowing you to enter any Unicode character into your WML. (This doesn’t guarantee that the browser can display it, but at least you can try.) Decimal numeric entities look like &#33; (Unicode exclamation mark) or &#163; (Unicode pound sign). Hexadecimal numeric entities look like &#x21; or &#xA3; for the same two characters (note that 33 decimal is 21 hexadecimal, and 163 decimal is A3 hexadecimal).
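The relationship between the two numeric forms can be checked with a few lines of Python. This is only a sketch: `numeric_forms` is a hypothetical helper written for this illustration, and `html.unescape` is Python's HTML decoder rather than a WML parser, although it resolves numeric entities the same way.

```python
import html

def numeric_forms(ch):
    """Return the decimal and hexadecimal entity for a single character."""
    code = ord(ch)  # Unicode code point of the character
    return f"&#{code};", f"&#x{code:X};"

print(numeric_forms("!"))   # ('&#33;', '&#x21;')
print(numeric_forms("£"))   # ('&#163;', '&#xA3;')

# Both numeric forms decode to the same character.
print(html.unescape("&#163;") == html.unescape("&#xA3;"))  # True
```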

Table 1-1. Named Entities and Their Equivalents

| Named Entity | Decimal Entity | Hexadecimal Entity | Character |
| --- | --- | --- | --- |
| &quot; | &#34; | &#x22; | Double quote (") |
| &amp; | &#38; | &#x26; | Ampersand (&) |
| &apos; | &#39; | &#x27; | Apostrophe (') |
| &lt; | &#60; | &#x3C; | Less than (<) |
| &gt; | &#62; | &#x3E; | Greater than (>) |
| &nbsp; | &#160; | &#xA0; | Nonbreaking space |
| &shy; | &#173; | &#xAD; | Soft hyphen |

Note that all entities start with an ampersand (&) and end with a semicolon (;). This semicolon is very important: some web pages omit it and cause problems for browsers that want correct HTML (most web browsers are forgiving about slightly incorrect HTML syntax, so many common errors slip through). WAP browsers are likely to be stricter about errors like these.
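If you generate WML from a program, it helps to escape text in one place so that the ampersand and semicolon are never forgotten. The following is a minimal sketch, not part of any WML toolkit: the name `escape_wml` and the choice to turn every other non-ASCII character into a decimal entity are assumptions made for illustration.

```python
# The named entities from Table 1-1, keyed by the character they stand for.
WML_NAMED = {
    '"': "&quot;",
    "&": "&amp;",
    "'": "&apos;",
    "<": "&lt;",
    ">": "&gt;",
    "\u00a0": "&nbsp;",
    "\u00ad": "&shy;",
}

def escape_wml(text):
    """Replace special characters with entities (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in WML_NAMED:
            out.append(WML_NAMED[ch])       # use the named form when one exists
        elif ord(ch) > 126:
            out.append(f"&#{ord(ch)};")     # otherwise fall back to the decimal form
        else:
            out.append(ch)
    return "".join(out)

print(escape_wml("Fish & chips < £5"))  # Fish &amp; chips &lt; &#163;5
```

Working one character at a time avoids the classic mistake of escaping the ampersands that earlier replacements have just introduced.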

The last two entities in the table may require some explanation. When the browser needs to break a long line of text in order to fit it onto the screen, it looks for a suitable point at which to break, such as the gap between two words. Normally, this means that lines are broken at spaces.

A nonbreaking space is a special kind of space that doesn’t mark a word boundary, and so the browser doesn’t break the line there. Nonbreaking spaces are useful when the characters surrounding the space are not normal English text. In some computer typesetting systems, they are also used to make the line breaks in long passages of text fall in places that make the text easier to read, but this is unlikely to be of use with WAP.
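The effect can be imitated outside a browser. This sketch uses Python's standard `textwrap` module, which breaks lines only at ordinary ASCII whitespace and therefore never breaks at a nonbreaking space. It stands in for a WAP browser's line breaking here and is not a model of any particular device.

```python
import textwrap

NBSP = "\u00a0"  # the character that &nbsp; represents

plain = "Price: 100 GBP"              # ordinary spaces throughout
glued = "Price: 100" + NBSP + "GBP"   # amount and currency kept together

plain_lines = textwrap.wrap(plain, width=10)
glued_lines = textwrap.wrap(glued, width=10)

print(plain_lines)  # ['Price: 100', 'GBP']
print(glued_lines)  # ['Price:', '100\xa0GBP']
```

With the ordinary space, the currency is stranded on its own line; with the nonbreaking space, the amount and currency move to the next line together.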

Soft hyphens are also linked to line breaking, but instead of preventing a break, they mark a place in a long word where a break is permissible (a discretionary hyphen in computer-typesetting parlance). The hyphen is displayed only if the line is broken at that point.[7]
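The behavior of a soft hyphen can be sketched in the same spirit. The function below is a hypothetical illustration of the rule just described, not the algorithm of any real browser: it breaks a word at the last soft hyphen that lets the first part, plus a visible hyphen, fit within the given width.

```python
SOFT_HYPHEN = "\u00ad"  # the character that &shy; represents

def break_word(word, width):
    """Split word at a soft hyphen so the head fits in width (illustrative only)."""
    parts = word.split(SOFT_HYPHEN)
    visible = "".join(parts)          # soft hyphens are invisible when unused
    if len(visible) <= width:
        return visible, ""            # no break needed, no hyphen shown
    # Try the longest possible head first.
    for i in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:i])
        if len(head) + 1 <= width:    # +1 for the hyphen that becomes visible
            return head + "-", "".join(parts[i:])
    return visible, ""                # no permissible break fits

word = "inter\u00adnation\u00adal\u00adization"
print(break_word(word, 12))  # ('internation-', 'alization')
print(break_word(word, 30))  # ('internationalization', '')
```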



[7] Entities and their different forms are yet another XML feature in WML, although XML allows them to be more complicated than this (you really don’t want to know). HTML users may know that there are many more entities available in HTML, such as &copy; for a copyright symbol, but WML requires that any beyond the few provided be entered using the numeric forms.
