Appendix F. ASCII Table
Gone are the days when ASCII meant just US-ASCII characters 0-127. For over a decade now, Latin-1 support (US-ASCII plus characters 160-255) has been the bare minimum for any Internet application, and support for Unicode (Latin-1 plus characters 256 and up) is becoming the rule more than the exception. Although a full Unicode character chart is a book on its own, this appendix lists all US-ASCII characters, plus all the Unicode characters that are common enough that the current HTML specification (4.01) defines a named entity for them.
Note that at the time of this writing, not all browsers support all these characters, and not all users have installed the fonts needed to display some characters.
Also note that in HTML, XHTML, and XML, you can refer to any Unicode character, regardless of whether it has a named entity (such as &euro;), by using a decimal character reference such as &#8364; or a hexadecimal character reference such as &#x20ac; (note the leading x). See http://www.unicode.org/charts/ for a complete reference for Unicode characters.
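As a rough illustration of the three reference forms described above, the following Python sketch builds the named, decimal, and hexadecimal references for a single character. The function name character_references is hypothetical and used only for this example. The lookup relies on the standard library's html.entities.codepoint2name table, which covers the HTML 4.01 named entities. Characters without a named entity simply get None for that form.

```python
import html
from html.entities import codepoint2name


def character_references(char):
    """Return the named (if any), decimal, and hex references for one character.

    Hypothetical helper for illustration only; not part of any library.
    """
    code_point = ord(char)
    # codepoint2name holds the HTML 4.01 entity names; many characters have none.
    name = codepoint2name.get(code_point)
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{code_point};",
        "hex": f"&#x{code_point:x};",  # note the leading x
    }


if __name__ == "__main__":
    euro = "\u20ac"  # the euro sign
    refs = character_references(euro)
    print(refs)
    # Every available form should decode back to the same character.
    for ref in refs.values():
        if ref is not None:
            assert html.unescape(ref) == euro
```

Decoding each form with html.unescape is a simple way to check that the references really are interchangeable for a given character.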
Dec | Hex | Char | Octal | Raw encoding | UTF8 encoding | HTML entity | Description |
0 | 0000 | | 000 | 0x00 | 0x00 | | NUL |
1 | 0001 | | 001 | 0x01 | 0x01 | | SOH |
2 | 0002 | | 002 | 0x02 | 0x02 | | STX |
3 | 0003 | | 003 | 0x03 | 0x03 | | ETX |
4 | 0004 | | 004 | 0x04 | 0x04 | | EOT |
5 | 0005 | | 005 | 0x05 | 0x05 | | ENQ |
6 | 0006 | | 006 | 0x06 | 0x06 | | ACK |
7 | 0007 | | 007 | 0x07 | 0x07 | | BEL, bell, alarm, \a |
8 | 0008 | | 010 | 0x08 | 0x08 | | BS, backspace, \b |
9 | 0009 | | 011 | 0x09 | 0x09 | | HT, tab, \t |
10 | 000a | | 012 | 0x0A | 0x0A | | LF, line feed, \cj |
11 | 000b | | 013 | 0x0B | 0x0B | | VT |
12 | 000c | | 014 | 0x0C | 0x0C | | FF, NP, form feed, \f |
13 | 000d | | 015 | 0x0D |