
276 | Chapter 4: Encoding Methods
Table 4-78. CJKV encodings and the character sets they support

Encoding        Character sets
HZ              GB 1988-89 [a] and GB 2312-80
GBK             ASCII and GBK
GB 18030        ASCII, GBK, and GB 18030-2005
Big Five        ASCII, Big Five, and Hong Kong SCS-2008
Big Five Plus   ASCII, Big Five, and Big Five Plus
Shift-JIS       JIS X 0201-1997 [b] and JIS X 0208:1997
Johab           KS X 1003:1993 [a], KS X 1001:2004, and additional hangul syllables

a. Or ASCII.
b. Or ASCII instead of the JIS-Roman portion.

There are some character sets that are supported by a single encoding, such as GBK, GB
18030, Big Five, Big Five Plus, and Hong Kong SCS-2008. Their designations are unique
in that they can refer to either their character set or their encoding.
I should also point out that although “charset” is obviously a contraction of
“character set,” what it actually denotes is a combination of one or more character
sets and an encoding method.
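
To make the distinction concrete, the following is a minimal Python sketch rather than anything drawn from a particular registry or standard. It assumes Python's built-in codecs named shift_jis, euc_jp, and iso2022_jp, all of which cover the JIS X 0208 character set, and it uses the string 漢字 purely as sample text. The helper name encode_with_each is my own, chosen for illustration.

```python
# Codec names as spelled in Python's standard library (an assumption of this
# sketch; other environments may spell the same encodings differently).
JAPANESE_CODECS = ("shift_jis", "euc_jp", "iso2022_jp")


def encode_with_each(text, codecs_to_try=JAPANESE_CODECS):
    """Return a dict mapping each codec name to the bytes it produces for text."""
    return {name: text.encode(name) for name in codecs_to_try}


sample = "漢字"  # two sample characters that are in JIS X 0208
encoded = encode_with_each(sample)

for name, data in encoded.items():
    # Same characters, but a different byte sequence under each encoding.
    print(f"{name:<12} {data.hex()}")
    # The bytes decode back to the text when the matching encoding is named.
    assert data.decode(name) == sample
```

The character set is the same in all three cases, yet the byte sequences differ, which is why a charset designator must identify the encoding as well as the character sets.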
Charset Registries
Now that the distinction between a character set and an encoding has been made clear,
the question becomes what designation is appropriate for the purpose of relaying content
information for documents, such as in the context of email, HTML, XML, and so on. In
my opinion, encoding names make the best charset designators because there is little or
no ambiguity or confusion as to what character sets they can support.
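
As a rough illustration of a charset designator at work, here is a small Python sketch. It assumes that content arrives with a MIME-style Content-Type value such as "text/plain; charset=Shift_JIS". It uses the standard library's email.message.Message class to extract the charset parameter. The function names charset_from_content_type and decode_body, the sample header values, and the fallback to ASCII are hypothetical choices made for this example.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_charset()


def decode_body(body, header_value, default="ascii"):
    """Decode body bytes using the charset named in the Content-Type value.

    The ASCII fallback is an assumption of this sketch, not a rule taken
    from any registry.
    """
    charset = charset_from_content_type(header_value) or default
    return body.decode(charset)


# Sample content: the bytes were produced with the Shift-JIS encoding, and
# the header says so by naming that encoding as the charset.
body = "漢字".encode("shift_jis")
text = decode_body(body, "text/plain; charset=Shift_JIS")
print(text)
```

Because the designator names an encoding, the receiving side can select a decoder directly from it.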
There have been three primary registries for charset designator ...