
functions include code conversion between major legacy encodings and Unicode
encodings, character classification (identification of a character), and character
property conversion (such as half- to full-width katakana conversion). Basis
Technology also offers a general-purpose code conversion utility, called Uniconv,
built using this library.
Another well-made and well-established globalization library that deserves exploration
and strong consideration is ICU (International Components for Unicode), which is
portable across many platforms, including Mac OS X.* ICU includes superb support for
Unicode and works well with C/C++ and Java.
Code Conversion Algorithms
It is very important to understand that only the encoding methods for the national
character sets are mutually compatible, and work quite well for round-trip conversion.†
The vendor-defined character sets often include characters that do not map to anything
meaningful in the national character set standards. When dealing with the Japanese
ISO-2022-JP, Shift-JIS, and EUC-JP encodings, for example, algorithms are used to
perform code conversion; this involves mathematical operations that are applied equally
to every character represented under an encoding method. This is known as algorithmic
conversion.
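
The arithmetic behind algorithmic conversion can be sketched as follows for two-byte JIS X 0208 characters. This is a minimal illustration rather than a complete converter: the function names are hypothetical, and the sketch assumes that each input is a valid byte pair (both bytes in the range 0x21 to 0x7E for the 7-bit JIS form used inside ISO-2022-JP). It does not handle escape sequences, ASCII, half-width katakana, or vendor-defined extensions.

```python
# Sketch of algorithmic conversion among the three Japanese encodings.
# All function names are hypothetical; inputs are assumed to be valid
# two-byte JIS X 0208 characters (no validation is performed).

def jis_to_euc(j1, j2):
    """7-bit JIS to EUC-JP: set the high bit of both bytes."""
    return j1 | 0x80, j2 | 0x80

def euc_to_jis(e1, e2):
    """EUC-JP to 7-bit JIS: clear the high bit of both bytes."""
    return e1 & 0x7F, e2 & 0x7F

def jis_to_sjis(j1, j2):
    """7-bit JIS to Shift-JIS: two JIS rows are folded into one lead byte."""
    row_offset = 0x70 if j1 < 0x5F else 0xB0
    if j1 % 2:
        # Odd row: lower half of the second-byte range, skipping 0x7F.
        cell_offset = 0x20 if j2 > 0x5F else 0x1F
    else:
        # Even row: upper half of the second-byte range.
        cell_offset = 0x7E
    return ((j1 + 1) >> 1) + row_offset, j2 + cell_offset

def sjis_to_jis(s1, s2):
    """Shift-JIS to 7-bit JIS: the inverse of jis_to_sjis."""
    odd_row = s2 < 0x9F  # second bytes below 0x9F belong to an odd JIS row
    row_offset = 0x70 if s1 < 0xA0 else 0xB0
    if odd_row:
        cell_offset = 0x20 if s2 > 0x7F else 0x1F
    else:
        cell_offset = 0x7E
    j1 = ((s1 - row_offset) << 1) - (1 if odd_row else 0)
    return j1, s2 - cell_offset

if __name__ == "__main__":
    # Hiragana "a" is 0x2422 in 7-bit JIS.
    print([hex(b) for b in jis_to_euc(0x24, 0x22)])
    print([hex(b) for b in jis_to_sjis(0x24, 0x22)])
```

Because the same operations apply to every character, no mapping table is needed, and converting in one direction and then back returns the original bytes.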
However, dealing with Unicode encoding forms such as UTF-8, UTF-16, and UTF-32,
and when mappin ...