
Chapter 1: CJKV Information Processing Overview
• There is no universally recognized or accepted CJKV encoding method such as ASCII
encoding; again, the various Unicode encoding forms have become the most widely
used encodings for OSes, applications, and web pages.
• There is no universally recognized or accepted input device such as the QWERTY
keyboard array; this same keyboard array, through a method of transliteration, is
frequently used to input most CJKV text through reading or other means.
• CJKV text can be written horizontally or vertically, and requires special typographic
rules not found in Western typography, such as spanning tabs and unique
line-breaking rules.
Learning that the ASCII character set standard is not as universal as most people think
is an important step. You may begin to wonder why so many developers assume that
everyone uses ASCII. This is okay. For some regions, ASCII is sufficient. Still, ASCII has
its virtues. It is relatively stable, and it forms the foundation for many character sets and
encodings. UTF-8 encoding, which is the most common encoding form used for today’s
web pages, uses ASCII as a subset. In fact, it is this characteristic that makes its use
preferred over the other two Unicode encoding forms, specifically UTF-16 and UTF-32.
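
The subset relationship described above can be checked directly. The following is a minimal sketch in Python, not something taken from this chapter: the helper name encoded_forms and the sample strings are illustrative assumptions, and the big-endian variants of UTF-16 and UTF-32 are chosen only so that no byte order mark is added to the output.

```python
def encoded_forms(text):
    """Return the bytes of `text` in the three Unicode encoding forms.

    Hypothetical helper for illustration. The "-be" (big-endian) codecs
    are used so that Python does not prepend a byte order mark.
    """
    return {
        "utf-8": text.encode("utf-8"),
        "utf-16": text.encode("utf-16-be"),
        "utf-32": text.encode("utf-32-be"),
    }


# ASCII-only text: the UTF-8 bytes are identical to the ASCII bytes,
# while UTF-16 and UTF-32 use 2 and 4 bytes per character.
ascii_text = "ASCII"
forms = encoded_forms(ascii_text)
print(forms["utf-8"] == ascii_text.encode("ascii"))   # True
print({name: len(data) for name, data in forms.items()})
# {'utf-8': 5, 'utf-16': 10, 'utf-32': 20}

# Two ideographs from the Basic Multilingual Plane: here UTF-8 needs
# 3 bytes per character, UTF-16 needs 2, and UTF-32 needs 4.
cjk_forms = encoded_forms("漢字")
print({name: len(data) for name, data in cjk_forms.items()})
# {'utf-8': 6, 'utf-16': 4, 'utf-32': 8}
```

The first comparison shows why software that expects ASCII can usually read ASCII-only UTF-8 data unchanged, which is not true of UTF-16 or UTF-32. The second shows the trade-off: for these two ideographs, UTF-8 is larger than UTF-16.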
Over the course of reading this chapter, you will encounter several sections that explain
and illustra ...