CJKV Information Processing

Foreword to CJKV Information Processing

by Jack Halpern, Editor in Chief,
CJK Dictionary Publishing Society (CDPS)

In September 1993, an important event in the history of Japanese computing took place: the publication of Ken Lunde's "Understanding Japanese Information Processing." Even today, no other book brings together such a wealth of information on Japanese data processing. My colleague Takeo Suzuki and I had the unique honor of working on the Japanese translation of this book, a work that made a worldwide impact.

Today, an event of even greater importance, a major milestone in the history of East Asian computing, is unfolding: the publication of Ken Lunde's new book, "CJKV Information Processing." No other book even pretends to approach it in either content or quality. Though the range of issues covered is broad, the treatment of each topic is comprehensive and in-depth.

With the recent spread of Unicode as an international character set and the growing importance of East Asia in the world economy, new CJK applications are appearing at an increasingly rapid pace. Many software publishers are "jumping on the CJK wagon" in the hope of tapping the potentially huge East Asian market, which has mostly remained closed to non-Asian software developers.

An important reason for the poor market penetration of these products is their low quality. Many of these applications are simply too primitive to adequately meet the practical needs of users.

This state of affairs can be explained by three interrelated factors. First, this is a young market, and developers have not yet acquired sufficient skills and experience. For example, some developers of Japanese software, who have little or no knowledge of the language, hire outside help such as students, often with disastrous results.

Second, developers often do not have access to high-quality data, especially dictionary data, required for the all-important input method (also known as "front-end processor" or FEP) development. It is easy enough to find such data by surfing the Web, but it requires much skill to eliminate the countless errors and adapt it to specific needs.

Third is the lack of good information on CJKV processing. The issues are complex. The developer must contend with multiple, mostly incompatible encoding systems, different character sets, a bewildering variety of locale-dependent input methods, code conversion between incompatible character sets, and support for Unicode, to mention but a few.
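To make the encoding problem concrete, here is a minimal sketch in Python, which is not drawn from the book itself. It assumes only the codecs that ship with the standard library; the sample string and the helper names encode_all and convert are hypothetical, chosen for illustration. It shows that the same two kanji are stored as different byte sequences under four common encodings, and that conversion between two legacy encodings passes through a decoded (Unicode) string.

```python
# Hypothetical sample text: the two kanji of the word "kanji".
SAMPLE = "漢字"

# Codec names as registered in Python's standard library.
ENCODINGS = ["shift_jis", "euc_jp", "iso2022_jp", "utf-8"]


def encode_all(text):
    """Return a dict mapping each encoding name to the encoded bytes."""
    return {name: text.encode(name) for name in ENCODINGS}


def convert(data, source, target):
    """Convert bytes from one encoding to another via a decoded string."""
    return data.decode(source).encode(target)


if __name__ == "__main__":
    for name, raw in encode_all(SAMPLE).items():
        # Print the byte values in hexadecimal for comparison.
        print(f"{name:>10}: {raw.hex()}")

    sjis = SAMPLE.encode("shift_jis")
    euc = convert(sjis, "shift_jis", "euc_jp")
    # The bytes differ, but both decode back to the same text.
    print(sjis != euc, euc.decode("euc_jp") == SAMPLE)
```

A real conversion routine would also have to decide what to do with characters that the target character set lacks, which this sketch leaves to Python's default strict error handling.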

How does one acquire reliable and detailed information on these issues? Until the appearance of the present work, this was well-nigh impossible. For the first time, Ken Lunde's pioneering work provides nothing less than an inexhaustible source of accurate and complete information on every aspect of CJKV data processing.

Let me illustrate how useful this information was to the dictionary projects of the Kanji Dictionary Publishing Society (KDPS). We have recently completed "The Kodansha Kanji Learner's Dictionary" (Kodansha International, 1998), based on the "New Japanese-English Character Dictionary" (Kenkyusha, 1990; NTC, 1993), of which I am the chief editor. At the same time, we have been developing DESK, a comprehensive CJK database from which dozens of dictionaries, CJK FEP data, and learning aids are being developed.

Though we are dedicated CJK specialists, the information in Ken Lunde's book was invaluable every step of the way. For example, when we created outline fonts for some 1,350 user-defined characters, it helped us decide on encoding ranges and methods. When we switched to a new platform, it helped us write code conversion routines and build function libraries. When we developed our dictionary page composition system, it guided us in the purchase of software and taught us in depth about typography and font technology. And so on and so on.

As can be seen from our example, the aim of this book is highly practical. The author has a thorough grasp of the real needs of such diverse users as software developers, lexicographers, and language learners, and provides detailed information for each need with great clarity and precision. I am fully confident that this book will become an invaluable source of information for everyone interested in CJKV information processing.

Jack Halpern
Editor in Chief, CJK Dictionary Publishing Society (CDPS)
http://www.cjk.org/


© 2001, O'Reilly & Associates, Inc.