
Miscellaneous Algorithms
|
591
e rst algorithm is for the automatic detection of the input le’s encoding. Some so-
ware requires that you specify the encoding method used by the input le: many people
who use Japanese code conversion utilities may not be familiar with the various Japanese
encoding methods. If you do not know, all you can do is guess. e Japanese code detec-
tion algorithm examines the input le in order to determine the encoding method. is
usually makes it unnecessary to specify the input le’s encoding method. However, there
are times when the input le’s encoding may be ambiguous—the Shi-JIS and EUC-JP
encoding ranges overlap considerably, for example.
e second algorithm converts half-width katakana into their full-width counterparts.
Some environments do not provide half-width katakana support, so this algorithm con-
verts these characters into their full-width versions, which are more commonly support-
ed. is algorithm is also quite useful as a lter for outgoing email transmissions in order
to ensure that information interchange is maintained on the receiving end.
e third and nal algorithm repairs damaged ISO-2022-JP–encoded les; that is, les
that had their escape sequences stripped out by unfriendly email or news reading so-
ware. I occasionally received email in this damaged format, and spent a lot of time rein-
serting those lost escape characters. ...