
          }
        }
        else if (c <= 159)
          whatcode = SJIS; /* Shift-JIS detected. */
      }
      else if (c >= 240 && c <= 254)
        whatcode = EUC; /* EUC-JP detected. */
      else if (c >= 224 && c <= 239) {
        c = fgetc(in); /* Read next byte to c. */
        if ((c >= 64 && c <= 126) || (c >= 128 && c <= 160))
          whatcode = SJIS; /* Shift-JIS detected. */
        else if (c >= 253 && c <= 254)
          whatcode = EUC; /* EUC-JP detected. */
        else if (c >= 161 && c <= 252)
          whatcode = EUCORSJIS; /* Ambiguous (Shift-JIS or EUC-JP). */
      }
    }
  }
  return whatcode; /* Return the detected code. */
}
Appendix C provides Perl code for a much more flexible way to automatically detect the
encoding of CJKV text files, not only those for Japanese. That Perl code shows how
powerful regular expressions can be when used in specific contexts.
Half- to Full-Width Katakana Conversion—in Java
It is sometimes necessary to convert half-width katakana to their full-width counterparts.
This is most useful as a filter to ensure that no half-width katakana characters are included
within email messages. It is also useful when you need to move files from one platform to
another and the new platform does not support half-width katakana characters. Example
usage of this Java method is as follows:
String half = "\uFF76\uFF9E";
String full = KatakanaFilter.halfToFullWidthKatakana(half);
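For comparison, the same input can be converted with the Java standard library alone. The following is a minimal sketch, not the KatakanaFilter method used above. It swaps in a different technique, Unicode NFKC normalization through java.text.Normalizer. The class name HalfWidthKatakanaSketch and the method name halfToFull are hypothetical, chosen only for illustration. Half-width and full-width katakana are not separated by a fixed code-point offset, so the sketch leaves the mapping to the normalizer's tables. It normalizes only runs of characters in the half-width katakana range (U+FF61 through U+FF9F), so other compatibility characters, such as full-width Latin letters, pass through unchanged.

```java
import java.text.Normalizer;

// Hypothetical illustration class; not the KatakanaFilter class from the text.
class HalfWidthKatakanaSketch {

    // True for the half-width katakana range U+FF61..U+FF9F, which also holds
    // half-width punctuation and the voiced (U+FF9E) and semi-voiced (U+FF9F) marks.
    private static boolean isHalfWidthKatakana(char ch) {
        return ch >= '\uFF61' && ch <= '\uFF9F';
    }

    // Converts runs of half-width katakana to full-width form using NFKC.
    // All other characters are copied through untouched.
    public static String halfToFull(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (!isHalfWidthKatakana(input.charAt(i))) {
                out.append(input.charAt(i++));
                continue;
            }
            // Collect the whole run so that a base character and a following
            // sound mark are normalized (and composed) together.
            int start = i;
            while (i < input.length() && isHalfWidthKatakana(input.charAt(i))) {
                i++;
            }
            out.append(Normalizer.normalize(input.substring(start, i),
                                            Normalizer.Form.NFKC));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String half = "\uFF76\uFF9E";            // half-width KA + voiced sound mark
        String full = halfToFull(half);
        System.out.println(full.equals("\u30AC")); // compares against full-width GA
    }
}
```

One caveat of this approach: a sound mark that has no precomposed full-width partner (for example, one that follows a vowel, or that starts a run) comes out of NFKC as a combining mark (U+3099 or U+309A) rather than a spacing one, which may or may not be what a given application wants.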
There is no simple conversion algorithm that you can use to accomplish this ...