
Appendix C: Perl Code Examples
sub han2zen {
  my ($hkana) = @_;
  if ($hkana =~ /^$euc\xB3$euc\xDE/o) { # Special "u + dakuten" case
    if ($euc) {
      $hkana =~ s/$euc\xB3$euc\xDE/\xA5\xF4/go;
    } else {
      $hkana =~ s/\xB3\xDE/\x83\x94/g;
    }
  } elsif ($hkana =~ /^${euc}[\xB6-\xC4\xCA-\xCE]${euc}[\xDE\xDF]/o) {
    $prefix = $kana_one; # First byte for katakana
    if ($hkana =~ /^${euc}[\xCA-\xCE]$euc\xDF/o) {
      $suffix = 2; # Increment value for handakuten
    } else {
      $suffix = 1; # Increment value for dakuten
    }
    $hkana =~ s/$euc([\xB6-\xC4\xCA-\xCE])${euc}[\xDE\xDF]/
      pack("n",unpack("n","$prefix$char_hash{$1}") +
      $suffix)/egox;
  } else {
    if ($hkana =~ /^${euc}[\xA0-\xA5\xB0\xDE\xDF]/o) {
      $prefix = $symbol_one; # First byte for symbol
    } else {
      $prefix = $kana_one; # First byte for katakana
    }
    $hkana =~ s/$euc([\xA0-\xDF])/$prefix$char_hash{$1}/go;
  }
  return $hkana;
}
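The dakuten and handakuten branches of han2zen rely on a layout property of JIS X 0208: each voiced katakana sits one code point after its unvoiced form, and each semi-voiced katakana sits two code points after it. The following is a minimal, self-contained sketch of that arithmetic for EUC-JP only. The helper name add_voicing is hypothetical and is not part of the original program, which obtains the unvoiced character from its own lookup table instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original listing): given the two-byte
# EUC-JP code of an unvoiced full-width katakana and an increment
# (1 = dakuten, 2 = handakuten), return the voiced character's code.
sub add_voicing {
    my ($zkana, $increment) = @_;
    # Treat the two bytes as one big-endian 16-bit integer, add, and repack.
    return pack("n", unpack("n", $zkana) + $increment);
}

my $ha = "\xA5\xCF";            # Full-width katakana HA in EUC-JP
my $ba = add_voicing($ha, 1);   # HA + dakuten    = BA
my $pa = add_voicing($ha, 2);   # HA + handakuten = PA

# Show the resulting byte values in hexadecimal.
printf "%s %s\n", uc(unpack("H*", $ba)), uc(unpack("H*", $pa)); # prints "A5D0 A5D1"
```

The addition never carries into the first byte here, because the katakana that accept these marks all have second bytes well below 0xFF.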
Korean Code Conversion
Although this section does not include any complete Perl programs, the most difficult
algorithms for handling Korean encodings are included as workable subroutines that can be
used in Perl programs. The main focus of this section is Johab encoding, which requires
an algorithm for handling conversion to and from ISO-2022-KR or EUC-KR encodings.
This algorithm is strikingly similar to the one used for handling conversion to and from
Shift-JIS encoding. Because handling Johab encoding also requires mapping tables (for handling the
2,350 ...