Java I/O by Elliotte Rusty Harold

Displaying Unicode Text

Although internally Java can handle full Unicode data (it’s just numbers, after all), not all Java environments can display all Unicode characters. In fact, I’ll go so far as to say none of the current Java environments, whether standalone virtual machines or web browsers, can display all Unicode characters.
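A program can at least ask whether a given string is likely to render before trying to draw it. The following is a minimal sketch, not a definitive implementation: the class `DisplayCheck`, the method `firstUndisplayable`, and the ASCII-only stand-in predicate are all hypothetical names introduced for illustration. On a JDK with AWT and lambdas available, the `canDisplay(int)` method of a `java.awt.Font` instance could be supplied as the predicate, and `Font.canDisplayUpTo(String)` performs a similar scan directly.

```java
import java.util.function.IntPredicate;

// Hypothetical helper: scans a string for the first code point that a
// caller-supplied predicate says cannot be displayed.
final class DisplayCheck {

    // Returns the char index of the first code point rejected by canDisplay,
    // or -1 if every code point in text is accepted.
    static int firstUndisplayable(String text, IntPredicate canDisplay) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (!canDisplay.test(codePoint)) {
                return i;
            }
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(codePoint);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumption for illustration only: pretend the environment can draw
        // nothing beyond ASCII (code points 0 through 127). With AWT present,
        // a java.awt.Font instance's canDisplay(int) could be used instead,
        // for example as the method reference font::canDisplay.
        IntPredicate asciiOnly = cp -> cp < 128;

        System.out.println(firstUndisplayable("caf\u00E9", asciiOnly)); // 3
        System.out.println(firstUndisplayable("plain", asciiOnly));     // -1
    }
}
```

Passing the check as a predicate keeps the scan independent of any particular font API, so the same loop can be tried against different fonts or environments.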

Unicode is divided into blocks. For example, characters 0 through 127 are the Basic Latin block and contain ASCII. Characters 128 through 255 are the Latin-1 Supplement block and contain the upper 128 characters of the Latin-1 character set. Characters 9,984 through 10,175 are the Dingbats block and contain the characters in the popular Zapf Dingbats font. Characters 19,968 through 40,959 are the unified Chinese-Japanese-Korean ideograph block. Each block represents a script or a subset of a script. As a rule of thumb, most runtime environments can display only some of these blocks. Occasionally, a particular runtime may be able to display some characters from a block but not others. For instance, most Macintoshes can display the entire Latin-1 Supplement block except for the Icelandic characters þ, Þ, Ý, Ð, and ð.
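The block boundaries can be checked from Java itself. The sketch below assumes a JDK that provides `Character.UnicodeBlock.of(int)`; the class name `UnicodeBlockDemo` and the helper `blockName` are hypothetical names used only for illustration. It looks up the block for the first and last code point of each range mentioned above. The JDK names the block covering 128 through 255 `LATIN_1_SUPPLEMENT`, and `LATIN_EXTENDED_A` begins at 256.

```java
// Sketch: report which Unicode block a handful of code points belong to.
final class UnicodeBlockDemo {

    // Returns the JDK's name for the block containing codePoint, or
    // "UNASSIGNED" if the code point is not a member of any known block.
    static String blockName(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == null ? "UNASSIGNED" : block.toString();
    }

    public static void main(String[] args) {
        // First and last code points of the ranges discussed in the text,
        // plus 256 to show where the next block starts.
        int[] samples = {0, 127, 128, 255, 256, 9984, 10175, 19968, 40959};
        for (int codePoint : samples) {
            System.out.printf("%6d (U+%04X) -> %s%n",
                    codePoint, codePoint, blockName(codePoint));
        }
    }
}
```

Knowing a character's block says nothing about whether the current environment can draw it; it only identifies which script or subset of a script the character belongs to.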

The biggest problem is the lack of fonts. Few computers have fonts for all the scripts Java supports. Even computers that have access to the necessary fonts can’t install many of them because of their size. A normal 8-bit outline font ranges from about 30K to 60K. A Unicode font that omits the Han ideographs will be about 10 times that size, roughly 300K to 600K. And a full Unicode ...
