
Unicode Demystified by Richard Gillam

Standard Compression Scheme for Unicode

When Unicode first came out, one of the major reasons for resistance to it was the prospect of text files taking up twice as much room as before to store the same amount of information. For languages such as Chinese and Japanese, whose encodings already used two bytes per character, this wasn't a problem. Using two bytes per character for the Latin alphabet, however, was anathema to a lot of people.

The concern is certainly legitimate: The same document takes up twice as much space on disk and takes twice as long to send over a communications link. A database column containing text takes up twice as much disk space. In an era of slow file downloads, for example, the idea of waiting twice as ...
