O'Reilly logo

PHP 5 Power Programming by Derick Rethans, Stig Sæther Bakken, Andi Gutmans

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

9.6. Multi-Byte Strings and Character Sets

Not all languages use the same character set, not even in the western world. For example, the Š is only part of ISO-8859-2, not of ISO-8859-1. Because these character sets only have 8 bits to use, that only makes 256 different combinations. 8 bits is a problem for languages such as Chinese that have thousands of letters but 8 bits only support 256 characters. That's why the Chinese (and also other Asian scripts) have to use another encoding for their characters, such as BIG5 or GB2312. The Japanse use other encodings for their characters: EUC-JP, JIS, SJIS, and so on. All those different character sets are a problem to work with because some map the same character number to a different character (such ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required