Unicode Demystified by Richard Gillam

The Basics of Language-Sensitive String Comparison

The first thing to remember is that you can't simply rely on comparison of the numeric code point values when you're comparing two strings. Unless the strings that may be compared conform to a very tightly restricted grammar, this approach will always give you the wrong answer. (The one exception occurs when the ordering and equivalences implied by the comparison routine will have no user-visible effects, but even then you must worry about some wrinkles—see “Language-Insensitive String Comparison” later in this chapter.)
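As a small illustration of the point above, the following Python sketch sorts a few sample words by raw code point value and then by a deliberately naive key. The word list and the helper `naive_key` are hypothetical and chosen only for this example; `naive_key` is not a real collation algorithm, just enough folding to show how far code point order can drift from what a reader expects. A real application would rely on a proper collation library (ICU, for instance) rather than anything like this.

```python
import unicodedata

# Sample words chosen for illustration; "\u00e9cole" is "école" with a
# precomposed e-acute (U+00E9).
words = ["zebra", "\u00e9cole", "apple", "Zebra"]

# Plain sorted() compares strings code point by code point, so every
# uppercase ASCII letter sorts before every lowercase one, and the
# accented letter sorts after "z".
by_code_point = sorted(words)


def naive_key(s):
    """Hypothetical, simplified sort key (NOT a real collation algorithm).

    Decomposes the string, drops combining marks, and case-folds, so that
    accents and case are ignored on the first pass. The original string is
    kept as a tie-breaker so the ordering stays deterministic.
    """
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), s)


by_naive_key = sorted(words, key=naive_key)

print(by_code_point)
print(by_naive_key)
```

The first list places "Zebra" ahead of "apple" and pushes "école" to the very end, which is the kind of ordering a user would see as wrong. The second list is closer to dictionary order for these particular words, but only because the sample is so small; the folding used here ignores language-specific rules entirely.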

This isn't just a Unicode issue. Any binary comparison will give the wrong answers with most encodings. In fact, for every encoding standard, it's probably possible to come up ...
