Effective XML: 50 Specific Ways to Improve Your XML by Elliotte Rusty Harold

Sorting

Many CS101 textbooks demonstrate sorting strings by code point order. Unfortunately, this does not work in the real world, even in ASCII, much less in Unicode. Most obviously, real sorts (such as the one in the index at the back of this book) treat capital letters the same as their lowercase equivalents. Lichenstein should appear after language, not before it as it does when ordered by code points. Less obviously, punctuation marks generally appear before all letters, whether they're # (ASCII code point 35), [ (ASCII code point 91), or ~ (ASCII code point 126). In code point order, by contrast, [ falls between the uppercase and lowercase letters, and ~ follows every letter. And of course sorting is language dependent. While converting all characters to uppercase and lexically ordering the resulting strings may give passable results ...
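
The difference is easy to see in a few lines of code. The following is a minimal Python sketch, not taken from the book: the word list is invented for illustration, and `by_code_point` and `by_upper` are hypothetical names. It compares plain code point ordering with the uppercase-then-compare approach described above.

```python
# Illustrative word list (an assumption, not from the book). It mixes case
# and includes the three punctuation marks mentioned in the text.
words = ["language", "Lichenstein", "~tilde", "[bracket", "#hash", "apple", "Zebra"]

# Python compares strings by code point, so a plain sort is code point order.
by_code_point = sorted(words)

# Naive fix: compare uppercased copies of each string instead.
by_upper = sorted(words, key=str.upper)

print(by_code_point)
# ['#hash', 'Lichenstein', 'Zebra', '[bracket', 'apple', 'language', '~tilde']

print(by_upper)
# ['#hash', 'apple', 'language', 'Lichenstein', 'Zebra', '[bracket', '~tilde']
```

Uppercasing puts language ahead of Lichenstein, but [ and ~ still land after the letters, because their code points (91 and 126) are higher than that of Z (90). Placing punctuation first, and handling language-specific rules, takes a real collation algorithm rather than a code point comparison.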
