In my first semester of Spanish class in high school, I went to look up an unfamiliar word, “chaleco,” in a Spanish-English dictionary. I looked under “C”, and found that the dictionary went right from “cetro” to “cía”. Someone had expurgated all the “ch” words! I had a brief nightmarish vision of a world without chorizo, chimichangas, chicharrones, or churros. After some frantic page-turning, I discovered that the “ch” words were in a separate section, “Ch”, between “C” and “D”. I asked the teacher about this, and he explained that it was normal practice for Spanish alphabetical order to consider “Ch” a letter after “C”. But it seemed ludicrous to me—two letters that counted as one. “How pointlessly complicated!” I thought. “Why not just keep it simple, A to Z, like normal? Like English.”
I later learned that every language has its own particular idea of what “alphabetical order” means; the fact that English’s conception of it seems so “normal” is partly because English doesn’t use any accents and partly because of accidents of history.
But many other languages use accented characters that have to be sorted alongside the 26 letters of the “normal” A-through-Z alphabet. In others, certain combinations of characters, like the “ch” in Spanish, count as letters in their own right. And in almost every case, if you want to sort according to the conventions of a particular language, the default behavior of Perl’s sort won’t do it. ...
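To make the mismatch concrete, here is a minimal sketch using the "ch" example from above. The word list and the spanish_key helper are hypothetical, invented only for illustration. The helper covers just the "ch" case and ignores accented characters and every other convention, so treat it as a demonstration of the problem rather than a real solution.

```perl
use strict;
use warnings;

# Hypothetical word list for illustration.
my @words = qw(dedo chaleco curro cetro);

# Perl's default sort compares strings character by character,
# so "ch" words land between "ce" and "cu".
my @default = sort @words;

# Hypothetical helper: build a comparison key in which "ch" sorts
# after every other sequence starting with "c". It replaces the "h"
# with a character that compares higher than any ASCII letter.
# Assumption: input is plain ASCII; accents are not handled.
sub spanish_key {
    my ($word) = @_;
    my $key = lc $word;
    $key =~ s/ch/c\x{FF}/g;
    return $key;
}

# Sort by the derived keys instead of by the raw strings.
my @traditional = sort { spanish_key($a) cmp spanish_key($b) } @words;

print "default:     @default\n";      # default:     cetro chaleco curro dedo
print "traditional: @traditional\n";  # traditional: cetro curro chaleco dedo
```

With the default sort, "chaleco" falls between "cetro" and "curro", as an English speaker would expect. With the key-based comparison it moves to a "Ch" section of its own, after all the other "c" words and before "dedo", which is where the dictionary in the story had it.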