Sorting Internationalized Strings
One big advantage you get with Strings is that they are built (almost) from the
ground up to support internationalization. This means that the
Unicode character
set is the lingua franca in Java. Unfortunately, because Java stores Unicode
text as two-byte (16-bit) characters, many string libraries written for
one-byte characters do not work well when ported to Java. Most
string-search optimizations use lookup tables to speed up searching, and
the table size is tied to the size of the character set. For
example, a traditional Boyer-Moore string search needs a large table and
a long initialization phase when used with Unicode.
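
To make the table-size point concrete, here is a minimal sketch, not taken from the book, of the Horspool simplification of Boyer-Moore, which keeps only the bad-character shift table. The class name BadCharTableSketch and the alphabetSize parameter are hypothetical names chosen for illustration. The table has one entry per possible character value, so a one-byte character set needs 256 entries, while Java's 16-bit char needs 65,536, all of which must be initialized before the search can start.

```java
import java.util.Arrays;

// Illustrative sketch only: Horspool variant of Boyer-Moore (bad-character rule only).
final class BadCharTableSketch {

    // Builds the shift table with one entry per possible character value.
    // Assumption: every character in the pattern and the text is < alphabetSize;
    // otherwise the array lookups throw ArrayIndexOutOfBoundsException.
    static int[] buildShiftTable(String pattern, int alphabetSize) {
        int m = pattern.length();
        int[] shift = new int[alphabetSize];
        Arrays.fill(shift, m);                 // default: skip the whole pattern
        for (int i = 0; i < m - 1; i++) {      // last pattern character is excluded
            shift[pattern.charAt(i)] = m - 1 - i;
        }
        return shift;
    }

    // Returns the index of the first match of pattern in text, or -1 if none.
    static int indexOf(String text, String pattern, int[] shift) {
        int m = pattern.length();
        int n = text.length();
        if (m == 0) {
            return 0;
        }
        int pos = 0;
        while (pos <= n - m) {
            int j = m - 1;
            // Compare right to left within the current window.
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) {
                return pos;
            }
            // Shift by the table entry for the text character under the window's end.
            pos += shift[text.charAt(pos + m - 1)];
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] oneByte = buildShiftTable("needle", 256);
        int[] twoByte = buildShiftTable("needle", Character.MAX_VALUE + 1);
        System.out.println("one-byte table entries: " + oneByte.length);
        System.out.println("two-byte table entries: " + twoByte.length);
        System.out.println(indexOf("haystack with a needle inside", "needle", twoByte));
    }
}
```

The search loop is identical for both tables. Only the allocation and fill cost changes, which is why the overhead is most noticeable for short texts or one-off searches where the setup is not amortized.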