Searching
The other major use for string comparison involves searching.[7] By now, it should be apparent that sequences of Unicode code points that aren't bit-for-bit identical should compare as equal in many situations and thus should be returned as a hit from a text searching routine. Not only do you have to deal with Unicode canonical and compatibility equivalents, but you also have to handle linguistically equivalent strings. For example, if you're searching for “cooperate,” you would probably want to find “coöperate” and “co-operate” as well. The best strategy, then, is not to match Unicode code points directly, but rather to convert both the search key and the string being searched to collation ...
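The idea of comparing converted keys rather than raw code points can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the standard library has no Unicode Collation Algorithm, so the sketch substitutes a simple folding (compatibility decomposition, removal of combining marks, case folding, and skipping a small hand-picked set of hyphens) for real collation elements. The names `IGNORABLE`, `fold_key`, and `find_all` are hypothetical and chosen for illustration.

```python
import unicodedata

# Assumption: characters treated as ignorable for matching purposes.
# A real collator takes this information from its collation tables.
IGNORABLE = {"-", "\u2010", "\u2011", "\u00ad"}


def fold_key(text):
    """Fold text into a list of (folded_char, original_index) pairs.

    This is a rough stand-in for converting text to primary-strength
    collation elements; it is not the Unicode Collation Algorithm.
    """
    out = []
    for i, ch in enumerate(text):
        if ch in IGNORABLE:
            continue
        # Decompose so that accents become separate combining marks.
        for d in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(d):
                continue  # drop accents and other combining marks
            for f in d.casefold():
                out.append((f, i))
    return out


def find_all(haystack, needle):
    """Return (start, end) spans in haystack whose folded form matches needle."""
    hay = fold_key(haystack)
    key = [c for c, _ in fold_key(needle)]
    if not key:
        return []
    hay_chars = [c for c, _ in hay]
    hits = []
    for s in range(len(hay_chars) - len(key) + 1):
        if hay_chars[s:s + len(key)] == key:
            start = hay[s][1]
            # End just past the original character that produced the last
            # folded character; trailing combining marks are not included.
            end = hay[s + len(key) - 1][1] + 1
            hits.append((start, end))
    return hits
```

Calling `find_all(text, "cooperate")` on a string that contains “coöperate,” “co-operate,” and “Cooperate” should report a span for each of them, because all three fold to the same key. Each folded character keeps the index of the original character it came from, which lets a hit be mapped back to a position in the unmodified text. A production search would more likely rely on a real collation library than on this hand-rolled folding.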