October 2013
368 pages
Searching is a common need in many applications. An effective search should find matches even if the user misspells words. Folks misspell my name in endless ways: Langer, Lang, Langur, Lange, and Lutefisk, to name a few. I’d prefer they find me regardless.
In this chapter, we will test-drive a Soundex class that can improve the search capability in an application. The long-standing Soundex algorithm encodes a word as a letter plus three digits, mapping similar-sounding words to the same encoding. Here are the rules for Soundex, per Wikipedia:[4]
Retain the first letter. Drop all other occurrences of a, e, i, o, u, y, h, w.
Replace consonants with digits (after the first letter):
b, f, p, v: 1