At times, it is necessary to look for data that is similar. This could mean words or names that are spelled just a bit differently, or that are the same length, or that start with the same character and are the same length even though the rest of the characters are different. How does one go about coding a routine to find such values?
Matching items that are similar is as much an art as it is a programming discipline. There are many rules that can be implemented, so it is best to determine your exact needs or expectations of how the data might be similar, and then code appropriately.
Figure 6-23 shows a table containing similar names. This recipe discusses a few methods for comparing each of these with the name Johnson (which happens to be the first name in the table).
Figure 6-23. A table of similar names
To demonstrate, we'll consider three matching approaches:
The first approach compares the lengths of the two strings and returns a percentage value indicating the closeness of the match. A result of 1 means the strings are exactly the same length; a lower result indicates that the record value is shorter, and a higher result indicates that it's longer.
The second approach returns a count of characters that match at the same position in each string, and the overall percentage of the match.
The third approach returns a 1 or a 0, respectively, ...
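As a rough illustration of the first two approaches only, here is a minimal Python sketch. It is not the recipe's own code. The function names `length_ratio` and `positional_match` are hypothetical, and the sample names are made up for the example rather than taken from Figure 6-23. Two details are assumptions, since the text does not state them: the comparison ignores case, and the match percentage is calculated against the length of the target name.

```python
def length_ratio(target: str, candidate: str) -> float:
    """Return the candidate's length relative to the target's length.

    1.0 means equal length, below 1.0 means the candidate is shorter,
    and above 1.0 means it is longer.
    """
    if not target:
        raise ValueError("target must not be empty")
    return len(candidate) / len(target)


def positional_match(target: str, candidate: str) -> tuple[int, float]:
    """Count the characters that agree at the same position in both strings.

    Returns the count and the count divided by the target's length.
    Assumptions: case is ignored, and the target length is the denominator.
    """
    if not target:
        raise ValueError("target must not be empty")
    # zip() stops at the shorter string, so extra trailing characters never match.
    pairs = zip(target.casefold(), candidate.casefold())
    matches = sum(1 for a, b in pairs if a == b)
    return matches, matches / len(target)


# Illustrative names only; these are not the contents of Figure 6-23.
for name in ["Johnson", "Johnsen", "Jonson", "Johnstone"]:
    count, share = positional_match("Johnson", name)
    print(f"{name}: length ratio {length_ratio('Johnson', name):.2f}, "
          f"{count} matching positions ({share:.0%})")
```

Note how a single dropped character affects the positional count: "Jonson" agrees with "Johnson" only in its first two positions, because every later character is shifted by one. This is one reason to decide up front how the data is expected to differ before choosing a rule.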