Skip to Main Content
Perl Cookbook
book

Perl Cookbook

by Tom Christiansen, Nathan Torkington
August 1998
Intermediate to advanced content levelIntermediate to advanced
800 pages
39h 20m
English
O'Reilly Media, Inc.
Content preview from Perl Cookbook

Soundex Matching

Problem

You have two English surnames and want to know whether they sound somewhat similar, regardless of spelling. This would let you offer users a “fuzzy search” of names in a telephone book to catch “Smith” and “Smythe” and others within the set, such as “Smite” and “Smote.”

Solution

Use the standard Text::Soundex module:

use Text::Soundex;

 $CODE  = soundex($STRING);
 @CODES = soundex(@LIST);

Discussion

The soundex algorithm hashes words (particularly English surnames) into a small space using a simple model that approximates an English speaker’s pronunciation of the words. Roughly speaking, each word is reduced to a four character string. The first character is an uppercase letter; the remaining three are digits. By comparing the soundex values of two strings, we can guess whether they sound similar.

The following program prompts for a name and looks for similarly sounding names from the password file. This same approach works on any database with names, so you could key the database on the soundex values if you wanted to. Such a key wouldn’t be unique, of course.

use Text::Soundex; use User::pwent; print "Lookup user: "; chomp($user = <STDIN>); exit unless defined $user; $name_code = soundex($user); while ($uent = getpwent()) { ($firstname, $lastname) = $uent->gecos =~ /(\w+)[^,]*\b(\w+)/; if ($name_code eq soundex($uent->name) || $name_code eq soundex($lastname) || $name_code eq soundex($firstname) ) { printf "%s: %s %s\n", $uent->name, $firstname, $lastname; ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Perl in a Nutshell

Perl in a Nutshell

Nathan Patwardhan, Ellen Siever, Stephen Spainhour
Perl Best Practices

Perl Best Practices

Damian Conway
Mastering Perl

Mastering Perl

brian d foy
Perl Cookbook, 2nd Edition

Perl Cookbook, 2nd Edition

Tom Christiansen, Nathan Torkington

Publisher Resources

ISBN: 1565922433Supplemental ContentCatalog PageErrata