Chapter 3. Names, Names, Names
What can I say? Names are hard. Is it James or Jim? Spell-checking is impossible: people name their kids anything. Add in cross-cultural differences, and it becomes very hard to do much with names; but we must try! The rest of this chapter assumes you’re working with a system whose customer records, when the customer is a human, are stored with fields similar to the ones described next.
What’s in a Name?
You’ve had to fill out forms with your name since kindergarten. You know the drill on how they are supposed to work. The first three are the most common:
- Last name (or family name or surname). Maybe you have fields for matronymics and patronymics, too.
- First name (or given name). May be optional.
- Middle name (or middle initial or middle names). Optional.
- Nickname(s). Optional.
- Suffix. Optional.
- Titles and honorifics. Optional, and you’ll learn why we’ll ignore them.
- Full name. Often synthesized from the others, but woe to you if your incoming data only has this; we will talk about it at the end!
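To make the field list concrete, here is a minimal sketch of what such a record might look like and how a full name could be synthesized from the other fields. Everything in it is an assumption made for illustration: the table name `customer`, the column names, the choice to require only the last name, and the sample rows are hypothetical, and SQLite (run through Python's built-in `sqlite3` module) is used only because it is easy to try out. Your own system will almost certainly differ.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical customer table mirroring the fields listed above.
# Only last_name is required here; that is an assumption, not a rule.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        last_name   TEXT NOT NULL,
        first_name  TEXT,
        middle_name TEXT,
        nickname    TEXT,
        suffix      TEXT,
        title       TEXT
    )
""")

# Made-up sample rows: one with most fields filled in, one sparse.
conn.executemany(
    "INSERT INTO customer "
    "(last_name, first_name, middle_name, nickname, suffix, title) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Smith", "James", "Q", "Jim", "Jr.", "Dr."),
        ("Jones", "Mary", None, None, None, None),
    ],
)

# Synthesize a full name from the parts. In SQLite, concatenating
# NULL with || yields NULL, so COALESCE drops any missing piece
# together with its separating space.
FULL_NAME_SQL = """
    SELECT TRIM(
               COALESCE(first_name || ' ', '') ||
               COALESCE(middle_name || ' ', '') ||
               last_name ||
               COALESCE(' ' || suffix, '')
           ) AS full_name
    FROM customer
    ORDER BY customer_id
"""
full_names = [row[0] for row in conn.execute(FULL_NAME_SQL)]
print(full_names)  # ['James Q Smith Jr.', 'Mary Jones']
```

Note that the sketch leaves the title and nickname out of the synthesized name. Going the other direction, from a lone full name back to its parts, is the hard problem, which is why it gets its own discussion at the end of the chapter.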
Or, if you’re dealing with businesses, simply this:
- Company name. While we commonly tear human names apart into their constituent parts, entity names are rarely held in more than one field. No matter the structure of the entity (corporation, partnership, trust, whatever), we jam it into one field. Except that sometimes there is another field that looks like a company name.
- DBA. “Doing Business As,” often used by an individual who hasn’t established a more formal entity ...