Chapter 8

The Nuts and Bolts of Entity Resolution

Abstract

This chapter details the design considerations for the entity resolution and entity identity information management processes that support the CSRUD life cycle.

Keywords

Deterministic Matching; Probabilistic Matching; Attribute-based Cluster Projection; Record-based Cluster Projection; One-Pass Algorithm; R-Swoosh Algorithm

The ER Checklist

Even in its most basic form, entity resolution (ER) has many moving parts that must fit together correctly to produce accurate and consistent results. The functions and features assembled to support the different phases of the CSRUD life cycle are called EIIM configurations. The focus of this chapter is on the configurations ...
