Chapter 8

The Nuts and Bolts of Entity Resolution


This chapter details the design considerations for the entity resolution and entity identity information management (EIIM) processes that support the CSRUD life cycle.


Deterministic Matching; Probabilistic Matching; Attribute-based Cluster Projection; Record-based Cluster Projection; One-Pass Algorithm; R-Swoosh Algorithm

The ER Checklist

Even in its most basic form, entity resolution (ER) has many moving parts that must fit together correctly to produce accurate and consistent results. The functions and features assembled to support the different phases of the CSRUD life cycle are called EIIM configurations. The focus of this chapter is on the configurations ...

