Redacting names with named entity recognition

The challenge for this section is to replace all human names with [REDACTED] in free text.

Let's imagine that you are a new engineer at the European Bank Co. In preparation for the General Data Protection Regulation (GDPR), the bank is scrubbing the names of its customers from all of its old records and from internal communications such as emails and memos. They ask you to do this.

The first way you can do this is to look up the names of your customers and match each of them against all of your emails. This can be painfully slow and error-prone. For example, let's say the bank has a customer named John D'Souza. You might simply refer to him as DSouza in an email, so an exact match for D'Souza ...
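
To make the weakness of the lookup approach concrete, here is a minimal sketch of exact-match redaction using only Python's standard library. The customer list, the sample email, and the function name `redact_exact` are all hypothetical and chosen for illustration; a real system would load names from the bank's records.

```python
import re

# Hypothetical customer list; in practice this would come from the bank's records.
CUSTOMER_NAMES = ["John D'Souza", "D'Souza"]


def redact_exact(text, names):
    """Replace every exact occurrence of a known name with [REDACTED]."""
    # Try longer names first so a full name is matched before its surname alone.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub("[REDACTED]", text)


# Hypothetical email text containing one exact match and one spelling variant.
email = "Please call John D'Souza today. DSouza asked about his loan."
print(redact_exact(email, CUSTOMER_NAMES))
# prints: Please call [REDACTED] today. DSouza asked about his loan.
```

The exact spelling is caught, but the variant DSouza passes through untouched. Covering every variant means growing the name list by hand, which is the slow and error-prone part.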
