September 2013
Intermediate to advanced
114 pages
2h 51m
English
Reconciliation works well for fields in your dataset that contain single terms, such as names of people, countries, or works of art. However, if a column contains running text, reconciliation cannot help you, since it can only look up single terms in the datasets it uses. Fortunately, another technique, called named-entity extraction, can help. An extraction algorithm searches texts for named entities, which are text elements such as names of persons, locations, values, organizations, and other widely known things. In addition to extracting the terms, most algorithms also try to perform disambiguation. For instance, if the algorithm finds Washington in a text, it will try to determine ...
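To make the idea of extraction plus disambiguation concrete, here is a minimal Python sketch. It is not how any particular extraction service works: real algorithms are typically far more sophisticated. The `GAZETTEER` table, its cue words, and the `extract_entities` function are all hypothetical names invented for this illustration. The sketch looks for known terms in running text and, when a term has several candidate meanings, picks the one whose cue words overlap most with the surrounding text.

```python
import re

# Hypothetical toy gazetteer (an assumption for illustration only):
# surface form -> candidate entities, each with a few context cue words.
GAZETTEER = {
    "Washington": [
        {"id": "George Washington", "type": "person",
         "cues": {"president", "general", "born"}},
        {"id": "Washington, D.C.", "type": "location",
         "cues": {"city", "capital", "visited"}},
    ],
    "Paris": [
        {"id": "Paris, France", "type": "location",
         "cues": {"city", "france", "capital"}},
    ],
}


def extract_entities(text):
    """Return (surface form, chosen entity id, type) for each gazetteer term found."""
    # Lowercased words of the text, used as the disambiguation context.
    words = set(re.findall(r"[a-z]+", text.lower()))
    results = []
    for term, candidates in GAZETTEER.items():
        # Match the term as a whole word in the original (case-sensitive) text.
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            # Disambiguate: keep the candidate sharing the most cue words
            # with the text; on a tie, the first candidate listed wins.
            best = max(candidates, key=lambda c: len(c["cues"] & words))
            results.append((term, best["id"], best["type"]))
    return results


if __name__ == "__main__":
    print(extract_entities("Washington was the first president of the United States."))
    print(extract_entities("We visited the city of Washington last spring."))
```

The first sentence contains the cue word "president", so the sketch resolves Washington to the person; the second contains "city" and "visited", so it resolves to the location. Reconciliation, by contrast, would need the cell to contain only the term itself.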