December 2017
Beginner to intermediate
500 pages
12h 10m
English
In addition, suppose that we have the same data as before and want to create a list of the states that appear in our dataset. Among the values, we have Hawaii, Hawai, and Howaii. We don't want the three values on our final list. We only want a single state: Hawaii. If we try to deduplicate the data with the Unique rows step, we will still have three values. The only solution is trying to fix the values with a fuzzy search algorithm, and only after that doing the deduplication. This doesn't differ much from the previous solution:
Read now
Unlock full access