November 2017
Beginner to intermediate
366 pages
7h 59m
English
Let us begin with the feature generation R code:
library(RecordLinkage, quietly = TRUE)

###### Quick look at our data #########
data(RLdata500)
str(RLdata500)
head(RLdata500)

#### Feature generation #############
rec.pairs <- compare.dedup(RLdata500
                           ,blockfld = list(1, 5:7)
                           ,strcmp = c(2,3,4)
                           ,strcmpfun = levenshteinSim)
summary(rec.pairs)

matches <- rec.pairs$pairs
matches[c(1:3, 1203:1204), ]
RLdata500[1,]
RLdata500[174,]

# String features
rec.pairs.matches <- compare.dedup(RLdata500
                                   ,blockfld = list(1, 5:7)
                                   ,strcmp = c(2,3,4)
                                   ,strcmpfun = levenshteinSim)

# Not specifying the fields for string comparison
rec.pairs.matches <- compare.dedup(RLdata500
                                   ,blockfld = list(1, 5:7)
                                   ,strcmp = TRUE
                                   ,strcmpfun = levenshteinSim)
head(rec.pairs.matches$pairs)
...
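The string features produced by `strcmpfun = levenshteinSim` are similarity scores between 0 and 1. To give a feel for what such a score represents, here is a minimal base-R sketch. It assumes the usual normalization of one minus the edit distance divided by the length of the longer string. The helper name `lev_sim` is hypothetical and not part of RecordLinkage, and the example strings are made up for illustration.

```r
# Sketch of a normalized Levenshtein similarity using only base R.
# lev_sim is a hypothetical helper, not a RecordLinkage function.
lev_sim <- function(a, b) {
  d <- utils::adist(a, b)[1, 1]      # raw edit distance between the two strings
  1 - d / max(nchar(a), nchar(b))    # scale to [0, 1]; 1 means identical strings
}

lev_sim("MEIER", "MEYER")      # one substitution over five characters: 0.8
lev_sim("CARSTEN", "CARSTEN")  # identical strings: 1
```

A score near 1 suggests that two field values differ by only a typo or two, which is the kind of signal the comparison features carry for each record pair.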