November 2017
Beginner to intermediate
366 pages
7h 59m
English
Finally, there is the machine learning method for record linkage:
library(RecordLinkage, quietly = TRUE)
data("RLdata500")

# weight calculation
rec.pairs <- compare.dedup(RLdata500,
                           blockfld = list(1, 5:7),
                           strcmp = c(2, 3, 4),
                           strcmpfun = levenshteinSim)

# Unsupervised classification
kmeans.model <- classifyUnsup(rec.pairs, method = "kmeans")
summary(kmeans.model)
final.results <- kmeans.model$pairs
final.results$prediction <- kmeans.model$prediction
head(final.results)

# Supervised Learning 1
str(identity.RLdata500)
rec.pairs <- compare.dedup(RLdata500,
                           identity = identity.RLdata500,
                           blockfld = list(1, 5:7))
head(rec.pairs$pairs)
train <- getMinimalTrain(rec.pairs)
model <- trainSupv(train, method = "bagging")
train.pred <- classifySupv(model, ...
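
The weight calculation step passes levenshteinSim as the string comparator for columns 2 to 4. As a rough illustration of what such a comparator computes, here is a minimal base-R sketch. It assumes the similarity is defined as 1 minus the edit distance divided by the length of the longer string, which is how the RecordLinkage documentation describes levenshteinSim as far as I know. The helper name lev_sim is hypothetical and is not part of the package.

```r
# Hypothetical helper (not from RecordLinkage): normalized Levenshtein
# similarity, assuming the formula 1 - edit_distance / longer_string_length.
lev_sim <- function(a, b) {
  d <- utils::adist(a, b)[1, 1]        # Levenshtein edit distance (base R)
  longest <- max(nchar(a), nchar(b))   # length of the longer string
  if (longest == 0) return(1)          # treat two empty strings as identical
  1 - d / longest
}

lev_sim("MEIER", "MEYER")  # one substitution over five characters: 0.8
lev_sim("MEIER", "MEIER")  # identical strings: 1
```

Values close to 1 indicate near-identical strings, so a record pair whose names differ by a single typo still gets a high comparison weight instead of being treated as a complete mismatch.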