November 2017
Beginner to intermediate
366 pages
7h 59m
English
The Jaccard index measures the similarity between two sets as the ratio of the size of their intersection to the size of their union. Here we have only two elements, one for publisher and one for category, so the size of our union is 2. The numerator is the size of the intersection, which we get by adding the two Boolean variables.
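As a minimal sketch of this calculation, the snippet below computes the score for a few made-up search results. The data frame `results`, its column names, and the target values are assumptions for illustration and are not taken from the original code.

```r
# Hypothetical search results; names and values are illustrative assumptions.
results <- data.frame(
  publisher = c("Reuters", "Bloomberg", "Reuters"),
  category  = c("b", "b", "t"),
  stringsAsFactors = FALSE
)

# Hypothetical publisher and category of the article we searched for.
target.publisher <- "Reuters"
target.category  <- "b"

# One Boolean per element: TRUE when the result matches the search article.
same.publisher <- results$publisher == target.publisher
same.category  <- results$category == target.category

# Adding the Booleans gives the intersection size (0, 1 or 2);
# the union size is fixed at 2, as described above.
results$jaccard <- (same.publisher + same.category) / 2

print(results)
```

A result that matches on both publisher and category scores 1, a match on only one of them scores 0.5, and no match scores 0.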
Finally, we also calculate the absolute difference (Manhattan distance) in the polarity values between the articles in the search results and our search article. We do a min/max normalization of the difference score as follows:
match.refined$polaritydiff <- abs(target.polarity - match.refined$polarity$sentiment)
range01 <- function(x){(x-min(x))/(max(x)-min(x))}
match.refined$polaritydiff ...
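To illustrate the normalization step on its own, here is a small self-contained sketch. It reuses the `range01` helper shown above; the polarity values and the standalone `polarity` vector are invented for the example and stand in for the real search results.

```r
# Min/max normalization: rescales a numeric vector to the range [0, 1].
range01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical polarity of the search article and of four search results.
target.polarity <- 0.10
polarity <- c(0.10, -0.30, 0.50, 0.25)

# Absolute difference in polarity between each result and the search article.
polaritydiff <- abs(target.polarity - polarity)

# After scaling, the closest result gets 0 and the most distant gets 1.
polaritydiff.scaled <- range01(polaritydiff)

print(round(polaritydiff.scaled, 3))
```

One caveat with this helper: if every difference is identical, `max(x) - min(x)` is zero and the result is `NaN`, so a guard may be needed when the search returns very few articles.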