R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

How to do it...

Compute each similarity measure as follows:

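  1. First, calculate the Euclidean distance between two items; a smaller distance means the items are more similar. Because x1 and x2 are random draws, your value will differ from the one shown: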
> x1 <- rnorm(30) 
> x2 <- rnorm(30) 
> Euclidean_dist <- dist(rbind(x1, x2), method = "euclidean") 
> Euclidean_dist 
 
         x1 
x2 6.427449
  2. Next, calculate the cosine similarity between two vectors. Load the lsa package:
> library(lsa) 
> vector1 <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) 
> vector2 <- c(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0) 
> cosine(vector1, vector2) 
 
          [,1] 
[1,] 0.2357023 
 
  3. Finally, calculate the Pearson correlation between two variables using the mtcars dataset:
> mtcars_data <- read.csv("mtcars.csv") 
> rownames(mtcars_data) <- mtcars_data$X 
> mtcars_data$X <- NULL 
> coeff <- cor(mtcars_data, method="pearson") 
...
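
As a cross-check, the three measures in these steps can be written out from their textbook definitions in base R. This is a minimal sketch, not the book's code. The helper names euclidean_distance, cosine_similarity, and pearson_correlation are hypothetical, and the Pearson example assumes R's built-in mtcars data frame rather than the mtcars.csv file read in the recipe.

```r
# Hypothetical helpers written from the standard definitions (base R only).

# Euclidean distance: square root of the summed squared differences.
euclidean_distance <- function(a, b) sqrt(sum((a - b)^2))

# Cosine similarity: dot product divided by the product of the vector norms.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Pearson correlation: cosine similarity of the mean-centered vectors.
pearson_correlation <- function(a, b) {
  cosine_similarity(a - mean(a), b - mean(b))
}

vector1 <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
vector2 <- c(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0)

# Agrees with dist() from the stats package on the same pair of vectors.
all.equal(euclidean_distance(vector1, vector2),
          as.numeric(dist(rbind(vector1, vector2), method = "euclidean")))

# 1 / sqrt(18), i.e. the 0.2357023 reported by cosine() in step 2.
cosine_similarity(vector1, vector2)

# Assumption: the built-in mtcars data frame stands in for mtcars.csv.
all.equal(pearson_correlation(mtcars$mpg, mtcars$wt),
          cor(mtcars$mpg, mtcars$wt, method = "pearson"))
```

Writing Pearson correlation as the cosine similarity of centered vectors makes the relationship between steps 2 and 3 explicit: the two measures differ only in whether the means are subtracted first.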
