Computing the Pearson correlation score
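The Euclidean distance score is a good metric, but it has some shortcomings. For example, it is sensitive to differences in rating scale, so two users who rank items in the same order but rate consistently higher or lower can look dissimilar. Hence, the Pearson correlation score is frequently used in recommendation engines. Let's see how to compute it.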
How to do it…
- Create a new Python file, and import the following packages:
    import json
    import numpy as np
- We will define a function to compute the Pearson correlation score between two users in the database. Our first step is to confirm that these users exist in the database:
    # Returns the Pearson correlation score between user1 and user2
    def pearson_score(dataset, user1, user2):
        if user1 not in dataset:
            raise TypeError('User ' + user1 + ' not present in the dataset')

        if user2 not in dataset:
            raise TypeError('User ' + user2 + ' ...
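As a rough illustration of where a function like this is heading, here is a minimal sketch of a Pearson score computed over the items that both users have rated. It is not the book's listing. The name `pearson_score_sketch`, the `{user: {item: rating}}` dataset layout, the sample ratings, and the choice to return 0 when there is no overlap or no variance are all assumptions made for illustration. It also uses only the standard library rather than NumPy, so that it runs as-is.

```python
import math


def pearson_score_sketch(dataset, user1, user2):
    """Pearson correlation between two users over the items both have rated.

    Assumes (hypothetically) that dataset looks like {user: {item: rating}}.
    """
    # Same existence checks as in the recipe
    if user1 not in dataset:
        raise TypeError('User ' + user1 + ' not present in the dataset')
    if user2 not in dataset:
        raise TypeError('User ' + user2 + ' not present in the dataset')

    # Items rated by both users
    common = [item for item in dataset[user1] if item in dataset[user2]]
    n = len(common)

    # No overlap: return 0 (a choice made for this sketch)
    if n == 0:
        return 0.0

    x = [dataset[user1][item] for item in common]
    y = [dataset[user2][item] for item in common]

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and of cross-deviations
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    # Constant ratings give zero variance, so the correlation is undefined;
    # this sketch returns 0 in that case
    if sxx * syy == 0:
        return 0.0

    return sxy / math.sqrt(sxx * syy)


# Made-up ratings, purely for demonstration
ratings = {
    'alice': {'A': 1.0, 'B': 2.0, 'C': 3.0},
    'bob': {'A': 2.0, 'B': 4.0, 'C': 6.0, 'D': 1.0},
    'carol': {'A': 3.0, 'B': 2.0, 'C': 1.0},
}
print(pearson_score_sketch(ratings, 'alice', 'bob'))
print(pearson_score_sketch(ratings, 'alice', 'carol'))
```

In the sample data, bob rates every shared item exactly twice as high as alice, yet the two correlate perfectly, while carol's reversed ordering yields a perfectly negative score. This is the scale insensitivity that makes the Pearson score attractive for comparing users.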