I'm going to start by doing this the hard way. NumPy does have a method to just compute the covariance for you, and we'll talk about that later, but for now I want to show that you can actually do this from first principles:
%matplotlib inline
import numpy as np
from pylab import *

def de_mean(x):
    xmean = mean(x)
    return [xi - xmean for xi in x]

def covariance(x, y):
    n = len(x)
    return dot(de_mean(x), de_mean(y)) / (n-1)
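If you want to convince yourself that this works, here is a small self-contained sketch you can run outside the notebook. It repeats the same two functions with explicit NumPy calls instead of the pylab star import, and tries them on a tiny data set. The sample values for `x` and `y` are made up for illustration, and `np.cov` is called only as a cross-check on the hand-rolled result:

```python
import numpy as np

def de_mean(x):
    # Subtract the mean from every value, leaving the deviations from the mean
    xmean = np.mean(x)
    return [xi - xmean for xi in x]

def covariance(x, y):
    # Dot product of the two deviation vectors, divided by the sample size minus one
    n = len(x)
    return np.dot(de_mean(x), de_mean(y)) / (n - 1)

# Illustrative data only: y is exactly twice x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print(covariance(x, y))

# np.cov returns a 2x2 covariance matrix; the off-diagonal entry is cov(x, y)
print(np.cov(x, y)[0][1])
```

Working it by hand: the deviations are -2, -1, 0, 1, 2 for `x` and -4, -2, 0, 2, 4 for `y`, so the dot product is 20, and dividing by n-1 = 4 gives 5. Both print statements should agree on that value.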
Covariance, again, is defined as the dot product (which is a measure of the angle between two vectors) of two vectors: the deviations from the mean for one set of data, and the deviations from the mean for another set of data, taken over the same data points. We then divide that ...