# Appendix B. Mathematical Formulas

Throughout the book I have introduced a number of mathematical concepts. This appendix covers selected concepts and gives a description, relevant formulas, and code for each of them.

# Euclidean Distance

Euclidean distance finds the distance between two points in multidimensional space, which is the kind of distance you measure with a ruler. If the points are written as (p1, p2, p3, p4, ...) and (q1, q2, q3, q4, ...), then the formula for Euclidean distance can be expressed as shown in Figure B-1.

Figure B-1. Euclidean distance
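For reference, the standard definition of Euclidean distance can be written out as follows. The symbol n, the number of dimensions shared by both points, is notation assumed here and is not taken from the figure.

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```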

A clear implementation of this formula is shown here:

```
def euclidean(p,q):
  sumSq=0.0

  # add up the squared differences
  for i in range(len(p)):
    sumSq+=(p[i]-q[i])**2

  # take the square root
  return (sumSq**0.5)
```

Euclidean distance is used in several places in this book to determine how similar two items are.
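Because a smaller distance means more similar items, a distance is often converted into a score that grows with similarity. The sketch below shows one common way to do that, assuming the mapping 1 / (1 + d). The helper name `similarity` is hypothetical, and `euclidean` is redefined in compact form so the block runs on its own.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def similarity(p, q):
    """Hypothetical helper: map a distance to a score in (0, 1].

    Identical points score 1.0; the score falls toward 0 as the
    points move apart. The 1 / (1 + d) mapping is an assumption
    chosen for illustration.
    """
    return 1.0 / (1.0 + euclidean(p, q))


print(euclidean([0, 0], [3, 4]))   # 5.0 (a 3-4-5 right triangle)
print(similarity([1, 2], [1, 2]))  # 1.0 (identical points)
```

Adding 1 to the distance keeps the denominator from being zero when the two points coincide.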

# Pearson Correlation Coefficient

The Pearson correlation coefficient is a measure of how highly correlated two variables are. It is a value between 1 and −1, where 1 indicates that the variables are perfectly correlated, 0 indicates no correlation, and −1 means they are perfectly inversely correlated.

Figure B-2 shows the Pearson correlation coefficient.

Figure B-2. Pearson correlation coefficient
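One common computational form of the coefficient is written out below. X and Y denote the two variables, N the number of paired observations, and each sum runs over all pairs. This notation is assumed here, and the layout may differ from the figure.

```latex
r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}
         {\sqrt{\left(\sum X^2 - \dfrac{\left(\sum X\right)^2}{N}\right)
                \left(\sum Y^2 - \dfrac{\left(\sum Y\right)^2}{N}\right)}}
```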

This can be implemented with the following code: ...
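As a minimal sketch, and not the book's own listing, the coefficient can be computed from running sums of the two variables. The function name `pearson`, the input checks, and the choice to return 0.0 when either variable is constant are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and the same length")

    # Simple sums, sums of squares, and sum of products
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x_sq = sum(v * v for v in x)
    sum_y_sq = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))

    # Numerator: how the two variables vary together
    num = sum_xy - (sum_x * sum_y) / n

    # Denominator terms: spread of each variable on its own.
    # Clamp at zero to guard against tiny negative rounding errors.
    spread_x = max(sum_x_sq - (sum_x ** 2) / n, 0.0)
    spread_y = max(sum_y_sq - (sum_y ** 2) / n, 0.0)
    den = math.sqrt(spread_x * spread_y)

    # Assumption: treat a constant variable as "no correlation"
    if den == 0:
        return 0.0
    return num / den


print(pearson([1, 2, 3], [2, 4, 6]))  # 1.0 (perfectly correlated)
print(pearson([1, 2, 3], [3, 2, 1]))  # -1.0 (perfectly inversely correlated)
```

Strictly speaking, the coefficient is undefined when one variable never changes. Returning 0.0 in that case is a convenience, and raising an error would be an equally reasonable choice.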
