February 2019
Intermediate to advanced
386 pages
9h 54m
English
One of the most common ways to reduce the dimensionality of a dataset is based on the analysis of the sample covariance matrix. In general, we know that the information content of a random variable is proportional to its variance. For example, given a multivariate Gaussian, the entropy, which is the mathematical expression that we employ to measure the information, is as follows:

In the previous formula, Σ is the covariance matrix. If we assume (without loss of generality) that Σ is diagonal, it's easy to understand that the entropy is larger (proportionally) than the variance of each single component,