3

Principal Component Analysis

Principal Component Analysis (PCA) is a well-known algorithm for extracting features from high-dimensional data for use in machine learning (ML) models. In mathematical terms, the dimension of a space is the minimum number of coordinates required to specify a vector in that space. Computing the distance between two vectors in a high-dimensional space takes considerable computational power, and in such cases dimension is considered a curse. Adding dimensions improves the performance of an algorithm only up to a point; beyond that point, performance drops. This is the curse of dimensionality, shown in Figure 3.1, and it limits the efficiency of most ML algorithms. The variable columns or features in data ...
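As a rough illustration of feature extraction with PCA, the following minimal sketch reduces 10-dimensional data to 2 dimensions using only NumPy. It is not the book's own code: the helper name `pca_project`, the synthetic data, and the choice of two components are all assumptions made for this example.

```python
import numpy as np


def pca_project(X, n_components):
    """Hypothetical helper: project the rows of X onto the top principal components."""
    # Center each feature (column) at zero mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance along each direction, and the share kept by the top components.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    # Coordinates of each sample in the reduced space.
    return X_centered @ components.T, components, ratio


# Synthetic data (an assumption for illustration): 200 samples with 10 features
# that vary almost entirely along 2 hidden directions, plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

Z, components, ratio = pca_project(X, n_components=2)
print(Z.shape)      # shape of the reduced data
print(ratio.sum())  # fraction of total variance retained
```

Because this synthetic data was built from two latent directions, two components should retain nearly all of its variance; real data sets rarely compress this cleanly, so the number of components is usually chosen by inspecting the explained-variance ratio.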
