APPENDIX A Tech Corner Details
This Tech Corner is for readers who want more detail on some linear algebra techniques commonly used in text analysis to reduce document-by-term matrices, as well as in information retrieval and computer vision. If you are not interested in learning more about these techniques, feel free to give this appendix only a quick skim. It collects useful details about the linear algebra techniques we recalled throughout the book.
The three powerful techniques for matrix factorization and reduction we will discuss here are:
- Singular value decomposition (SVD)
- Principal component analysis (PCA)
- QR factorization
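As a quick preview of the third item in this list, principal component analysis can be carried out via the SVD of a centered data matrix. The sketch below, using NumPy, projects three hypothetical documents described by four term counts onto two principal components; the matrix values are made up purely for illustration.

```python
import numpy as np

# Rows = documents, columns = term counts (hypothetical data).
X = np.array([[2, 0, 1, 0],
              [0, 3, 0, 1],
              [1, 0, 2, 0]], dtype=float)

# Center each column (term); the SVD of the centered matrix then
# gives the principal directions in Vt and the scores in U * s.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Documents projected onto the first 2 principal components.
scores = U[:, :2] * s[:2]
print(np.round(scores, 3))
```

Centering before the SVD is what distinguishes this PCA sketch from a plain SVD of the raw counts.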
For each technique we will provide a description along with a step-by-step worked example showing how to apply it.
SVD and QR are both matrix decomposition techniques. Each can produce a reduced-rank approximation of the document matrix, which is what we need in order to identify the dependence between rows (documents) and columns (terms).
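The reduced-rank idea can be sketched in a few lines of NumPy. Below, a small document-by-term matrix (values invented for illustration) is factored with SVD, and only the largest singular values are kept to form a rank-2 approximation.

```python
import numpy as np

# Rows = documents, columns = terms (hypothetical counts).
A = np.array([[2, 0, 1, 0],
              [0, 3, 0, 1],
              [1, 0, 2, 0]], dtype=float)

# Full SVD: A = U @ diag(s) @ Vt, with s in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k largest singular values for a rank-k approximation.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.round(A_k, 2))
```

Truncating the SVD this way yields the best rank-k approximation of A in the least-squares sense, which is why it is the workhorse behind latent semantic analysis of document-by-term matrices.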
To navigate the technical details of this section easily, readers need basic knowledge of linear algebra and matrix computations, including:
- Points in space
- Scalars
- Vectors
- Matrices
- Matrix reduction, matrix decomposition
- Matrix diagonalization
- Inverse matrices
- Characteristic equations
- Eigen-decomposition, eigenvalues, and eigenvectors
- Orthogonal and orthonormal vectors
- Gram–Schmidt orthonormalization ...
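Two items from this list can be refreshed in a few lines of NumPy: eigen-decomposition of a symmetric matrix, and the orthonormal vectors produced by Gram–Schmidt-style QR factorization. The matrices below are arbitrary examples, not taken from the book.

```python
import numpy as np

# Eigen-decomposition of a symmetric matrix: S = Q @ diag(w) @ Q.T,
# where w holds the eigenvalues and the columns of Q the eigenvectors.
S = np.array([[4.0, 1.0],
              [1.0, 3.0]])
w, Q = np.linalg.eigh(S)

# The eigenvector matrix Q is orthonormal: Q.T @ Q = I.
print(np.round(Q.T @ Q, 6))

# QR factorization orthonormalizes the columns of A (the same goal
# as Gram-Schmidt): A = Q2 @ R, Q2 orthonormal, R upper triangular.
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q2, R = np.linalg.qr(A)
print(np.round(Q2.T @ Q2, 6))
```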