Practical Linear Algebra for Data Science

Name: Practical Linear Algebra for Data Science
Author: Mike X Cohen
ISBN: 9781098120610

September 2022

Beginner to intermediate

326 pages

9h 33m

English

O'Reilly Media, Inc.

Preface
Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
1. Introduction
What Is Linear Algebra and Why Learn It?; About This Book; Prerequisites; Math; Attitude; Coding; Mathematical Proofs Versus Intuition from Coding; Code, Printed in the Book and Downloadable Online; Code Exercises; How to Use This Book (for Teachers and Self Learners)
2. Vectors, Part 1
Creating and Visualizing Vectors in NumPy; Geometry of Vectors; Operations on Vectors; Adding Two Vectors; Geometry of Vector Addition and Subtraction; Vector-Scalar Multiplication; Scalar-Vector Addition; Transpose; Vector Broadcasting in Python; Vector Magnitude and Unit Vectors; The Vector Dot Product; The Dot Product Is Distributive; Geometry of the Dot Product; Other Vector Multiplications; Hadamard Multiplication; Outer Product; Cross and Triple Products; Orthogonal Vector Decomposition; Summary; Code Exercises
3. Vectors, Part 2
Vector Sets; Linear Weighted Combination; Linear Independence; The Math of Linear Independence; Independence and the Zeros Vector; Subspace and Span; Basis; Definition of Basis; Summary; Code Exercises
4. Vector Applications
Correlation and Cosine Similarity; Time Series Filtering and Feature Detection; k-Means Clustering; Code Exercises; Correlation Exercises; Filtering and Feature Detection Exercises; k-Means Exercises
5. Matrices, Part 1
Creating and Visualizing Matrices in NumPy; Visualizing, Indexing, and Slicing Matrices; Special Matrices; Matrix Math: Addition, Scalar Multiplication, Hadamard Multiplication; Addition and Subtraction; “Shifting” a Matrix; Scalar and Hadamard Multiplications; Standard Matrix Multiplication; Rules for Matrix Multiplication Validity; Matrix Multiplication; Matrix-Vector Multiplication; Matrix Operations: Transpose; Dot and Outer Product Notation; Matrix Operations: LIVE EVIL (Order of Operations); Symmetric Matrices; Creating Symmetric Matrices from Nonsymmetric Matrices; Summary; Code Exercises
6. Matrices, Part 2
Matrix Norms; Matrix Trace and Frobenius Norm; Matrix Spaces (Column, Row, Nulls); Column Space; Row Space; Null Spaces; Rank; Ranks of Special Matrices; Rank of Added and Multiplied Matrices; Rank of Shifted Matrices; Theory and Practice; Rank Applications; In the Column Space?; Linear Independence of a Vector Set; Determinant; Computing the Determinant; Determinant with Linear Dependencies; The Characteristic Polynomial; Summary; Code Exercises
7. Matrix Applications
Multivariate Data Covariance Matrices; Geometric Transformations via Matrix-Vector Multiplication; Image Feature Detection; Summary; Code Exercises; Covariance and Correlation Matrices Exercises; Geometric Transformations Exercises; Image Feature Detection Exercises
8. Matrix Inverse
The Matrix Inverse; Types of Inverses and Conditions for Invertibility; Computing the Inverse; Inverse of a 2 × 2 Matrix; Inverse of a Diagonal Matrix; Inverting Any Square Full-Rank Matrix; One-Sided Inverses; The Inverse Is Unique; Moore-Penrose Pseudoinverse; Numerical Stability of the Inverse; Geometric Interpretation of the Inverse; Summary; Code Exercises
9. Orthogonal Matrices and QR Decomposition
Orthogonal Matrices; Gram-Schmidt; QR Decomposition; Sizes of Q and R; QR and Inverses; Summary; Code Exercises
10. Row Reduction and LU Decomposition
Systems of Equations; Converting Equations into Matrices; Working with Matrix Equations; Row Reduction; Gaussian Elimination; Gauss-Jordan Elimination; Matrix Inverse via Gauss-Jordan Elimination; LU Decomposition; Row Swaps via Permutation Matrices; Summary; Code Exercises
11. General Linear Models and Least Squares
General Linear Models; Terminology; Setting Up a General Linear Model; Solving GLMs; Is the Solution Exact?; A Geometric Perspective on Least Squares; Why Does Least Squares Work?; GLM in a Simple Example; Least Squares via QR; Summary; Code Exercises
12. Least Squares Applications
Predicting Bike Rentals Based on Weather; Regression Table Using statsmodels; Multicollinearity; Regularization; Polynomial Regression; Grid Search to Find Model Parameters; Summary; Code Exercises; Bike Rental Exercises; Multicollinearity Exercise; Regularization Exercise; Polynomial Regression Exercise; Grid Search Exercises
13. Eigendecomposition
Interpretations of Eigenvalues and Eigenvectors; Geometry; Statistics (Principal Components Analysis); Noise Reduction; Dimension Reduction (Data Compression); Finding Eigenvalues; Finding Eigenvectors; Sign and Scale Indeterminacy of Eigenvectors; Diagonalizing a Square Matrix; The Special Awesomeness of Symmetric Matrices; Orthogonal Eigenvectors; Real-Valued Eigenvalues; Eigendecomposition of Singular Matrices; Quadratic Form, Definiteness, and Eigenvalues; The Quadratic Form of a Matrix; Definiteness; 𝐀ᵀ𝐀 Is Positive (Semi)definite; Generalized Eigendecomposition; Summary; Code Exercises
14. Singular Value Decomposition
The Big Picture of the SVD; Singular Values and Matrix Rank; SVD in Python; SVD and Rank-1 “Layers” of a Matrix; SVD from EIG; SVD of 𝐀ᵀ𝐀; Converting Singular Values to Variance, Explained; Condition Number; SVD and the MP Pseudoinverse; Summary; Code Exercises
15. Eigendecomposition and SVD Applications
PCA Using Eigendecomposition and SVD; The Math of PCA; The Steps to Perform a PCA; PCA via SVD; Linear Discriminant Analysis; Low-Rank Approximations via SVD; SVD for Denoising; Summary; Exercises; PCA; Linear Discriminant Analyses; SVD for Low-Rank Approximations; SVD for Image Denoising
16. Python Tutorial
Why Python, and What Are the Alternatives?; IDEs (Interactive Development Environments); Using Python Locally and Online; Working with Code Files in Google Colab; Variables; Data Types; Indexing; Functions; Methods as Functions; Writing Your Own Functions; Libraries; NumPy; Indexing and Slicing in NumPy; Visualization; Translating Formulas to Code; Print Formatting and F-Strings; Control Flow; Comparators; If Statements; For Loops; Nested Control Statements; Measuring Computation Time; Getting Help and Learning More; What to Do When Things Go Awry; Summary
Index
About the Author

Content preview from Practical Linear Algebra for Data Science

Chapter 5. Matrices, Part 1

A matrix is a vector taken to the next level. Matrices are highly versatile mathematical objects. They can store sets of equations, geometric transformations, the positions of particles over time, financial records, and myriad other things. In data science, matrices are sometimes called data tables, in which rows correspond to observations (e.g., customers) and columns correspond to features (e.g., purchases).

This and the following two chapters will take your knowledge about linear algebra to the next level. Get a cup of coffee and put on your thinking cap. Your brain will be bigger by the end of the chapter.

Creating and Visualizing Matrices in NumPy

Depending on the context, matrices can be conceptualized as a set of column vectors stacked next to each other (e.g., a data table of observations-by-features), as a set of row vectors layered on top of each other (e.g., multisensor data in which each row is a time series from a different channel), or as an ordered collection of individual matrix elements (e.g., an image in which each matrix element encodes pixel intensity value).
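
These three views can be sketched in NumPy. The values and variable names below (`v1`, `from_columns`, `from_rows`, `element`) are assumptions chosen for illustration and do not come from the book; the same three vectors are arranged first as columns, then as rows, and then a single element is picked out by its position.

```python
import numpy as np

# Hypothetical data: three vectors with four entries each
v1 = np.array([1, 2, 3, 4])
v2 = np.array([10, 20, 30, 40])
v3 = np.array([100, 200, 300, 400])

# View 1: column vectors stacked next to each other -> 4x3 matrix
from_columns = np.column_stack((v1, v2, v3))

# View 2: row vectors layered on top of each other -> 3x4 matrix
from_rows = np.vstack((v1, v2, v3))

# View 3: an ordered collection of elements, addressed by (row, column)
element = from_columns[1, 2]  # second row, third column

print(from_columns.shape)  # (4, 3)
print(from_rows.shape)     # (3, 4)
print(element)             # 200
```

In this sketch the two arrangements are transposes of each other, so which one fits depends on whether the vectors are treated as features (columns) or as channels (rows).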

Visualizing, Indexing, and Slicing Matrices

Small matrices can simply be printed out in full, like the following examples:

\[
\begin{bmatrix} 1 & 2 \\ \pi & 4 \\ 6 & 7 \end{bmatrix}, \quad
\begin{bmatrix} -6 & 1/3 \\ e^{4.3} & -1.4 \\ 6/5 & 0 \end{bmatrix}
\]
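
As a minimal sketch, the two matrices above can be created and printed in NumPy, and individual elements or slices can then be extracted. The variable names (`A`, `B`, `first_column`, and so on) are assumptions for illustration, not taken from the book.

```python
import numpy as np

# The two 3x2 matrices shown above
A = np.array([[1, 2],
              [np.pi, 4],
              [6, 7]])
B = np.array([[-6, 1/3],
              [np.exp(4.3), -1.4],
              [6/5, 0]])

# Small matrices can simply be printed in full
print(A)
print(B)

# Indexing is zero-based: A[row, column]
second_row_first_col = A[1, 0]  # the element pi

# Slicing: a colon selects a range (or all) of the rows or columns
first_column = A[:, 0]   # all rows, first column
top_two_rows = B[:2, :]  # first two rows, all columns

print(second_row_first_col)
print(first_column)
print(top_two_rows)
```

Note that NumPy stores `A` as a floating-point array because one of its entries is `np.pi`, so the integers print as `1.`, `2.`, and so on.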

But that’s not scalable, and matrices that you work with in practice can be large, perhaps containing billions of elements. Therefore, larger matrices ...


Publisher Resources

ISBN: 9781098120603
