Book description

Companies are scrambling to integrate AI into their systems and operations. But to build truly successful solutions, you need a firm grasp of the underlying mathematics. This accessible guide walks you through the math necessary to thrive in the AI field, focusing on real-world applications rather than dense academic theory.

Engineers, data scientists, and students alike will examine mathematical topics critical for AI--including regression, neural networks, optimization, backpropagation, convolution, Markov chains, and more--through popular applications such as computer vision, natural language processing, and automated systems. Supplementary Jupyter notebooks illuminate the examples with Python code and visualizations. Whether you're just beginning your career or have years of experience, this book gives you the foundation you need to dive deeper into the field.

• Understand the underlying mathematics powering AI systems, including generative adversarial networks, random graphs, large random matrices, mathematical logic, optimal control, and more
• Learn how to adapt mathematical methods to applications from completely different fields
• Gain the mathematical fluency to interpret and explain how AI systems arrive at their decisions

1. Preface
2. 1. Why Learn the Mathematics of AI?
3. 2. Data, Data, Data
1. Data for AI
2. Real Data Versus Simulated Data
3. Mathematical Models: Linear Versus Nonlinear
4. An Example of Real Data
5. An Example of Simulated Data
6. Mathematical Models: Simulations and AI
7. Where Do We Get Our Data From?
8. The Vocabulary of Data Distributions, Probability, and Statistics
9. Continuous Distributions Versus Discrete Distributions (Density Versus Mass)
10. The Power of the Joint Probability Density Function
11. Distribution of Data: The Uniform Distribution
12. Distribution of Data: The Bell-Shaped Normal (Gaussian) Distribution
13. Distribution of Data: Other Important and Commonly Used Distributions
14. The Various Uses of the Word “Distribution”
15. A/B Testing
4. 3. Fitting Functions to Data
1. Traditional and Very Useful Machine Learning Models
2. Numerical Solutions Versus Analytical Solutions
3. Regression: Predict a Numerical Value
4. Logistic Regression: Classify into Two Classes
5. Softmax Regression: Classify into Multiple Classes
6. Incorporating These Models into the Last Layer of a Neural Network
7. Other Popular Machine Learning Techniques and Ensembles of Techniques
8. Performance Measures for Classification Models
5. 4. Optimization for Neural Networks
1. The Brain Cortex and Artificial Neural Networks
2. Training Function: Fully Connected, or Dense, Feed Forward Neural Networks
3. Loss Functions
4. Optimization
5. Regularization Techniques
6. Hyperparameter Examples That Appear in Machine Learning
7. Chain Rule and Backpropagation: Calculating ∇L(ω⃗ᵢ)
8. Assessing the Significance of the Input Data Features
6. 5. Convolutional Neural Networks and Computer Vision
1. Convolution and Cross-Correlation
2. Convolution from a Systems Design Perspective
3. Convolution and One-Dimensional Discrete Signals
4. Convolution and Two-Dimensional Discrete Signals
5. Linear Algebra Notation
6. Pooling
7. A Convolutional Neural Network for Image Classification
7. 6. Singular Value Decomposition: Image Processing, Natural Language Processing, and Social Media
1. Matrix Factorization
2. Diagonal Matrices
3. Matrices as Linear Transformations Acting on Space
4. Three Ways to Multiply Matrices
5. The Big Picture
6. The Ingredients of the Singular Value Decomposition
7. Singular Value Decomposition Versus the Eigenvalue Decomposition
8. Computation of the Singular Value Decomposition
9. The Pseudoinverse
10. Applying the Singular Value Decomposition to Images
11. Principal Component Analysis and Dimension Reduction
12. Principal Component Analysis and Clustering
13. A Social Media Application
14. Latent Semantic Analysis
15. Randomized Singular Value Decomposition
8. 7. Natural Language and Finance AI: Vectorization and Time Series
1. Natural Language AI
2. Preparing Natural Language Data for Machine Processing
3. Statistical Models and the log Function
4. Zipf’s Law for Term Counts
5. Various Vector Representations for Natural Language Documents
6. Cosine Similarity
7. Natural Language Processing Applications
8. Transformers and Attention Models
9. Convolutional Neural Networks for Time Series Data
10. Recurrent Neural Networks for Time Series Data
11. An Example of Natural Language Data
12. Finance AI
9. 8. Probabilistic Generative Models
1. What Are Generative Models Useful For?
2. The Typical Mathematics of Generative Models
3. Shifting Our Brain from Deterministic Thinking to Probabilistic Thinking
4. Maximum Likelihood Estimation
5. Explicit and Implicit Density Models
6. Explicit Density-Tractable: Fully Visible Belief Networks
7. Explicit Density-Tractable: Change of Variables Nonlinear Independent Component Analysis
8. Explicit Density-Intractable: Variational Autoencoders Approximation via Variational Methods
9. Explicit Density-Intractable: Boltzmann Machine Approximation via Markov Chain
10. Implicit Density-Markov Chain: Generative Stochastic Network
11. Implicit Density-Direct: Generative Adversarial Networks
12. Example: Machine Learning and Generative Networks for High Energy Physics
13. Other Generative Models
14. The Evolution of Generative Models
15. Probabilistic Language Modeling
10. 9. Graph Models
1. Graphs: Nodes, Edges, and Features for Each
2. Example: PageRank Algorithm
3. Inverting Matrices Using Graphs
4. Cayley Graphs of Groups: Pure Algebra and Parallel Computing
5. Message Passing Within a Graph
6. The Limitless Applications of Graphs
7. Random Walks on Graphs
8. Node Representation Learning
9. Tasks for Graph Neural Networks
10. Dynamic Graph Models
11. Bayesian Networks
12. Graph Diagrams for Probabilistic Causal Modeling
13. A Brief History of Graph Theory
14. Main Considerations in Graph Theory
15. Algorithms and Computational Aspects of Graphs
11. 10. Operations Research
1. No Free Lunch
2. Complexity Analysis and O() Notation
3. Optimization: The Heart of Operations Research
5. Optimization on Networks
6. The n-Queens Problem
7. Linear Optimization
8. Game Theory and Multiagents
9. Queuing
10. Inventory
11. Machine Learning for Operations Research
12. Hamilton-Jacobi-Bellman Equation
13. Operations Research for AI
12. 11. Probability
1. Where Did Probability Appear in This Book?
2. What More Do We Need to Know That Is Essential for AI?
3. Causal Modeling and the Do Calculus
5. Large Random Matrices
6. Stochastic Processes
7. Markov Decision Processes and Reinforcement Learning
8. Theoretical and Rigorous Grounds
13. 12. Mathematical Logic
1. Various Logic Frameworks
2. Propositional Logic
3. First-Order Logic
4. Probabilistic Logic
5. Fuzzy Logic
6. Temporal Logic
7. Comparison with Human Natural Language
8. Machines and Complex Mathematical Reasoning
14. 13. Artificial Intelligence and Partial Differential Equations
1. What Is a Partial Differential Equation?
2. Modeling with Differential Equations
3. Numerical Solutions Are Very Valuable
4. Some Statistical Mechanics: The Wonderful Master Equation
5. Solutions as Expectations of Underlying Random Processes
6. Transforming the PDE
7. Solution Operators
8. AI for PDEs
9. Hamilton-Jacobi-Bellman PDE for Dynamic Programming
10. PDEs for AI?
11. Other Considerations in Partial Differential Equations
15. 14. Artificial Intelligence, Ethics, Mathematics, Law, and Policy
1. Good AI
2. Policy Matters
3. What Could Go Wrong?
4. How to Fix It?
5. Distinguishing Bias from Discrimination
6. The Hype
7. Final Thoughts
16. Index