Book description
Companies are scrambling to integrate AI into their systems and operations. But to build truly successful solutions, you need a firm grasp of the underlying mathematics. This accessible guide walks you through the math necessary to thrive in the AI field, focusing on real-world applications rather than dense academic theory.
Engineers, data scientists, and students alike will examine mathematical topics critical for AI, including regression, neural networks, optimization, backpropagation, convolution, Markov chains, and more, through popular applications such as computer vision, natural language processing, and automated systems. Supplementary Jupyter notebooks shed light on examples with Python code and visualizations. Whether you're just beginning your career or have years of experience, this book gives you the foundation necessary to dive deeper in the field.
 Understand the underlying mathematics powering AI systems, including generative adversarial networks, random graphs, large random matrices, mathematical logic, optimal control, and more
 Learn how to adapt mathematical methods to different applications from completely different fields
 Gain the mathematical fluency to interpret and explain how AI systems arrive at their decisions
Table of contents

Preface
 Why I Wrote This Book
 Who Is This Book For?
 Who Is This Book Not For?
 How Will the Math Be Presented in This Book?
 Infographic
 What Math Background Is Expected from You to Be Able to Read This Book?
 Overview of the Chapters
 My Favorite Books on AI
 Conventions Used in This Book
 Using Code Examples
 O’Reilly Online Learning
 How to Contact Us
 Acknowledgments

1. Why Learn the Mathematics of AI?

2. Data, Data, Data
 Data for AI
 Real Data Versus Simulated Data
 Mathematical Models: Linear Versus Nonlinear
 An Example of Real Data
 An Example of Simulated Data
 Mathematical Models: Simulations and AI
 Where Do We Get Our Data From?

The Vocabulary of Data Distributions, Probability, and Statistics
 Random Variables
 Probability Distributions
 Marginal Probabilities
 The Uniform and the Normal Distributions
 Conditional Probabilities and Bayes’ Theorem
 Conditional Probabilities and Joint Distributions
 Prior Distribution, Posterior Distribution, and Likelihood Function
 Mixtures of Distributions
 Sums and Products of Random Variables
 Using Graphs to Represent Joint Probability Distributions
 Expectation, Mean, Variance, and Uncertainty
 Covariance and Correlation
 Markov Process
 Normalizing, Scaling, and/or Standardizing a Random Variable or Data Set
 Common Examples
 Continuous Distributions Versus Discrete Distributions (Density Versus Mass)
 The Power of the Joint Probability Density Function
 Distribution of Data: The Uniform Distribution
 Distribution of Data: The Bell-Shaped Normal (Gaussian) Distribution
 Distribution of Data: Other Important and Commonly Used Distributions
 The Various Uses of the Word “Distribution”
 A/B Testing
 Summary and Looking Ahead

3. Fitting Functions to Data
 Traditional and Very Useful Machine Learning Models
 Numerical Solutions Versus Analytical Solutions
 Regression: Predict a Numerical Value
 Logistic Regression: Classify into Two Classes
 Softmax Regression: Classify into Multiple Classes
 Incorporating These Models into the Last Layer of a Neural Network
 Other Popular Machine Learning Techniques and Ensembles of Techniques
 Performance Measures for Classification Models
 Summary and Looking Ahead

4. Optimization for Neural Networks
 The Brain Cortex and Artificial Neural Networks
 Training Function: Fully Connected, or Dense, Feed Forward Neural Networks
 Loss Functions
 Optimization
 Regularization Techniques
 Hyperparameter Examples That Appear in Machine Learning
 Chain Rule and Backpropagation: Calculating $\nabla L(\vec{\omega}^{i})$
 Assessing the Significance of the Input Data Features
 Summary and Looking Ahead

5. Convolutional Neural Networks and Computer Vision

6. Singular Value Decomposition: Image Processing, Natural Language Processing, and Social Media
 Matrix Factorization
 Diagonal Matrices

Matrices as Linear Transformations Acting on Space
 Action of A on the Right Singular Vectors
 Action of A on the Standard Unit Vectors and the Unit Square Determined by Them
 Action of A on the Unit Circle
 Breaking Down the Circle-to-Ellipse Transformation According to the Singular Value Decomposition
 Rotation and Reflection Matrices
 Action of A on a General Vector $\vec{x}$
 Three Ways to Multiply Matrices
 The Big Picture
 The Ingredients of the Singular Value Decomposition
 Singular Value Decomposition Versus the Eigenvalue Decomposition
 Computation of the Singular Value Decomposition
 The Pseudoinverse
 Applying the Singular Value Decomposition to Images
 Principal Component Analysis and Dimension Reduction
 Principal Component Analysis and Clustering
 A Social Media Application
 Latent Semantic Analysis
 Randomized Singular Value Decomposition
 Summary and Looking Ahead

7. Natural Language and Finance AI: Vectorization and Time Series
 Natural Language AI
 Preparing Natural Language Data for Machine Processing
 Statistical Models and the log Function
 Zipf’s Law for Term Counts

Various Vector Representations for Natural Language Documents
 Term Frequency Vector Representation of a Document or Bag of Words
 Term Frequency-Inverse Document Frequency Vector Representation of a Document
 Topic Vector Representation of a Document Determined by Latent Semantic Analysis
 Topic Vector Representation of a Document Determined by Latent Dirichlet Allocation
 Topic Vector Representation of a Document Determined by Latent Discriminant Analysis
 Meaning Vector Representations of Words and of Documents Determined by Neural Network Embeddings
 Cosine Similarity
 Natural Language Processing Applications
 Transformers and Attention Models
 Convolutional Neural Networks for Time Series Data
 Recurrent Neural Networks for Time Series Data
 An Example of Natural Language Data
 Finance AI
 Summary and Looking Ahead

8. Probabilistic Generative Models
 What Are Generative Models Useful For?
 The Typical Mathematics of Generative Models
 Shifting Our Brain from Deterministic Thinking to Probabilistic Thinking
 Maximum Likelihood Estimation
 Explicit and Implicit Density Models
 Explicit Density-Tractable: Fully Visible Belief Networks
 Explicit Density-Tractable: Change of Variables Nonlinear Independent Component Analysis
 Explicit Density-Intractable: Variational Autoencoders Approximation via Variational Methods
 Explicit Density-Intractable: Boltzmann Machine Approximation via Markov Chain
 Implicit Density-Markov Chain: Generative Stochastic Network
 Implicit Density-Direct: Generative Adversarial Networks
 Example: Machine Learning and Generative Networks for High Energy Physics
 Other Generative Models
 The Evolution of Generative Models
 Probabilistic Language Modeling
 Summary and Looking Ahead

9. Graph Models
 Graphs: Nodes, Edges, and Features for Each
 Example: PageRank Algorithm
 Inverting Matrices Using Graphs
 Cayley Graphs of Groups: Pure Algebra and Parallel Computing
 Message Passing Within a Graph

The Limitless Applications of Graphs
 Brain Networks
 Spread of Disease
 Spread of Information
 Detecting and Tracking Fake News Propagation
 Web-Scale Recommendation Systems
 Fighting Cancer
 Biochemical Graphs
 Molecular Graph Generation for Drug and Protein Structure Discovery
 Citation Networks
 Social Media Networks and Social Influence Prediction
 Sociological Structures
 Bayesian Networks
 Traffic Forecasting
 Logistics and Operations Research
 Language Models
 Graph Structure of the Web
 Automatically Analyzing Computer Programs
 Data Structures in Computer Science
 Load Balancing in Distributed Networks
 Artificial Neural Networks
 Random Walks on Graphs
 Node Representation Learning
 Tasks for Graph Neural Networks
 Dynamic Graph Models

Bayesian Networks
 A Bayesian Network Represents a Compactified Conditional Probability Table
 Making Predictions Using a Bayesian Network
 Bayesian Networks Are Belief Networks, Not Causal Networks
 Keep This in Mind About Bayesian Networks
 Chains, Forks, and Colliders
 Given a Data Set, How Do We Set Up a Bayesian Network for the Involved Variables?
 Graph Diagrams for Probabilistic Causal Modeling
 A Brief History of Graph Theory
 Main Considerations in Graph Theory
 Algorithms and Computational Aspects of Graphs
 Summary and Looking Ahead

10. Operations Research
 No Free Lunch
 Complexity Analysis and O() Notation
 Optimization: The Heart of Operations Research
 Thinking About Optimization
 Optimization on Networks
 The n-Queens Problem
 Linear Optimization
 Game Theory and Multiagents
 Queuing
 Inventory
 Machine Learning for Operations Research
 Hamilton-Jacobi-Bellman Equation
 Operations Research for AI
 Summary and Looking Ahead

11. Probability
 Where Did Probability Appear in This Book?
 What More Do We Need to Know That Is Essential for AI?
 Causal Modeling and the Do Calculus
 Paradoxes and Diagram Interpretations
 Large Random Matrices
 Stochastic Processes
 Markov Decision Processes and Reinforcement Learning

Theoretical and Rigorous Grounds
 Which Events Have a Probability?
 Can We Talk About a Wider Range of Random Variables?
 A Probability Triple (Sample Space, Sigma Algebra, Probability Measure)
 Where Is the Difficulty?
 Random Variable, Expectation, and Integration
 Distribution of a Random Variable and the Change of Variable Theorem
 Next Steps in Rigorous Probability Theory
 The Universality Theorem for Neural Networks
 Summary and Looking Ahead

12. Mathematical Logic

13. Artificial Intelligence and Partial Differential Equations
 What Is a Partial Differential Equation?
 Modeling with Differential Equations
 Numerical Solutions Are Very Valuable
 Some Statistical Mechanics: The Wonderful Master Equation
 Solutions as Expectations of Underlying Random Processes
 Transforming the PDE
 Solution Operators
 AI for PDEs
 Hamilton-Jacobi-Bellman PDE for Dynamic Programming
 PDEs for AI?
 Other Considerations in Partial Differential Equations
 Summary and Looking Ahead

14. Artificial Intelligence, Ethics, Mathematics, Law, and Policy

Index
About the Author
Product information
 Title: Essential Math for AI
 Author(s):
 Release date: January 2023
 Publisher(s): O'Reilly Media, Inc.
 ISBN: 9781098107635