Essential Math for AI

Book description

Companies are scrambling to integrate AI into their systems and operations. But to build truly successful solutions, you need a firm grasp of the underlying mathematics. This accessible guide walks you through the math necessary to thrive in the AI field, focusing on real-world applications rather than dense academic theory.

Engineers, data scientists, and students alike will examine mathematical topics critical for AI--including regression, neural networks, optimization, backpropagation, convolution, Markov chains, and more--through popular applications such as computer vision, natural language processing, and automated systems. And supplementary Jupyter notebooks shed light on examples with Python code and visualizations. Whether you're just beginning your career or have years of experience, this book gives you the foundation necessary to dive deeper in the field.

  • Understand the underlying mathematics powering AI systems, including generative adversarial networks, random graphs, large random matrices, mathematical logic, optimal control, and more
  • Learn how to adapt mathematical methods to different applications from completely different fields
  • Gain the mathematical fluency to interpret and explain how AI systems arrive at their decisions

Table of contents

  1. Why Learn the Mathematics of AI
    1. What Is AI?
    2. Why Is AI so Popular Now?
    3. What Is AI Able To Do?
      1. An AI Agent’s Specific Tasks
    4. What Are AI’s Limitations?
    5. What Happens When AI Systems Fail?
    6. Where Is AI Headed?
    7. Who Are Currently The Main Contributors To The AI Field?
    8. What Math Is Typically Involved In AI?
    9. Summary And Looking Ahead
  2. Data, Data, Data
    1. Data for AI
    2. Real Data vs. Simulated Data
    3. Mathematical Models: Linear vs. Nonlinear
    4. An Example of Real Data
    5. An Example of Simulated Data
    6. Mathematical Models: Simulations and AI
    7. Where Do We Get our Data From?
    8. The Vocabulary of Data Distributions, Probability, and Statistics
    9. Continuous Distributions vs. Discrete Distributions (Density vs. Mass)
    10. The Power Of The Joint Probability Density Function
    11. Distribution of Data: The Uniform Distribution
    12. Distribution of Data: The Bell Shaped Normal (Gaussian) Distribution
    13. Distribution of Data: Other Important and Commonly Used Distributions
    14. The Various Uses Of The Word Distribution
    15. Summary And Looking Ahead
  3. Fitting Functions to Data
    1. Traditional And Very Useful Machine Learning Models
    2. Numerical Solutions vs. Analytical Solutions
    3. Regression: Predict A Numerical Value
      1. Training Function
      2. Loss Function
      3. Optimization
    4. Logistic Regression: Classify Into Two Classes
      1. Training Function
      2. Loss Function
      3. Optimization
    5. Softmax Regression: Classify Into Multiple Classes
      1. Training Function
      2. Loss Function
      3. Optimization
    6. Incorporating The Above Models Into The Last Layer Of A Neural Network
    7. Other Popular Machine Learning Techniques and Ensembles of Techniques
      1. Support Vector Machines
      2. Decision Trees
      3. Random Forests
      4. k-means Clustering
    8. Performance Measures For Classification Models
    9. Summary and Looking Ahead
  4. Optimization For Neural Networks
    1. The Brain Cortex And Artificial Neural Networks
    2. Training Function: Fully Connected, Or Dense, Feed Forward Neural Networks
      1. A Neural Network Is A Computational Graph Representation Of The Training Function
      2. Linearly Combine, Add Bias, Then Activate
      3. Common Activation Functions
      4. Universal Function Approximation
      5. Approximation Theory For Deep Learning
    3. Loss Functions
    4. Optimization
      1. Mathematics And The Mysterious Success of Neural Networks
      2. Gradient Descent $\vec{\omega}^{i+1} = \vec{\omega}^{i} - \eta \nabla L(\vec{\omega}^{i})$
      3. Explaining The Role Of The Learning Rate Hyperparameter η
      4. Convex vs. Non-Convex Landscapes
      5. Stochastic Gradient Descent
      6. Initializing The Weights $\vec{\omega}^{0}$ For The Optimization Process
    5. Regularization Techniques
      1. Dropout
      2. Early Stopping
      3. Batch Normalization Of Each Layer
      4. Control The Size Of The Weights By Penalizing Their Norm
      5. Penalizing The $l^2$ Norm vs Penalizing the $l^1$ Norm
      6. Explaining The Role Of The Regularization Hyper-parameter α
    6. Hyper-parameter Examples That Appear In Machine Learning
    7. Chain Rule And Back-Propagation: Calculating $\nabla L(\vec{\omega}^{i})$
    8. Assessing The Significance Of The Features Of The Input Data
    9. Summary And Looking Ahead
  5. Convolutional Neural Networks and Computer Vision
    1. Convolution And Cross-Correlation
    2. Convolution From A System’s Design Perspective
      1. Convolution And Impulse Response For Linear And Translation Invariant Systems
    3. Convolution And One Dimensional Discrete Signals
    4. Convolution And Two Dimensional Discrete Signals
      1. Filtering Images
      2. Feature Maps
    5. Linear Algebra Notation
      1. The One Dimensional Case: Multiplication by a Toeplitz Matrix
    6. Pooling
    7. A Convolutional Neural Network For Image Classification
    8. Summary and Looking Ahead
  6. About the Author

Product information

  • Title: Essential Math for AI
  • Author(s): Hala Nelson
  • Release date: December 2022
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098107635