Machine Learning Q and AI

Book description

If you're ready to venture beyond introductory concepts and dig deeper into machine learning, deep learning, and AI, the question-and-answer format of Machine Learning Q and AI will make things fast and easy for you, without a lot of mucking about.

Born out of questions often fielded by author Sebastian Raschka, this book's direct, no-nonsense approach makes advanced topics more accessible and genuinely engaging. Each brief, self-contained chapter journeys through a fundamental question in AI, unraveling it with clear explanations, diagrams, and hands-on exercises.

WHAT'S INSIDE:

FOCUSED CHAPTERS: Key questions in AI are answered concisely, and complex ideas are broken down into easily digestible parts.

WIDE RANGE OF TOPICS: Raschka covers topics ranging from neural network architectures and model evaluation to computer vision and natural language processing.

PRACTICAL APPLICATIONS: Learn techniques for enhancing model performance, fine-tuning large models, and more.

You'll also explore how to:

  • Manage the various sources of randomness in neural network training
  • Differentiate between encoder and decoder architectures in large language models
  • Reduce overfitting through data and model modifications
  • Construct confidence intervals for classifiers and optimize models with limited labeled data
  • Choose between different multi-GPU training paradigms and different types of generative AI models
  • Understand performance metrics for natural language processing
  • Make sense of the inductive biases in vision transformers

If you've been on the hunt for the perfect resource to elevate your understanding of machine learning, Machine Learning Q and AI will help you painlessly advance your knowledge beyond the basics.

Table of contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Dedication Page
  5. About the Author
  6. About the Technical Reviewer
  7. BRIEF CONTENTS
  8. CONTENTS IN DETAIL
  9. FOREWORD
  10. ACKNOWLEDGMENTS
  11. INTRODUCTION
    1. Who Is This Book For?
    2. What Will You Get Out of This Book?
    3. How to Read This Book
    4. Online Resources
  12. PART I: NEURAL NETWORKS AND DEEP LEARNING
  13. 1. EMBEDDINGS, LATENT SPACE, AND REPRESENTATIONS
    1. Embeddings
    2. Latent Space
    3. Representation
    4. Exercises
    5. References
  14. 2. SELF-SUPERVISED LEARNING
    1. Self-Supervised Learning vs. Transfer Learning
    2. Leveraging Unlabeled Data
    3. Self-Prediction and Contrastive Self-Supervised Learning
    4. Exercises
    5. References
  15. 3. FEW-SHOT LEARNING
    1. Datasets and Terminology
    2. Exercises
  16. 4. THE LOTTERY TICKET HYPOTHESIS
    1. The Lottery Ticket Training Procedure
    2. Practical Implications and Limitations
    3. Exercises
    4. References
  17. 5. REDUCING OVERFITTING WITH DATA
    1. Common Methods
      1. Collecting More Data
      2. Data Augmentation
      3. Pretraining
    2. Other Methods
    3. Exercises
    4. References
  18. 6. REDUCING OVERFITTING WITH MODEL MODIFICATIONS
    1. Common Methods
      1. Regularization
      2. Smaller Models
      3. Caveats with Smaller Models
      4. Ensemble Methods
    2. Other Methods
    3. Choosing a Regularization Technique
    4. Exercises
    5. References
  19. 7. MULTI-GPU TRAINING PARADIGMS
    1. The Training Paradigms
      1. Model Parallelism
      2. Data Parallelism
      3. Tensor Parallelism
      4. Pipeline Parallelism
      5. Sequence Parallelism
    2. Recommendations
    3. Exercises
    4. References
  20. 8. THE SUCCESS OF TRANSFORMERS
    1. The Attention Mechanism
    2. Pretraining via Self-Supervised Learning
    3. Large Numbers of Parameters
    4. Easy Parallelization
    5. Exercises
    6. References
  21. 9. GENERATIVE AI MODELS
    1. Generative vs. Discriminative Modeling
    2. Types of Deep Generative Models
      1. Energy-Based Models
      2. Variational Autoencoders
      3. Generative Adversarial Networks
      4. Flow-Based Models
      5. Autoregressive Models
      6. Diffusion Models
      7. Consistency Models
    3. Recommendations
    4. Exercises
    5. References
  22. 10. SOURCES OF RANDOMNESS
    1. Model Weight Initialization
    2. Dataset Sampling and Shuffling
    3. Nondeterministic Algorithms
    4. Different Runtime Algorithms
    5. Hardware and Drivers
    6. Randomness and Generative AI
    7. Exercises
    8. References
  23. PART II: COMPUTER VISION
  24. 11. CALCULATING THE NUMBER OF PARAMETERS
    1. How to Find Parameter Counts
      1. Convolutional Layers
      2. Fully Connected Layers
    2. Practical Applications
    3. Exercises
  25. 12. FULLY CONNECTED AND CONVOLUTIONAL LAYERS
    1. When the Kernel and Input Sizes Are Equal
    2. When the Kernel Size Is 1
    3. Recommendations
    4. Exercises
  26. 13. LARGE TRAINING SETS FOR VISION TRANSFORMERS
    1. Inductive Biases in CNNs
    2. ViTs Can Outperform CNNs
    3. Inductive Biases in ViTs
    4. Recommendations
    5. Exercises
    6. References
  27. PART III: NATURAL LANGUAGE PROCESSING
  28. 14. THE DISTRIBUTIONAL HYPOTHESIS
    1. Word2vec, BERT, and GPT
    2. Does the Hypothesis Hold?
    3. Exercises
    4. References
  29. 15. DATA AUGMENTATION FOR TEXT
    1. Synonym Replacement
    2. Word Deletion
    3. Word Position Swapping
    4. Sentence Shuffling
    5. Noise Injection
    6. Back Translation
    7. Synthetic Data
    8. Recommendations
    9. Exercises
    10. References
  30. 16. SELF-ATTENTION
    1. Attention in RNNs
    2. The Self-Attention Mechanism
    3. Exercises
    4. References
  31. 17. ENCODER- AND DECODER-STYLE TRANSFORMERS
    1. The Original Transformer
      1. Encoders
      2. Decoders
    2. Encoder-Decoder Hybrids
    3. Terminology
    4. Contemporary Transformer Models
    5. Exercises
    6. References
  32. 18. USING AND FINE-TUNING PRETRAINED TRANSFORMERS
    1. Using Transformers for Classification Tasks
    2. In-Context Learning, Indexing, and Prompt Tuning
    3. Parameter-Efficient Fine-Tuning
    4. Reinforcement Learning with Human Feedback
    5. Adapting Pretrained Language Models
    6. Exercises
    7. References
  33. 19. EVALUATING GENERATIVE LARGE LANGUAGE MODELS
    1. Evaluation Metrics for LLMs
      1. Perplexity
      2. BLEU Score
      3. ROUGE Score
      4. BERTScore
    2. Surrogate Metrics
    3. Exercises
    4. References
  34. PART IV: PRODUCTION AND DEPLOYMENT
  35. 20. STATELESS AND STATEFUL TRAINING
    1. Stateless (Re)training
    2. Stateful Training
    3. Exercises
  36. 21. DATA-CENTRIC AI
    1. Data-Centric vs. Model-Centric AI
    2. Recommendations
    3. Exercises
    4. References
  37. 22. SPEEDING UP INFERENCE
    1. Parallelization
    2. Vectorization
    3. Loop Tiling
    4. Operator Fusion
    5. Quantization
    6. Exercises
    7. References
  38. 23. DATA DISTRIBUTION SHIFTS
    1. Covariate Shift
    2. Label Shift
    3. Concept Drift
    4. Domain Shift
    5. Types of Data Distribution Shifts
    6. Exercises
    7. References
  39. PART V: PREDICTIVE PERFORMANCE AND MODEL EVALUATION
  40. 24. POISSON AND ORDINAL REGRESSION
    1. Exercises
  41. 25. CONFIDENCE INTERVALS
    1. Defining Confidence Intervals
    2. The Methods
      1. Method 1: Normal Approximation Intervals
      2. Method 2: Bootstrapping Training Sets
      3. Method 3: Bootstrapping Test Set Predictions
      4. Method 4: Retraining Models with Different Random Seeds
    3. Recommendations
    4. Exercises
    5. References
  42. 26. CONFIDENCE INTERVALS VS. CONFORMAL PREDICTIONS
    1. Confidence Intervals and Prediction Intervals
    2. Prediction Intervals and Conformal Predictions
    3. Prediction Regions, Intervals, and Sets
    4. Computing Conformal Predictions
    5. A Conformal Prediction Example
    6. The Benefits of Conformal Predictions
    7. Recommendations
    8. Exercises
    9. References
  43. 27. PROPER METRICS
    1. The Criteria
    2. The Mean Squared Error
    3. The Cross-Entropy Loss
    4. Exercises
  44. 28. THE K IN K-FOLD CROSS-VALIDATION
    1. Trade-offs in Selecting Values for k
    2. Determining Appropriate Values for k
    3. Exercises
    4. References
  45. 29. TRAINING AND TEST SET DISCORDANCE
    1. Exercises
  46. 30. LIMITED LABELED DATA
    1. Improving Model Performance with Limited Labeled Data
      1. Labeling More Data
      2. Bootstrapping the Data
      3. Transfer Learning
      4. Self-Supervised Learning
      5. Active Learning
      6. Few-Shot Learning
      7. Meta-Learning
      8. Weakly Supervised Learning
      9. Semi-Supervised Learning
      10. Self-Training
      11. Multi-Task Learning
      12. Multimodal Learning
      13. Inductive Biases
    2. Recommendations
    3. Exercises
    4. References
  47. AFTERWORD
  48. APPENDIX: ANSWERS TO THE EXERCISES
    1. Chapter 1
    2. Chapter 2
    3. Chapter 3
    4. Chapter 4
    5. Chapter 5
    6. Chapter 6
    7. Chapter 7
    8. Chapter 8
    9. Chapter 9
    10. Chapter 10
    11. Chapter 11
    12. Chapter 12
    13. Chapter 13
    14. Chapter 14
    15. Chapter 15
    16. Chapter 16
    17. Chapter 17
    18. Chapter 18
    19. Chapter 19
    20. Chapter 20
    21. Chapter 21
    22. Chapter 22
    23. Chapter 23
    24. Chapter 24
    25. Chapter 25
    26. Chapter 26
    27. Chapter 27
    28. Chapter 28
    29. Chapter 29
    30. Chapter 30
  49. INDEX

Product information

  • Title: Machine Learning Q and AI
  • Author(s): Sebastian Raschka
  • Release date: April 2024
  • Publisher(s): No Starch Press
  • ISBN: 9781718503762