Machine Learning with PyTorch and Scikit-Learn

Book description

This book, part of the bestselling and widely acclaimed Python Machine Learning series, is a comprehensive guide to machine learning and deep learning using PyTorch's simple-to-code framework. Purchase of the print or Kindle book includes a free eBook in PDF format.

Key Features

  • Learn applied machine learning with a solid foundation in theory
  • Clear, intuitive explanations take you deep into the theory and practice of Python machine learning
  • Fully updated and expanded to cover PyTorch, transformers, XGBoost, graph neural networks, and best practices

Book Description

Machine Learning with PyTorch and Scikit-Learn is a comprehensive guide to machine learning and deep learning with PyTorch. It acts as both a step-by-step tutorial and a reference you'll keep coming back to as you build your machine learning systems.

Packed with clear explanations, visualizations, and examples, the book covers all the essential machine learning techniques in depth. While some books teach you only to follow instructions, this book teaches the underlying principles, so you can build models and applications for yourself.

Why PyTorch?

PyTorch offers a Pythonic way into machine learning, which makes it easier to learn and simpler to code with. This book explains the essential parts of PyTorch and how to create models using popular libraries, such as PyTorch Lightning and PyTorch Geometric.

You will also learn about generative adversarial networks (GANs) for generating new data and about training intelligent agents with reinforcement learning. Finally, this new edition is expanded to cover the latest trends in deep learning, including graph neural networks and large-scale transformers used for natural language processing (NLP).

This PyTorch book is your companion to machine learning with Python, whether you're a Python developer new to machine learning or want to deepen your knowledge of the latest developments.

What you will learn

  • Explore frameworks, models, and techniques for machines to learn from data
  • Use scikit-learn for machine learning and PyTorch for deep learning
  • Train machine learning classifiers on images, text, and more
  • Build and train neural networks, transformers, and boosting algorithms
  • Discover best practices for evaluating and tuning models
  • Predict continuous target outcomes using regression analysis
  • Dig deeper into textual and social media data using sentiment analysis

Who this book is for

If you have a good grasp of Python basics and want to start learning about machine learning and deep learning, then this is the book for you. This is an essential resource written for developers and data scientists who want to create practical machine learning and deep learning applications using scikit-learn and PyTorch. Before you get started with this book, you’ll need a good understanding of calculus, as well as linear algebra.

Table of contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Get in touch
    5. Share your thoughts
  2. Giving Computers the Ability to Learn from Data
    1. Building intelligent machines to transform data into knowledge
    2. The three different types of machine learning
      1. Making predictions about the future with supervised learning
        1. Classification for predicting class labels
        2. Regression for predicting continuous outcomes
      2. Solving interactive problems with reinforcement learning
      3. Discovering hidden structures with unsupervised learning
        1. Finding subgroups with clustering
        2. Dimensionality reduction for data compression
    3. Introduction to the basic terminology and notations
      1. Notation and conventions used in this book
      2. Machine learning terminology
    4. A roadmap for building machine learning systems
      1. Preprocessing – getting data into shape
      2. Training and selecting a predictive model
      3. Evaluating models and predicting unseen data instances
    5. Using Python for machine learning
      1. Installing Python and packages from the Python Package Index
      2. Using the Anaconda Python distribution and package manager
      3. Packages for scientific computing, data science, and machine learning
    6. Summary
  3. Training Simple Machine Learning Algorithms for Classification
    1. Artificial neurons – a brief glimpse into the early history of machine learning
      1. The formal definition of an artificial neuron
      2. The perceptron learning rule
    2. Implementing a perceptron learning algorithm in Python
      1. An object-oriented perceptron API
      2. Training a perceptron model on the Iris dataset
    3. Adaptive linear neurons and the convergence of learning
      1. Minimizing loss functions with gradient descent
      2. Implementing Adaline in Python
      3. Improving gradient descent through feature scaling
      4. Large-scale machine learning and stochastic gradient descent
    4. Summary
  4. A Tour of Machine Learning Classifiers Using Scikit-Learn
    1. Choosing a classification algorithm
    2. First steps with scikit-learn – training a perceptron
    3. Modeling class probabilities via logistic regression
      1. Logistic regression and conditional probabilities
      2. Learning the model weights via the logistic loss function
      3. Converting an Adaline implementation into an algorithm for logistic regression
      4. Training a logistic regression model with scikit-learn
      5. Tackling overfitting via regularization
    4. Maximum margin classification with support vector machines
      1. Maximum margin intuition
      2. Dealing with a nonlinearly separable case using slack variables
      3. Alternative implementations in scikit-learn
    5. Solving nonlinear problems using a kernel SVM
      1. Kernel methods for linearly inseparable data
      2. Using the kernel trick to find separating hyperplanes in a high-dimensional space
    6. Decision tree learning
      1. Maximizing IG – getting the most bang for your buck
      2. Building a decision tree
      3. Combining multiple decision trees via random forests
    7. K-nearest neighbors – a lazy learning algorithm
    8. Summary
  5. Building Good Training Datasets – Data Preprocessing
    1. Dealing with missing data
      1. Identifying missing values in tabular data
      2. Eliminating training examples or features with missing values
      3. Imputing missing values
      4. Understanding the scikit-learn estimator API
    2. Handling categorical data
      1. Categorical data encoding with pandas
      2. Mapping ordinal features
      3. Encoding class labels
      4. Performing one-hot encoding on nominal features
        1. Optional: encoding ordinal features
    3. Partitioning a dataset into separate training and test datasets
    4. Bringing features onto the same scale
    5. Selecting meaningful features
      1. L1 and L2 regularization as penalties against model complexity
      2. A geometric interpretation of L2 regularization
      3. Sparse solutions with L1 regularization
      4. Sequential feature selection algorithms
    6. Assessing feature importance with random forests
    7. Summary
  6. Compressing Data via Dimensionality Reduction
    1. Unsupervised dimensionality reduction via principal component analysis
      1. The main steps in principal component analysis
      2. Extracting the principal components step by step
      3. Total and explained variance
      4. Feature transformation
      5. Principal component analysis in scikit-learn
      6. Assessing feature contributions
    2. Supervised data compression via linear discriminant analysis
      1. Principal component analysis versus linear discriminant analysis
      2. The inner workings of linear discriminant analysis
      3. Computing the scatter matrices
      4. Selecting linear discriminants for the new feature subspace
      5. Projecting examples onto the new feature space
      6. LDA via scikit-learn
    3. Nonlinear dimensionality reduction and visualization
      1. Why consider nonlinear dimensionality reduction?
      2. Visualizing data via t-distributed stochastic neighbor embedding
    4. Summary
  7. Learning Best Practices for Model Evaluation and Hyperparameter Tuning
    1. Streamlining workflows with pipelines
      1. Loading the Breast Cancer Wisconsin dataset
      2. Combining transformers and estimators in a pipeline
    2. Using k-fold cross-validation to assess model performance
      1. The holdout method
      2. K-fold cross-validation
    3. Debugging algorithms with learning and validation curves
      1. Diagnosing bias and variance problems with learning curves
      2. Addressing over- and underfitting with validation curves
    4. Fine-tuning machine learning models via grid search
      1. Tuning hyperparameters via grid search
      2. Exploring hyperparameter configurations more widely with randomized search
      3. More resource-efficient hyperparameter search with successive halving
      4. Algorithm selection with nested cross-validation
    5. Looking at different performance evaluation metrics
      1. Reading a confusion matrix
      2. Optimizing the precision and recall of a classification model
      3. Plotting a receiver operating characteristic
      4. Scoring metrics for multiclass classification
      5. Dealing with class imbalance
    6. Summary
  8. Combining Different Models for Ensemble Learning
    1. Learning with ensembles
    2. Combining classifiers via majority vote
      1. Implementing a simple majority vote classifier
      2. Using the majority voting principle to make predictions
      3. Evaluating and tuning the ensemble classifier
    3. Bagging – building an ensemble of classifiers from bootstrap samples
      1. Bagging in a nutshell
      2. Applying bagging to classify examples in the Wine dataset
    4. Leveraging weak learners via adaptive boosting
      1. How adaptive boosting works
      2. Applying AdaBoost using scikit-learn
    5. Gradient boosting – training an ensemble based on loss gradients
      1. Comparing AdaBoost with gradient boosting
      2. Outlining the general gradient boosting algorithm
      3. Explaining the gradient boosting algorithm for classification
      4. Illustrating gradient boosting for classification
      5. Using XGBoost
    6. Summary
  9. Applying Machine Learning to Sentiment Analysis
    1. Preparing the IMDb movie review data for text processing
      1. Obtaining the movie review dataset
      2. Preprocessing the movie dataset into a more convenient format
    2. Introducing the bag-of-words model
      1. Transforming words into feature vectors
      2. Assessing word relevancy via term frequency-inverse document frequency
      3. Cleaning text data
      4. Processing documents into tokens
    3. Training a logistic regression model for document classification
    4. Working with bigger data – online algorithms and out-of-core learning
    5. Topic modeling with latent Dirichlet allocation
      1. Decomposing text documents with LDA
      2. LDA with scikit-learn
    6. Summary
  10. Predicting Continuous Target Variables with Regression Analysis
    1. Introducing linear regression
      1. Simple linear regression
      2. Multiple linear regression
    2. Exploring the Ames Housing dataset
      1. Loading the Ames Housing dataset into a DataFrame
      2. Visualizing the important characteristics of a dataset
      3. Looking at relationships using a correlation matrix
    3. Implementing an ordinary least squares linear regression model
      1. Solving regression for regression parameters with gradient descent
      2. Estimating the coefficient of a regression model via scikit-learn
    4. Fitting a robust regression model using RANSAC
    5. Evaluating the performance of linear regression models
    6. Using regularized methods for regression
    7. Turning a linear regression model into a curve – polynomial regression
      1. Adding polynomial terms using scikit-learn
      2. Modeling nonlinear relationships in the Ames Housing dataset
    8. Dealing with nonlinear relationships using random forests
      1. Decision tree regression
      2. Random forest regression
    9. Summary
  11. Working with Unlabeled Data – Clustering Analysis
    1. Grouping objects by similarity using k-means
      1. k-means clustering using scikit-learn
      2. A smarter way of placing the initial cluster centroids using k-means++
      3. Hard versus soft clustering
      4. Using the elbow method to find the optimal number of clusters
      5. Quantifying the quality of clustering via silhouette plots
    2. Organizing clusters as a hierarchical tree
      1. Grouping clusters in a bottom-up fashion
      2. Performing hierarchical clustering on a distance matrix
      3. Attaching dendrograms to a heat map
      4. Applying agglomerative clustering via scikit-learn
    3. Locating regions of high density via DBSCAN
    4. Summary
  12. Implementing a Multilayer Artificial Neural Network from Scratch
    1. Modeling complex functions with artificial neural networks
      1. Single-layer neural network recap
      2. Introducing the multilayer neural network architecture
      3. Activating a neural network via forward propagation
    2. Classifying handwritten digits
      1. Obtaining and preparing the MNIST dataset
      2. Implementing a multilayer perceptron
      3. Coding the neural network training loop
      4. Evaluating the neural network performance
    3. Training an artificial neural network
      1. Computing the loss function
      2. Developing your understanding of backpropagation
      3. Training neural networks via backpropagation
    4. About convergence in neural networks
    5. A few last words about the neural network implementation
    6. Summary
  13. Parallelizing Neural Network Training with PyTorch
    1. PyTorch and training performance
      1. Performance challenges
      2. What is PyTorch?
      3. How we will learn PyTorch
    2. First steps with PyTorch
      1. Installing PyTorch
      2. Creating tensors in PyTorch
      3. Manipulating the data type and shape of a tensor
      4. Applying mathematical operations to tensors
      5. Split, stack, and concatenate tensors
    3. Building input pipelines in PyTorch
      1. Creating a PyTorch DataLoader from existing tensors
      2. Combining two tensors into a joint dataset
      3. Shuffle, batch, and repeat
      4. Creating a dataset from files on your local storage disk
      5. Fetching available datasets from the torchvision.datasets library
    4. Building an NN model in PyTorch
      1. The PyTorch neural network module (torch.nn)
      2. Building a linear regression model
      3. Model training via the torch.nn and torch.optim modules
      4. Building a multilayer perceptron for classifying flowers in the Iris dataset
      5. Evaluating the trained model on the test dataset
      6. Saving and reloading the trained model
    5. Choosing activation functions for multilayer neural networks
      1. Logistic function recap
      2. Estimating class probabilities in multiclass classification via the softmax function
      3. Broadening the output spectrum using a hyperbolic tangent
      4. Rectified linear unit activation
    6. Summary
  14. Going Deeper – The Mechanics of PyTorch
    1. The key features of PyTorch
    2. PyTorch’s computation graphs
      1. Understanding computation graphs
      2. Creating a graph in PyTorch
    3. PyTorch tensor objects for storing and updating model parameters
    4. Computing gradients via automatic differentiation
      1. Computing the gradients of the loss with respect to trainable variables
      2. Understanding automatic differentiation
      3. Adversarial examples
    5. Simplifying implementations of common architectures via the torch.nn module
      1. Implementing models based on nn.Sequential
      2. Choosing a loss function
      3. Solving an XOR classification problem
      4. Making model building more flexible with nn.Module
      5. Writing custom layers in PyTorch
    6. Project one – predicting the fuel efficiency of a car
      1. Working with feature columns
      2. Training a DNN regression model
    7. Project two – classifying MNIST handwritten digits
    8. Higher-level PyTorch APIs: a short introduction to PyTorch-Lightning
      1. Setting up the PyTorch Lightning model
      2. Setting up the data loaders for Lightning
      3. Training the model using the PyTorch Lightning Trainer class
      4. Evaluating the model using TensorBoard
    9. Summary
  15. Classifying Images with Deep Convolutional Neural Networks
    1. The building blocks of CNNs
      1. Understanding CNNs and feature hierarchies
      2. Performing discrete convolutions
        1. Discrete convolutions in one dimension
        2. Padding inputs to control the size of the output feature maps
        3. Determining the size of the convolution output
        4. Performing a discrete convolution in 2D
      3. Subsampling layers
    2. Putting everything together – implementing a CNN
      1. Working with multiple input or color channels
      2. Regularizing an NN with L2 regularization and dropout
      3. Loss functions for classification
    3. Implementing a deep CNN using PyTorch
      1. The multilayer CNN architecture
      2. Loading and preprocessing the data
      3. Implementing a CNN using the torch.nn module
        1. Configuring CNN layers in PyTorch
        2. Constructing a CNN in PyTorch
    4. Smile classification from face images using a CNN
      1. Loading the CelebA dataset
      2. Image transformation and data augmentation
      3. Training a CNN smile classifier
    5. Summary
  16. Modeling Sequential Data Using Recurrent Neural Networks
    1. Introducing sequential data
      1. Modeling sequential data – order matters
      2. Sequential data versus time series data
      3. Representing sequences
      4. The different categories of sequence modeling
    2. RNNs for modeling sequences
      1. Understanding the dataflow in RNNs
      2. Computing activations in an RNN
      3. Hidden recurrence versus output recurrence
      4. The challenges of learning long-range interactions
      5. Long short-term memory cells
    3. Implementing RNNs for sequence modeling in PyTorch
      1. Project one – predicting the sentiment of IMDb movie reviews
        1. Preparing the movie review data
        2. Embedding layers for sentence encoding
        3. Building an RNN model
        4. Building an RNN model for the sentiment analysis task
      2. Project two – character-level language modeling in PyTorch
        1. Preprocessing the dataset
        2. Building a character-level RNN model
        3. Evaluation phase – generating new text passages
    4. Summary
  17. Transformers – Improving Natural Language Processing with Attention Mechanisms
    1. Adding an attention mechanism to RNNs
      1. Attention helps RNNs with accessing information
      2. The original attention mechanism for RNNs
      3. Processing the inputs using a bidirectional RNN
      4. Generating outputs from context vectors
      5. Computing the attention weights
    2. Introducing the self-attention mechanism
      1. Starting with a basic form of self-attention
      2. Parameterizing the self-attention mechanism: scaled dot-product attention
    3. Attention is all we need: introducing the original transformer architecture
      1. Encoding context embeddings via multi-head attention
      2. Learning a language model: decoder and masked multi-head attention
      3. Implementation details: positional encodings and layer normalization
    4. Building large-scale language models by leveraging unlabeled data
      1. Pre-training and fine-tuning transformer models
      2. Leveraging unlabeled data with GPT
      3. Using GPT-2 to generate new text
      4. Bidirectional pre-training with BERT
      5. The best of both worlds: BART
    5. Fine-tuning a BERT model in PyTorch
      1. Loading the IMDb movie review dataset
      2. Tokenizing the dataset
      3. Loading and fine-tuning a pre-trained BERT model
      4. Fine-tuning a transformer more conveniently using the Trainer API
    6. Summary
  18. Generative Adversarial Networks for Synthesizing New Data
    1. Introducing generative adversarial networks
      1. Starting with autoencoders
      2. Generative models for synthesizing new data
      3. Generating new samples with GANs
      4. Understanding the loss functions of the generator and discriminator networks in a GAN model
    2. Implementing a GAN from scratch
      1. Training GAN models on Google Colab
      2. Implementing the generator and the discriminator networks
      3. Defining the training dataset
      4. Training the GAN model
    3. Improving the quality of synthesized images using a convolutional and Wasserstein GAN
      1. Transposed convolution
      2. Batch normalization
      3. Implementing the generator and discriminator
      4. Dissimilarity measures between two distributions
      5. Using EM distance in practice for GANs
      6. Gradient penalty
      7. Implementing WGAN-GP to train the DCGAN model
      8. Mode collapse
    4. Other GAN applications
    5. Summary
  19. Graph Neural Networks for Capturing Dependencies in Graph Structured Data
    1. Introduction to graph data
      1. Undirected graphs
      2. Directed graphs
      3. Labeled graphs
      4. Representing molecules as graphs
    2. Understanding graph convolutions
      1. The motivation behind using graph convolutions
      2. Implementing a basic graph convolution
    3. Implementing a GNN in PyTorch from scratch
      1. Defining the NodeNetwork model
      2. Coding the NodeNetwork’s graph convolution layer
      3. Adding a global pooling layer to deal with varying graph sizes
      4. Preparing the DataLoader
      5. Using the NodeNetwork to make predictions
    4. Implementing a GNN using the PyTorch Geometric library
    5. Other GNN layers and recent developments
      1. Spectral graph convolutions
      2. Pooling
      3. Normalization
      4. Pointers to advanced graph neural network literature
    6. Summary
  20. Reinforcement Learning for Decision Making in Complex Environments
    1. Introduction – learning from experience
      1. Understanding reinforcement learning
      2. Defining the agent-environment interface of a reinforcement learning system
    2. The theoretical foundations of RL
      1. Markov decision processes
        1. The mathematical formulation of Markov decision processes
        2. Visualization of a Markov process
      2. Episodic versus continuing tasks
      3. RL terminology: return, policy, and value function
        1. The return
        2. Policy
        3. Value function
      4. Dynamic programming using the Bellman equation
    3. Reinforcement learning algorithms
      1. Dynamic programming
        1. Policy evaluation – predicting the value function with dynamic programming
        2. Improving the policy using the estimated value function
        3. Policy iteration
        4. Value iteration
      2. Reinforcement learning with Monte Carlo
        1. State-value function estimation using MC
        2. Action-value function estimation using MC
        3. Finding an optimal policy using MC control
        4. Policy improvement – computing the greedy policy from the action-value function
      3. Temporal difference learning
        1. TD prediction
        2. On-policy TD control (SARSA)
        3. Off-policy TD control (Q-learning)
    4. Implementing our first RL algorithm
      1. Introducing the OpenAI Gym toolkit
        1. Working with the existing environments in OpenAI Gym
        2. A grid world example
        3. Implementing the grid world environment in OpenAI Gym
      2. Solving the grid world problem with Q-learning
    5. A glance at deep Q-learning
      1. Training a DQN model according to the Q-learning algorithm
        1. Replay memory
        2. Determining the target values for computing the loss
      2. Implementing a deep Q-learning algorithm
    6. Chapter and book summary
  21. Other Books You May Enjoy
  22. Index

Product information

  • Title: Machine Learning with PyTorch and Scikit-Learn
  • Author(s): Sebastian Raschka, Yuxi Liu, Vahid Mirjalili
  • Release date: February 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781801819312