Mastering Azure Machine Learning

Book description

Master expert techniques for building automated and highly scalable end-to-end machine learning models and pipelines in Azure using TensorFlow, Spark, and Kubernetes

Key Features

  • Make sense of data on the cloud by implementing advanced analytics
  • Train and optimize advanced deep learning models efficiently on Spark using Azure Databricks
  • Deploy machine learning models for batch and real-time scoring with Azure Kubernetes Service (AKS)

Book Description

Today's growing data volumes require distributed systems, powerful algorithms, and scalable cloud infrastructure to compute insights and to train and deploy machine learning (ML) models. This book will help you deepen your knowledge of building ML models and end-to-end ML pipelines in the cloud using Azure.

The book starts with an overview of an end-to-end ML project and a guide on how to choose the right Azure service for different ML tasks. It then focuses on Azure ML and takes you through the process of data experimentation, data preparation, and feature engineering using Azure ML and Python. You'll learn advanced feature extraction techniques using natural language processing (NLP), classical ML techniques, and the secrets of both a great recommendation engine and a performant computer vision model using deep learning methods. You'll also explore how to train, optimize, and tune models using Azure AutoML and HyperDrive, and perform distributed training on Azure ML. Then, you'll learn different deployment and monitoring techniques using Azure Kubernetes Service with Azure ML, along with the basics of MLOps (DevOps for ML) to automate your ML process as a CI/CD pipeline.

By the end of this book, you'll have mastered Azure ML and be able to confidently design, build and operate scalable ML pipelines in Azure.

What you will learn

  • Set up your Azure ML workspace for data experimentation and visualization
  • Perform ETL, data preparation, and feature extraction using Azure best practices
  • Implement advanced feature extraction using NLP and word embeddings
  • Train gradient-boosted tree ensembles, recommendation engines and deep neural networks on Azure ML
  • Use hyperparameter tuning and AutoML to optimize your ML models
  • Employ distributed ML on GPU clusters using Horovod in Azure ML
  • Deploy, operate and manage your ML models at scale
  • Automate your end-to-end ML process as CI/CD pipelines for MLOps

Who this book is for

This machine learning book is for data professionals, data analysts, data engineers, data scientists, and machine learning developers who want to master scalable cloud-based machine learning architectures in Azure. This book will help you use advanced Azure services to build intelligent machine learning applications. A basic understanding of Python and a working knowledge of machine learning are required.

Table of contents

  1. Preface
    1. About Mastering Azure Machine Learning
      1. About the authors
      2. About the reviewers
      3. Learning objectives
      4. Audience
      5. Approach
      6. To get the most out of this book
      7. Conventions
      8. Download resources
  2. Section 1: Azure Machine Learning
  3. 1. Building an end-to-end machine learning pipeline in Azure
    1. Performing descriptive data exploration
      1. Moving data to the cloud
      2. Understanding missing values
      3. Visualizing data distributions
      4. Finding correlated dimensions
      5. Measuring feature and target dependencies for regression
      6. Visualizing feature and label dependency for classification
    2. Exploring common techniques for data preparation
      1. Labeling the training data
      2. Normalization and transformation in machine learning
      3. Encoding categorical variables
      4. A feature engineering example using time-series data
      5. Using NLP to extract complex features from text
    3. Choosing the right ML model to train data
      1. Choosing an error metric
      2. The training and testing split
      3. Achieving great performance using tree-based ensemble models
      4. Modeling large and complex data using deep learning techniques
    4. Optimization techniques
      1. Hyperparameter optimization
      2. Model stacking
      3. Azure Automated Machine Learning
    5. Deploying and operating models
      1. Batch scoring using pipelines
      2. Real-time scoring using a container-based web service
      3. Tracking model performance, telemetry, and data skew
    6. Summary
  4. 2. Choosing a machine learning service in Azure
    1. Demystifying the different Azure services for ML
      1. Choosing an Azure service for ML
      2. Choosing a compute target for Azure Machine Learning
    2. Azure Cognitive Services and Custom Vision
      1. Azure Cognitive Services
      2. Custom Vision—customizing the Cognitive Services API
    3. Azure Machine Learning with GUIs
      1. Azure Machine Learning designer
      2. Azure Automated Machine Learning
      3. Microsoft Power BI
    4. Azure Machine Learning workspace
      1. Organizing experiments and models in Azure Machine Learning
      2. Deployments through Azure Machine Learning
    5. Summary
  5. Section 2: Experimentation and Data Preparation
  6. 3. Data experimentation and visualization using Azure
    1. Preparing your Azure Machine Learning workspace
      1. Setting up the ML Service workspace
      2. Running a simple experiment with Azure Machine Learning
      3. Logging metrics and tracking results
      4. Scheduling and running scripts
      5. Adding cloud compute to the workspace
    2. Visualizing high-dimensional data
      1. Tracking figures in experiments in Azure Machine Learning
      2. Unsupervised dimensionality reduction with PCA
      3. Using LDA for supervised projections
      4. Non-linear dimension reduction with t-SNE
      5. Generalizing t-SNE with UMAP
    3. Summary
  7. 4. ETL, data preparation, and feature extraction
    1. Managing data and datasets in the cloud
      1. Getting data into the cloud
      2. Managing data in Azure Machine Learning
      3. Exploring data registered in Azure Machine Learning
    2. Preprocessing and feature engineering with Azure Machine Learning DataPrep
      1. Parsing different data formats
      2. Building a data transformation pipeline in Azure Machine Learning
    3. Summary
  8. 5. Azure Machine Learning pipelines
    1. Benefits of pipelines for ML workflows
      1. Why build pipelines?
      2. What are Azure Machine Learning pipelines?
    2. Building and publishing an ML pipeline
      1. Creating a simple pipeline
      2. Connecting data inputs and outputs between steps
      3. Publishing, triggering, and scheduling a pipeline
      4. Parallelizing steps to speed up large pipelines
      5. Reusing pipeline steps through modularization
    3. Integrating pipelines with other Azure services
      1. Building pipelines with the Azure Machine Learning designer
      2. Azure Machine Learning pipelines in Azure Data Factory
      3. Azure Pipelines for CI/CD
    4. Summary
  9. 6. Advanced feature extraction with NLP
    1. Understanding categorical data
      1. Comparing textual, categorical, and ordinal data
      2. Transforming categories into numeric values
      3. Categories versus text
    2. Building a simple bag-of-words model
      1. A naive bag-of-words model using counting
      2. Tokenization – turning a string into a list of words
      3. Stemming – rule-based removal of affixes
      4. Lemmatization – dictionary-based word normalization
      5. A bag-of-words model in scikit-learn
    3. Leveraging term importance and semantics
      1. Generalizing words using n-grams and skip-grams
      2. Reducing word dictionary size using SVD
      3. Measuring the importance of words using tf-idf
      4. Extracting semantics using word embeddings
    4. Implementing end-to-end language models
      1. End-to-end learning of token sequences
      2. State-of-the-art sequence-to-sequence models
      3. Text analytics using Azure Cognitive Services
    5. Summary
  10. Section 3: Training Machine Learning Models
  11. 7. Building ML models using Azure Machine Learning
    1. Working with tree-based ensemble classifiers
      1. Understanding a simple decision tree
      2. Combining classifiers with bagging
      3. Optimizing classifiers with boosting rounds
    2. Training an ensemble classifier model using LightGBM
      1. LightGBM in a nutshell
      2. Preparing the data
      3. Setting up the compute cluster and execution environment
      4. Building a LightGBM classifier
      5. Scheduling the training script on the Azure Machine Learning cluster
    3. Summary
  12. 8. Training deep neural networks on Azure
    1. Introduction to deep learning
      1. Why DL?
      2. From neural networks to DL
      3. Comparing classical ML and DL
    2. Training a CNN for image classification
      1. Training a CNN from scratch in your notebook
      2. Generating more input data using augmentation
      3. Moving training to a GPU cluster using Azure Machine Learning compute
      4. Improving your performance through transfer learning
    3. Summary
  13. 9. Hyperparameter tuning and Automated Machine Learning
    1. Hyperparameter tuning to find the optimal parameters
      1. Sampling all possible parameter combinations using grid search
      2. Trying random combinations using random search
      3. Converging faster using early termination
      4. Optimizing parameter choices using Bayesian optimization
    2. Finding the optimal model with Azure Automated Machine Learning
      1. Advantages and benefits of Azure Automated Machine Learning
      2. A classification example
    3. Summary
  14. 10. Distributed machine learning on Azure
    1. Exploring methods for distributed ML
      1. Training independent models on small data in parallel
      2. Training a model ensemble on large datasets in parallel
      3. Fundamental building blocks for distributed ML
      4. Speeding up DL with data-parallel training
      5. Training large models with model-parallel training
    2. Using distributed ML in Azure
      1. Horovod—a distributed DL training framework
      2. Implementing the HorovodRunner API for a Spark job
      3. Running Horovod on Azure Machine Learning compute
    3. Summary
  15. 11. Building a recommendation engine in Azure
    1. Introduction to recommender engines
    2. Content-based recommendations
      1. Measuring similarity between items
      2. Feature engineering for content-based recommenders
      3. Content-based recommendations using gradient boosted trees
    3. Collaborative filtering—a rating-based recommendation engine
      1. What is a rating? Explicit feedback as opposed to implicit feedback
      2. Predicting the missing ratings to make a recommendation
      3. Scalable recommendations using ALS factorization
    4. Combining content and ratings in hybrid recommendation engines
      1. Building a state-of-the-art recommender using the Matchbox Recommender
    5. Automatic optimization through reinforcement learning
      1. An example using Azure Personalizer in Python
    6. Summary
  16. Section 4: Optimization and Deployment of Machine Learning Models
  17. 12. Deploying and operating machine learning models
    1. Deploying ML models in Azure
      1. Understanding the components of an ML model
      2. Registering your models in a model registry
      3. Customizing your deployment environment
      4. Choosing a deployment target in Azure
    2. Building a real-time scoring service
    3. Implementing a batch scoring pipeline
    4. Inference optimizations and alternative deployment targets
      1. Profiling models for optimal resource configuration
      2. Portable scoring through the ONNX runtime
      3. Fast inference using FPGAs in Azure
      4. Alternative deployment targets
    5. Monitoring Azure Machine Learning deployments
      1. Collecting logs and infrastructure metrics
      2. Tracking telemetry and application metrics
    6. Summary
  18. 13. MLOps—DevOps for machine learning
    1. Ensuring reproducible builds and deployments
      1. Version-controlling your code
      2. Registering snapshots of your data
      3. Tracking your model metadata and artifacts
      4. Scripting your environments and deployments
    2. Validating your code, data, and models
      1. Rethinking unit testing for data quality
      2. Integration testing for ML
      3. End-to-end testing using Azure Machine Learning
      4. Continuous profiling of your model
    3. Summary
  19. 14. What's next?
    1. Understanding the importance of data
    2. The future of ML is automated
    3. Change is the only constant – preparing for change
    4. Focusing first on infrastructure and monitoring
    5. Controlled rollouts and A/B testing
    6. Summary
  20. Index

Product information

  • Title: Mastering Azure Machine Learning
  • Author(s): Christoph Korner, Kaijisse Waaijer
  • Release date: April 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781789807554