Distributed Machine Learning with Python

by Guanhua Wang
April 2022
Intermediate to advanced
284 pages
5h 53m
English
Packt Publishing

Chapter 5: Splitting the Model

In this chapter, we will discuss how to train giant models with model parallelism. Giant models are models that are too large to fit into a single GPU's memory. Examples include Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT) family, such as GPT-2 and GPT-3.
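To make "too large to fit into a single GPU's memory" concrete, here is a rough back-of-the-envelope sketch. It is not taken from the book. The parameter counts are the approximate, publicly reported sizes of BERT-large, GPT-2, and GPT-3. The 16 GB GPU, the function names, and the choice of 4 bytes per parameter (32-bit weights) or 16 bytes per parameter (32-bit weights, gradients, and two Adam optimizer states) are assumptions made for illustration. Activations are ignored, so the results are lower bounds.

```python
import math

# Approximate, publicly reported parameter counts (illustrative only).
PARAM_COUNTS = {
    "BERT-large": 340_000_000,
    "GPT-2": 1_500_000_000,
    "GPT-3": 175_000_000_000,
}

BYTES_WEIGHTS_FP32 = 4   # assumption: one 32-bit float per weight
BYTES_TRAINING_ADAM = 16  # assumption: weight + gradient + two Adam states, all fp32


def memory_gb(num_params, bytes_per_param):
    """Memory footprint in GB (10**9 bytes) for the given bytes per parameter."""
    return num_params * bytes_per_param / 1e9


def min_gpus(num_params, gpu_memory_gb, bytes_per_param=BYTES_WEIGHTS_FP32):
    """Smallest number of GPUs whose combined memory covers the footprint."""
    return math.ceil(memory_gb(num_params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    gpu_memory_gb = 16  # hypothetical GPU size
    for name, n in PARAM_COUNTS.items():
        weights = memory_gb(n, BYTES_WEIGHTS_FP32)
        training = memory_gb(n, BYTES_TRAINING_ADAM)
        gpus = min_gpus(n, gpu_memory_gb, BYTES_TRAINING_ADAM)
        print(f"{name}: weights {weights:.2f} GB, "
              f"training state {training:.2f} GB, at least {gpus} GPU(s)")
```

Under these assumptions, the weights of the smaller models fit on one device, but the training state of GPT-2 alone (about 24 GB) already exceeds a 16 GB GPU before any activations are stored. This is the situation in which the model itself has to be split across devices.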

In contrast to data parallelism, model parallelism is often adopted for language models. Language models are a specific type of deep learning model used in the Natural Language Processing (NLP) domain. Here, the input data is usually text sequences, and the model outputs predictions for tasks such as question answering and next sentence prediction.

NLP model training is often segregated ...


Publisher Resources

ISBN: 9781801815697