April 2022
In this chapter, we will discuss how to implement a simple model parallelism pipeline. In data parallelism, each GPU holds a full copy of the model; in model parallelism, we instead split the model itself across all GPUs in use.
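To make the contrast concrete, the following is a minimal sketch of the naive form of this split, assuming PyTorch as the framework; the two-stage partition, the layer sizes, and the `TwoStageModel` name are illustrative, not the chapter's actual model:

```python
# A minimal model-parallel sketch: two stages of one model, each placed on
# its own device (hypothetical example; layer sizes are arbitrary).
import torch
import torch.nn as nn

# Use two GPUs when available; otherwise fall back to CPU so the sketch runs anywhere.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 2 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")

class TwoStageModel(nn.Module):
    """A model split into two stages, each stage living on its own device."""
    def __init__(self):
        super().__init__()
        self.stage0 = nn.Sequential(nn.Linear(32, 64), nn.ReLU()).to(dev0)
        self.stage1 = nn.Sequential(nn.Linear(64, 10)).to(dev1)

    def forward(self, x):
        x = self.stage0(x.to(dev0))
        # The key model-parallel step: move activations between devices
        # at the stage boundary before continuing the forward pass.
        return self.stage1(x.to(dev1))

model = TwoStageModel()
out = model(torch.randn(8, 32))
print(out.shape)  # torch.Size([8, 10])
```

Note that, unlike data parallelism, no device ever holds the full set of parameters here; each GPU only stores and executes its own stage, at the cost of an activation transfer at every stage boundary.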
Before diving into the details, we'll qualify our discussion with the following assumptions about both hardware and workload: