Chapter 9. Gaining Practical Expertise with Scaling Across All Dimensions
Chapter 8 covered the theory and foundational knowledge you need to scale model training beyond data parallelism, exploring techniques for model, pipeline, tensor, and hybrid parallelism. This chapter continues that discussion and gives you hands-on practice with these distributed training paradigms. You will also review some tools and libraries that are useful for vertical scaling and take a closer look at DeepSpeed (introduced in Hands-On Exercise #5 in Chapter 7) through a vertical scaling lens. The chapter closes with a practical exercise that uses DeepSpeed to achieve more automated multidimensional hybrid training.
Hands-On Exercises: Model, Tensor, Pipeline, and Hybrid Parallelism
In this series of Hands-On Exercises, you will build a recommendation engine for movies. You will use the DeepFM model to explore simple implementations of vertical scaling. To keep the implementations simple and easy to follow, monitoring and profiling tools have largely been omitted from these exercises. However, the tools and software discussed in Chapters 4 and 7 are equally applicable and useful for profiling and benchmarking model, pipeline, and hybrid parallel programs.
The Dataset
The movie recommender will be based on the MovieLens dataset, open sourced by GroupLens Research. This dataset has two parts: the ratings of the movies ...