7. Multi-GPU Training Paradigms
What are the different multi-GPU training paradigms, and what are their respective advantages and disadvantages?
Multi-GPU training paradigms fall into two groups: those that divide the data for parallel processing across multiple GPUs, and those that divide the model itself across multiple GPUs to cope with memory constraints when the model exceeds the memory capacity of a single GPU. Data parallelism falls into the first category; model parallelism and tensor parallelism fall into the second. Techniques like pipeline parallelism borrow ideas from both. In addition, current software implementations such as DeepSpeed, ...
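To make the first category concrete, below is a minimal sketch of data parallelism using PyTorch's DistributedDataParallel (DDP). The toy model, dataset, and hyperparameters are illustrative assumptions, not part of the original text; the launch command assumed is torchrun --nproc_per_node=NUM_GPUS train.py. Each GPU process holds a full replica of the model, trains on its own shard of the data, and DDP averages the gradients across processes during the backward pass.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # torchrun sets LOCAL_RANK, RANK, and WORLD_SIZE for each spawned process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and synthetic data, stand-ins for a real model and dataset
    model = torch.nn.Linear(128, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])  # full replica per GPU

    dataset = TensorDataset(torch.randn(1024, 128),
                            torch.randint(0, 10, (1024,)))
    # DistributedSampler gives each GPU process a distinct shard of the data
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()   # DDP all-reduces gradients across GPUs here
            optimizer.step()  # each replica applies the same averaged update

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Because every process applies the same averaged gradients, the model replicas stay in sync without any explicit parameter broadcasting after initialization; this is what distinguishes data parallelism from the model-splitting approaches in the second category.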