Distributed Machine Learning with Python

by Guanhua Wang
April 2022
Intermediate to advanced
284 pages
5h 53m
English
Packt Publishing

Chapter 1: Splitting Input Data

In recent years, datasets have grown dramatically in size. In computer vision, for example, classic datasets such as MNIST and CIFAR-10/100 contain only 50,000 to 60,000 training images each, whereas more recent datasets such as ImageNet-1K contain over 1 million training images. A larger input dataset, however, leads to a much longer model training time on a single GPU or node. Training a usable state-of-the-art model on CIFAR-10/100 with a single GPU takes only a couple of hours, whereas training on ImageNet-1K with a single GPU can take days or even weeks.
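The scaling argument above can be made concrete with a rough back-of-envelope sketch. The training-set sizes are the published ones for CIFAR-10 and ImageNet-1K; the throughput value and the helper names `epoch_seconds` and `size_ratio` are assumptions made only for illustration, not measurements or anything taken from the book. The sketch holds per-image cost fixed, so it shows only the effect of sample count; in practice, ImageNet's larger images and the larger models trained on them widen the gap further.

```python
# Back-of-envelope estimate: with a fixed per-image throughput, the time
# per epoch grows linearly with the number of training images.

# Published training-set sizes.
TRAIN_IMAGES = {
    "CIFAR-10": 50_000,
    "ImageNet-1K": 1_281_167,
}

# Hypothetical single-GPU throughput in images per second. This is an
# assumption for illustration, not a measured value.
ASSUMED_IMAGES_PER_SECOND = 1_000.0


def epoch_seconds(num_images: int, images_per_second: float) -> float:
    """Return the time for one pass over the dataset at a fixed throughput."""
    if images_per_second <= 0:
        raise ValueError("images_per_second must be positive")
    return num_images / images_per_second


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more training images `larger` has than `smaller`."""
    return TRAIN_IMAGES[larger] / TRAIN_IMAGES[smaller]


if __name__ == "__main__":
    for name, count in TRAIN_IMAGES.items():
        seconds = epoch_seconds(count, ASSUMED_IMAGES_PER_SECOND)
        print(f"{name}: {count:,} images, about {seconds / 60:.1f} min per epoch")
    ratio = size_ratio("ImageNet-1K", "CIFAR-10")
    print(f"ImageNet-1K has about {ratio:.1f}x the training images of CIFAR-10")
```

Because the estimate is linear in the number of images, any constant throughput gives the same ratio between the two datasets; only the absolute times depend on the assumed value.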

The standard practice for speeding ...



Publisher Resources

ISBN: 9781801815697