Preparation and functional testing
This chapter describes how to prepare the environment: configuring persistent volumes (PVs) and defining the YAML for running Message Passing Interface (MPI) jobs. It also provides baseline tests to verify network performance, as well as multi-node and multi-GPU communication tests with the NVIDIA Collective Communications Library (NCCL). The chapter concludes with initial GPU scaling tests that run the ResNet-50 benchmark on synthetic data.
4.1 Testing remote direct memory access through ...
