Profiling the PyTorch model

In PyTorch, we can place a custom tag using torch.cuda.nvtx.range_push("foo") and torch.cuda.nvtx.range_pop(). These calls mirror the original CUDA NVTX APIs, nvtxRangePush() and nvtxRangePop(). Let's see how NVTX annotations can help us understand deep learning operations in the timeline. In the following steps, we will use the ResNet-50 example code in the 05_framework_profile/pytorch/RN50v1.5 directory:
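As a minimal sketch of the push/pop pairing, the following snippet opens an outer range named "foo" and a nested range named "bar" around a small piece of work. The wrapper functions range_push() and range_pop(), the HAVE_NVTX flag, and the work() function are illustrative names, not part of the example code; the guard is an assumption added so the snippet also runs on a machine without PyTorch or without a CUDA device, where the annotations simply become no-ops.

```python
try:
    import torch
    # Assumption: only emit NVTX ranges when a CUDA device is usable.
    HAVE_NVTX = torch.cuda.is_available()
except ImportError:  # PyTorch is not installed
    HAVE_NVTX = False


def range_push(name):
    """Open a named NVTX range (no-op when NVTX is unavailable)."""
    if HAVE_NVTX:
        torch.cuda.nvtx.range_push(name)


def range_pop():
    """Close the most recently opened NVTX range."""
    if HAVE_NVTX:
        torch.cuda.nvtx.range_pop()


def work():
    range_push("foo")   # outer range shown in the profiler timeline
    range_push("bar")   # nested range; ranges close in reverse order
    total = sum(i * i for i in range(1000))
    range_pop()         # closes "bar"
    range_pop()         # closes "foo"
    return total


if __name__ == "__main__":
    print(work())
```

Every range_push() needs a matching range_pop(); because ranges form a stack, the innermost range is always the one that gets closed.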

  1. We will place NVTX annotations in the training loop in the train() function to annotate the step value. This function can be found in the image_classification/training.py file. The following screenshot shows the training loop, with the NVTX annotations at line 234 and line 260:

In the preceding ...
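The annotation described in step 1 can be sketched roughly as follows. This is not the code from training.py: the train() signature, the train_step callback, and the "step:N" label format are assumptions made for illustration. The push and pop callables are injectable so the loop can be exercised without a GPU; by default they resolve to the PyTorch NVTX functions when a CUDA device is available and to no-ops otherwise.

```python
def _noop(*args):
    """Fallback used when NVTX ranges cannot be emitted."""


try:
    import torch
    if torch.cuda.is_available():
        _push = torch.cuda.nvtx.range_push
        _pop = torch.cuda.nvtx.range_pop
    else:
        _push, _pop = _noop, _noop
except ImportError:  # PyTorch is not installed
    _push, _pop = _noop, _noop


def train(batches, train_step, push=_push, pop=_pop):
    """Run train_step on each batch, wrapping every step in an NVTX range."""
    losses = []
    for step, batch in enumerate(batches):
        push(f"step:{step}")      # hypothetical label carrying the step value
        try:
            losses.append(train_step(batch))
        finally:
            pop()                 # always close the range, even on error
    return losses


if __name__ == "__main__":
    # Stand-in for a real forward/backward pass.
    print(train([1.0, 2.0, 3.0], lambda batch: batch * 0.5))
```

With labels like these, each iteration appears as its own named range in the profiler timeline, which makes it easier to see which kernels belong to which training step.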
