Summary
We started out in this chapter by seeing how printf can be used within a CUDA kernel to output data from individual threads; we saw in particular how useful this can be for debugging code. We then covered some of the gaps in our knowledge in CUDA-C, so that we can write full test programs that we can compile into proper executable binary files: there is a lot of overhead here that was hidden from us before that we have to be meticulous about. Next, we saw how to create and compile a project in the Nsight IDE and how to use it for debugging. We saw how to stop at any breakpoint we set in a CUDA kernel and switch between individual threads to see the different local variables. We also used the Nsight debugger to learn about the warp ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access