Summary
We first saw how to query our GPU from PyCUDA, and used this to re-create the CUDA deviceQuery program in Python. We then learned how to transfer NumPy arrays to and from the GPU's memory with the PyCUDA gpuarray class and its to_gpu and get functions. We got a feel for gpuarray objects by using them to do basic calculations on the GPU, and we learned to do a little investigative work using IPython's prun profiler. We saw that GPU functions can be noticeably, and seemingly arbitrarily, slower the first time they are run from PyCUDA in a session, because PyCUDA launches NVIDIA's nvcc compiler to compile inline CUDA C code. We then saw how to use the ElementwiseKernel function to compile and launch element-wise operations, which ...
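As a reminder of how these pieces fit together, here is a minimal sketch of a round trip through the GPU: a NumPy array is copied over with to_gpu, doubled by an ElementwiseKernel, and copied back with get. The function names, the kernel name double_kernel, and the fallback to a plain NumPy result when PyCUDA or a CUDA device is unavailable are assumptions made for illustration; they are not taken from the chapter.

```python
import numpy as np


def double_on_host(x):
    """Reference result computed on the CPU with NumPy."""
    return 2 * x


def double_on_gpu(x):
    """Double a float32 array on the GPU (needs PyCUDA and a CUDA device)."""
    import pycuda.autoinit  # noqa: F401  (initializes a CUDA context)
    from pycuda import gpuarray
    from pycuda.elementwise import ElementwiseKernel

    # The first call in a session triggers an nvcc compile,
    # so it may be noticeably slower than later calls.
    kernel = ElementwiseKernel(
        "float *in, float *out",   # kernel arguments, as CUDA C
        "out[i] = 2 * in[i];",     # operation applied at each index i
        "double_kernel",           # hypothetical kernel name
    )

    x_gpu = gpuarray.to_gpu(x)            # host -> device copy
    out_gpu = gpuarray.empty_like(x_gpu)  # uninitialized output on the device
    kernel(x_gpu, out_gpu)                # launch the element-wise kernel
    return out_gpu.get()                  # device -> host copy


host_data = np.arange(8, dtype=np.float32)
expected = double_on_host(host_data)

try:
    result = double_on_gpu(host_data)
except Exception:
    # Assumption for this sketch: without PyCUDA or a GPU,
    # fall back to the CPU result so the script still runs.
    result = expected

print(np.allclose(result, expected))
```

Timing double_on_gpu twice in the same session, for example under IPython's prun, should make the one-time compilation cost visible on the first call.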