The source code for the CUDA implementation of convolutional nets is in two files, both of which can be downloaded for free from my web site. MOD_CUDA.CPP provides the high-level organization: it calls subroutines to initialize, compute forward activation, backpropagate delta, and compute the gradient. MOD_CUDA.cu contains the CUDA device routines, along with the low-level C++ host routines that are called from MOD_CUDA.CPP. These host routines in turn launch the computation kernels and handle communication between the host and the device.
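To make this division of labor concrete, here is a minimal sketch of the same layering in a single listing. It is not taken from MOD_CUDA.CPP or MOD_CUDA.cu. Every name in it (double_kernel, demo_init, demo_compute, demo_fetch, demo_cleanup, demo_driver) is hypothetical, and the trivial "double each element" computation merely stands in for the real activation and gradient work. The comments mark which parts would plausibly belong to the device/host-wrapper file and which to the high-level file.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// ---- Would belong in the .cu file: device routine (hypothetical) ----
__global__ void double_kernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                      // The last block may be only partly used
        out[i] = 2.0f * in[i];
}

// Device pointers owned by the low-level host routines
static float *d_in = nullptr;
static float *d_out = nullptr;

// ---- Would belong in the .cu file: low-level host routines (hypothetical) ----

// Allocate device memory and copy the input from host to device
int demo_init(const float *h_in, int n)
{
    size_t bytes = (size_t) n * sizeof(float);
    if (cudaMalloc((void **) &d_in, bytes) != cudaSuccess) return 1;
    if (cudaMalloc((void **) &d_out, bytes) != cudaSuccess) return 1;
    if (cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice) != cudaSuccess) return 1;
    return 0;
}

// Launch the kernel and wait for it to finish
int demo_compute(int n)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;   // Round up to cover all n elements
    double_kernel<<<blocks, threads>>>(d_in, d_out, n);
    if (cudaGetLastError() != cudaSuccess) return 1;        // Launch failure
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;   // Execution failure
    return 0;
}

// Copy the result from device back to host
int demo_fetch(float *h_out, int n)
{
    size_t bytes = (size_t) n * sizeof(float);
    return cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) != cudaSuccess;
}

// Release device memory; cudaFree accepts a null pointer
void demo_cleanup()
{
    cudaFree(d_in);
    cudaFree(d_out);
    d_in = d_out = nullptr;
}

// ---- Would belong in the .CPP file: high-level organization (hypothetical) ----
// This layer never touches device memory or kernels directly.
int demo_driver(const float *h_in, float *h_out, int n)
{
    int failed = demo_init(h_in, n) || demo_compute(n) || demo_fetch(h_out, n);
    demo_cleanup();
    return failed;   // 0 on success, 1 on any CUDA error
}
```

A caller would pass demo_driver a host input array, a host output array of the same length, and the element count, then check for a zero return value. The point of the arrangement is that only the low-level host routines know about device pointers, launch parameters, and memory copies. The high-level routine just sequences the steps.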
Many excellent books on CUDA programming exist. ...