Chapter 5. Stencils and Shared Memory
In this chapter, we look at applications in which computational threads, instead of being independent, are interdependent with other threads in their neighborhood on the computational grid. The mathematical models we will implement involve convolution (or correlation) operations, in which data from threads in a neighborhood contribute to a linear combination with a constant array of coefficients. In the computing context, the operation is often referred to as filtering, and the coefficient array is called a filter or stencil. Thread interactions can produce bottlenecks when multiple threads compete for access to the same data, so CUDA provides some capabilities for alleviating the bottlenecks ...
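To make the idea concrete, here is a minimal sketch of a 1D stencil in which each thread combines its own input value with those of its neighbors. It assumes that the capability in question is the shared memory named in the chapter title: each block first copies its tile of the input, plus a halo of `RAD` elements on each side, into shared memory so that neighboring threads do not repeatedly fetch the same values from global memory. The names `TPB`, `RAD`, `d_coeffs`, `stencilKernel`, and `applyStencil` are illustrative choices and not taken from the text, values outside the array are treated as zero, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TPB 64  // threads per block (assumed value; must be >= RAD)
#define RAD 1   // stencil radius (assumed value)

// Stencil coefficients live in constant memory: every thread reads the same
// small, read-only array.
__constant__ float d_coeffs[2 * RAD + 1];

__global__ void stencilKernel(float *d_out, const float *d_in, int n) {
  // Tile of input for this block plus RAD halo cells on each side.
  __shared__ float s_in[TPB + 2 * RAD];

  const int i = blockIdx.x * blockDim.x + threadIdx.x;  // global index
  const int s = threadIdx.x + RAD;                      // shared index

  // Every thread loads one regular cell (zero if past the end of the array).
  s_in[s] = (i < n) ? d_in[i] : 0.0f;

  // The first RAD threads of the block also load the halo cells.
  if (threadIdx.x < RAD) {
    const int left = i - RAD;
    const int right = i + blockDim.x;
    s_in[s - RAD] = (left >= 0) ? d_in[left] : 0.0f;
    s_in[s + blockDim.x] = (right < n) ? d_in[right] : 0.0f;
  }

  // Wait until the whole tile is loaded before any thread reads neighbors.
  __syncthreads();

  if (i >= n) return;

  // Linear combination of neighborhood values with the coefficient array
  // (written as a correlation; identical to convolution for a symmetric stencil).
  float sum = 0.0f;
  for (int k = -RAD; k <= RAD; ++k) {
    sum += d_coeffs[k + RAD] * s_in[s + k];
  }
  d_out[i] = sum;
}

// Hypothetical host wrapper: copies data to the device, launches the kernel,
// and returns the filtered array.
std::vector<float> applyStencil(const std::vector<float> &in,
                                const float (&coeffs)[2 * RAD + 1]) {
  const int n = static_cast<int>(in.size());
  std::vector<float> out(n, 0.0f);
  if (n == 0) return out;

  float *d_in = nullptr;
  float *d_out = nullptr;
  cudaMalloc(&d_in, n * sizeof(float));
  cudaMalloc(&d_out, n * sizeof(float));
  cudaMemcpy(d_in, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpyToSymbol(d_coeffs, coeffs, sizeof(coeffs));

  stencilKernel<<<(n + TPB - 1) / TPB, TPB>>>(d_out, d_in, n);

  cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return out;
}
```

As a quick check, calling `applyStencil` with the coefficients `{1, -2, 1}` on the samples `in[i] = i * i` should yield 2 at every interior point, since the second difference of a quadratic is constant. The first and last outputs differ because out-of-range neighbors are taken to be zero.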
From CUDA for Engineers: An Introduction to High-Performance Parallel Computing.