Chapter 7

Parallel patterns: convolution

An introduction to stencil computation


This chapter presents convolution as an important parallel computation pattern. Convolution is used in many applications, such as computer vision and video processing, and it also represents a general pattern that forms the basis of many parallel algorithms. We start with the concept of convolution. We then present a basic parallel convolution algorithm whose execution speed is limited by the DRAM bandwidth needed to access both the input and mask elements. Next, we introduce constant memory, along with a simple modification to the kernel and host code that practically eliminates all the DRAM accesses for the mask elements. This is followed by an input tiling kernel that eliminates most of ...
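To make the first two steps of this outline concrete, the following is a minimal sketch of a basic 1D convolution kernel with the mask held in constant memory. It is an illustration under stated assumptions, not the chapter's own listing: the names `conv1dBasic`, `conv1d`, `d_mask`, and `MAX_MASK_WIDTH` are hypothetical, the mask width is assumed to be odd and at most `MAX_MASK_WIDTH`, missing boundary ("ghost") elements are assumed to be zero, and CUDA error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define MAX_MASK_WIDTH 9  // assumed upper bound on the mask size (hypothetical)

// The mask is read-only during kernel execution, so it is placed in constant
// memory, where reads are served by the constant cache instead of DRAM.
__constant__ float d_mask[MAX_MASK_WIDTH];

// Basic kernel: one thread per output element. Input elements are still read
// from global memory; only the mask accesses go through constant memory.
__global__ void conv1dBasic(const float* in, float* out, int n, int maskWidth) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int start = i - maskWidth / 2;  // leftmost input index covered by the mask
    float acc = 0.0f;
    for (int j = 0; j < maskWidth; ++j) {
        int k = start + j;
        // Positions outside the array (ghost elements) contribute zero.
        if (k >= 0 && k < n) acc += in[k] * d_mask[j];
    }
    out[i] = acc;
}

// Hypothetical host wrapper: copies the data, launches the kernel, returns the
// result. Returns all zeros if the input is empty or the mask is too wide.
std::vector<float> conv1d(const std::vector<float>& in,
                          const std::vector<float>& mask) {
    int n = static_cast<int>(in.size());
    int maskWidth = static_cast<int>(mask.size());
    std::vector<float> out(n, 0.0f);
    if (n == 0 || maskWidth == 0 || maskWidth > MAX_MASK_WIDTH) return out;

    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dIn, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Host-side change for constant memory: the mask is copied to the
    // __constant__ symbol rather than passed to the kernel as a pointer.
    cudaMemcpyToSymbol(d_mask, mask.data(), maskWidth * sizeof(float));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    conv1dBasic<<<blocks, threads>>>(dIn, dOut, n, maskWidth);

    cudaMemcpy(out.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}
```

As a usage example, calling `conv1d({1, 2, 3, 4, 5}, {1, 1, 1})` should yield `{3, 6, 9, 12, 9}`: each output is the sum of an element and its two neighbors, with the missing neighbors at both ends treated as zero. Every thread in this sketch still reads its input elements from global memory, which is the remaining traffic that an input tiling kernel targets.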
