O'Reilly logo

Heterogeneous Computing with OpenCL, 2nd Edition by Dana Schaa, Perhaad Mistry, David R. Kaeli, Lee Howes, Benedict Gaster

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 9

OpenCL Case StudyHistogram

Introduction

This chapter discusses specific optimizations for a memory-bound kernel. The kernel we choose for this chapter is an image histogram operation. The source data for this operation is an 8-bit-per-pixel image with a target of counting into each of 256 32-bit histogram bins.

The principle of the histogram algorithm is to perform the following operation over each element of the image:

 for(many input values) {

   histogram[value]++;

 }

This algorithm performs many scattered read-modify-write accesses into a small histogram data structure. On a CPU, this application will use the cache, although with a high rate of reuse of elements. On a GPU, these accesses will be resolved in global memory, which will ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required