Case study – matrix multiplication

As discussed earlier, a kernel is similar to a C function, and each work item executes this function on the device. Here we discuss different optimization strategies and the kernel implementations based on them. In this chapter we use a matrix multiplication example to illustrate those strategies, along with some of their advantages and disadvantages. Keep in mind that not every technique applies to every problem and that, unfortunately, some techniques even conflict with one another.

Sequential implementation

For simplicity, we take two square matrices, A and B, each 1024 by 1024, as input and multiply them to get a square matrix C of the same size (1024 by 1024). ...
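As a point of reference for the kernels that follow, a sequential version can be sketched as a plain triple loop in C. This is a minimal sketch and not necessarily the book's own listing: the function name matmul_seq, the use of float elements, and the flat row-major storage are all assumptions made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential reference: C = A * B for n-by-n matrices.
   Assumptions for this sketch (not taken from the book):
   the name matmul_seq, float elements, and flat row-major
   storage, so element (i, j) lives at index i * n + j. */
void matmul_seq(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < n; ++i) {          /* row of A and C */
        for (size_t j = 0; j < n; ++j) {      /* column of B and C */
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k) {  /* dot product */
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

For the 1024 by 1024 case, a caller would allocate three arrays of 1024 * 1024 floats and pass n = 1024; the innermost statement then runs 1024^3 times, about 1.07 billion multiply-add steps, which is the cost a parallel kernel tries to spread across work items.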
