
7
IV
Optimizing OpenCL Kernels for the ARM Mali-T600 GPUs
Johan Gronqvist and Anton Lokhmotov
7.1 Introduction
OpenCL is a relatively young industry-backed standard API that aims to provide
functional portability across systems equipped with computational accelerators
such as GPUs: a standard-conforming OpenCL program can be executed on any
standard-conforming OpenCL implementation.
OpenCL, however, does not address the issue of performance portability:
transforming an OpenCL program to achieve higher performance on one device may
actually lead to lower performance on another device, since performance may
depend significantly on low-level details, suc ...