Chapter 6. Dissecting a CPU/GPU OpenCL Implementation
This chapter discusses a specific mapping of OpenCL to the combination of a Phenom II CPU and a Radeon 6970 GPU. The aim is to show how the OpenCL model maps onto one concrete system, giving the reader context for how it executes on real hardware. We also discuss some of the optimizations necessary for efficient execution on such hardware.
Keywords: APU, CPU, GPU, optimization, Phenom, Radeon

Introduction

In Chapter 3, we discussed trade-offs present in different architectures, many of which support the execution of OpenCL programs. The design of OpenCL is such that the model maps capably to a wide range of architectures, allowing for tuning and acceleration of kernel code. In this chapter, we discuss OpenCL's ...
