Performance Optimization of an Image Analysis Application
Chapter 13 showed, step by step, how to use AMD CodeAnalyst, Profiler, gDEBugger, and KernelAnalyzer to profile and debug an OpenCL application. That chapter gave only a basic introduction to the tools. In this chapter, we use a real-world application as an example and walk through the steps of migrating a single-threaded application to one that uses OpenCL to exploit the power of the GPU and APU. We will see how the profiling techniques that these and other tools provide can be used to investigate bottlenecks and improve the peak performance of an application. After all, high performance is generally the reason to put time into porting code to OpenCL.