Chapter 30

GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot

Andrew Kerr, Gregory Diamos and Sudhakar Yalamanchili

This chapter discusses implementation details of GPU Ocelot, particularly its PTX emulator, and shows how GPU Ocelot may be used to prototype, debug, and tune CUDA applications for efficient execution on GPUs. This gem explains how users may benefit from the rich application profiling and correctness tools built into Ocelot, and how to extend Ocelot’s trace generator interface to perform custom workload characterization and profiling. Additionally, we discuss GPU Ocelot’s role as a dynamic compilation framework for heterogeneous many-core compute systems that leverage ...

Get GPU Computing Gems Jade Edition now with the O’Reilly learning platform.
