Chapter 14

Debugging

Abstract

One of the most important skills a supercomputer practitioner must acquire is debugging code written for parallel execution. Debugging parallel code can be significantly more complicated than debugging serial code: in addition to all the bugs that serial code may have, parallel code can also suffer from race conditions, Heisenbugs, and deadlocks, to name a few. This chapter presents several tools and practices that help the practitioner develop the skills needed to debug parallel codes and take advantage of the available debuggers, compiler flags, and system monitors.

Keywords

Breakpoints; Catchpoints; Compiler flags; Deadlocks; Debugger; Distributed debugging; Memory leaks; Multithreaded debugging; ...
