Callgrind is a call-graph generating profiler that can also collect information about processor cache hit rates and branch prediction. It is useful only if your program is CPU-bound; it does not help if heavy I/O or multiple processes are involved.

Valgrind does not require any kernel configuration, but it does need debug symbols. It is available as a target package in both the Yocto Project and Buildroot (BR2_PACKAGE_VALGRIND).

You run Callgrind in Valgrind on the target, like so:

# valgrind --tool=callgrind <program>
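Cache and branch simulation are both switched off by default in Callgrind, so cache hit rate and branch prediction data are collected only if you ask for them. The following is a minimal sketch of a wrapper that does so, assuming a hypothetical executable ./myprog standing in for <program>; the variable names PROG and CALLGRIND_OPTS are illustrative only:

```shell
#!/bin/sh
# PROG stands in for <program>; ./myprog is a hypothetical name.
PROG="${PROG:-./myprog}"

# --cache-sim=yes   enables the cache simulation (off by default)
# --branch-sim=yes  enables the branch prediction simulation (off by default)
CALLGRIND_OPTS="--tool=callgrind --cache-sim=yes --branch-sim=yes"

if command -v valgrind >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # CALLGRIND_OPTS is deliberately unquoted so it splits into separate options
    valgrind $CALLGRIND_OPTS "$PROG"
else
    # Valgrind or the program is missing: just show the command
    echo "would run: valgrind $CALLGRIND_OPTS $PROG"
fi
```

Both simulations slow the run down further, so it may be worth enabling them only once the plain call-graph profile has pointed at a hot spot.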

Each run produces a file called callgrind.out.<PID>, where <PID> is the process ID of the profiled program. You can copy this file to the host and analyze it with callgrind_annotate.
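On the host, the analysis step might look like the sketch below. It assumes that the output file has already been copied into the current directory and that callgrind_annotate is installed on the host; the variable name OUT is illustrative only:

```shell
#!/bin/sh
# Pick the most recently modified callgrind.out.<PID> file, if there is one
OUT=$(ls -t callgrind.out.* 2>/dev/null | head -n 1)

if [ -n "$OUT" ] && command -v callgrind_annotate >/dev/null 2>&1; then
    # --inclusive=yes reports each function's cost including its callees
    callgrind_annotate --inclusive=yes "$OUT"
else
    echo "no callgrind.out.<PID> file here, or callgrind_annotate is not installed"
fi
```

Without --inclusive=yes, callgrind_annotate reports each function's self cost only, which is often the better view for finding the hottest leaf functions.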

The default is to capture data for all the threads together in a single file. If you add option --separate-threads=yes ...
