Chapter 19


There and Back Again

Matthias Noack*; Florian Wende*; Klaus-Dieter Oertel†
* Zuse Institute Berlin, Germany
† Intel Corporation, Germany


The chapter presents a case study on optimizing the Hexciton kernel of the GPU-HEOM code for parallelism. The HEOM method bridges biology and quantum physics to simulate molecular light-harvesting complexes. The Hexciton kernel computes a commutator term for a large set of small, complex matrices, which is relevant in other domains too. Starting with a naive reference implementation, the chapter develops a fully optimized OpenCL kernel by analyzing different techniques. The chapter compares automatic and manual vectorization techniques to optimize the memory layout for contiguous ...

From High Performance Parallelism Pearls Volume Two.
