Coarse-Grained OpenMP for Scalable Hybrid Parallelism
Enda O’Brien, Irish Centre for High-End Computing (ICHEC), Ireland
This chapter illustrates the benefit of using OpenMP parallelism in a more “coarse-grained” way. This requires inserting directives at the highest possible level in source code, and using domain decomposition concepts that are closely analogous to those of MPI, so that multiple copies of thread-local arrays do not lead to excessive memory consumption. On massively parallel, heterogeneous hardware systems, an efficient nesting of such coarse-grained OpenMP within distributed-memory MPI parallelism may be the best approach for obtaining optimal performance from a large class of applications. Examples ...
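As a rough illustration of the coarse-grained style described above, the sketch below opens a single parallel region at the top of the computation and gives each thread one contiguous block of the array, much as an MPI rank would own a subdomain. It is not taken from the chapter: the function names `block_range` and `smooth_and_sum`, the `tmp` scratch array, and the 1D smoothing stencil are all hypothetical stand-ins chosen for brevity. The arrays are shared and each thread touches only its own index range, so no per-thread array copies are made.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: split n items into nparts contiguous blocks,
 * the same way a 1D domain might be divided among MPI ranks.
 * Block `part` covers the half-open index range [*lo, *hi). */
static void block_range(long n, int nparts, int part, long *lo, long *hi)
{
    assert(nparts > 0 && part >= 0 && part < nparts);
    long base = n / nparts;   /* minimum block size */
    long rem  = n % nparts;   /* the first `rem` blocks get one extra item */
    *lo = part * base + (part < rem ? part : rem);
    *hi = *lo + base + (part < rem ? 1 : 0);
}

/* Illustrative kernel (an assumption, not the chapter's code): apply nsteps
 * smoothing passes to u, using tmp as shared scratch space, and return the
 * sum of the result. One parallel region encloses the whole time loop. */
double smooth_and_sum(double *u, double *tmp, long n, int nsteps)
{
    double total = 0.0;

    #pragma omp parallel reduction(+:total)
    {
#ifdef _OPENMP
        int nthreads = omp_get_num_threads();
        int tid      = omp_get_thread_num();
#else
        int nthreads = 1, tid = 0;   /* serial fallback without OpenMP */
#endif
        long lo, hi;
        block_range(n, nthreads, tid, &lo, &hi);  /* this thread's subdomain */

        for (int step = 0; step < nsteps; ++step) {
            for (long i = lo; i < hi; ++i) {
                double left  = (i > 0)     ? u[i - 1] : u[i];
                double right = (i < n - 1) ? u[i + 1] : u[i];
                tmp[i] = 0.25 * left + 0.5 * u[i] + 0.25 * right;
            }
            /* All neighbour reads of u must finish before u is overwritten. */
            #pragma omp barrier
            for (long i = lo; i < hi; ++i)
                u[i] = tmp[i];
            /* All updates must land before the next step reads neighbours. */
            #pragma omp barrier
        }

        for (long i = lo; i < hi; ++i)
            total += u[i];
    }
    return total;
}
```

The two barriers play the role that halo exchanges would play between MPI ranks: they are the only points where threads synchronize. In a hybrid code, under the same assumptions, each MPI rank would call a routine like `smooth_and_sum` on its own subdomain and exchange boundary values with neighbouring ranks between steps.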
From High Performance Parallelism Pearls Volume Two.