Chapter 7. OpenACC and Performance Portability
Graham Lopez and Oscar Hernandez, Oak Ridge National Laboratory
This chapter discusses the performance portability of OpenACC directives across various types of machine architectures: self-hosted multicore systems (e.g., multicore-only systems such as the Intel Xeon Phi) as well as nodes with attached accelerators such as GPUs. Our goal is to explain how to use OpenACC to move code between architectures, how much tuning that might require, and what lessons we can learn from writing performance-portable code. We use examples of algorithms with varying computational intensities for our evaluation, because both compute and data-access efficiency are important ...