High Performance Parallelism Pearls Volume One by James Jeffers, James Reinders

Chapter 12

Concurrent Kernel Offloading

Florian Wende*; Michael Klemm†; Thomas Steinke*; Alexander Reinefeld*
* Zuse Institute Berlin, Germany
† Intel, Germany

Abstract

This chapter describes the principle of concurrent kernel offloading to the coprocessor and the aspects that need to be considered when optimizing performance. Concurrent kernel offloading targets application scenarios with many small-scale workloads that cannot exploit the provided resources on their own. The chapter explains how the computational throughput for multiple small-scale workloads can be improved on the Intel Xeon Phi coprocessor by concurrent kernel execution using the offload programming model. Each of the optimization steps is elaborated and illustrated by ...
