Chapter 12

Concurrent Kernel Offloading

Florian Wende*, Michael Klemm†, Thomas Steinke*, Alexander Reinefeld*

*Zuse Institute Berlin, Germany
†Intel, Germany

Abstract

This chapter describes the principle of concurrent kernel offloading to the coprocessor and the aspects that need to be considered when optimizing its performance. Concurrent kernel offloading targets application scenarios with many small-scale workloads, none of which can exploit the provided resources on its own. The chapter explains how the computational throughput for multiple small-scale workloads can be improved on the Intel Xeon Phi coprocessor by concurrent kernel execution using the offload programming model. Each of the optimization steps is elaborated and illustrated by ...

High Performance Parallelism Pearls Volume One by James Reinders, James Jeffers