O'Reilly logo

High Performance Parallelism Pearls Volume One by James Jeffers, James Reinders

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 24

Profiling-Guided Optimization

Andrey Vladimirov    Colfax International, USA

Abstract

The chapter focuses on a matrix transposition, a small and self-contained workload of great practical value. The optimization process applied to the code relies exclusively on programming in a high-level language plus utilization of the OpenMP framework. The result is a portable code that can run on both CPU (processor) and MIC (coprocessor) architectures, and can be recompiled for future generations of Intel architectures. The focus of the chapter is on the use of Intel® VTune™ Amplifier XE reports to understand where to apply optimization. Through VTune, the performance monitoring functionality of Intel Xeon Phi coprocessors is showcased not only ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required