High Performance Parallelism Pearls Volume One by James Jeffers, James Reinders

Chapter 8

Optimizing Gather/Scatter Patterns

Simon J. Pennycook*; Christopher J. Hughes†; Mikhail Smelyanskiy†    * Intel Corporation, UK; † Intel Corporation, USA

Abstract

Many modern microarchitectures rely on single-instruction multiple-data (SIMD) execution to provide high compute capabilities in an energy-efficient manner. Such microarchitectures, including those employed by the most recent Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors, are optimized for, or better suited to, contiguous loads and stores than non-contiguous loads (i.e., gathers) and stores (i.e., scatters). Gather and scatter behavior is more complex than that of contiguous loads and stores (e.g., it may depend on how close together the data items being read/written ...
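To make the distinction between the access patterns concrete, the following is a minimal scalar sketch in C. It is an illustration only, not code from the chapter: the function names (copy_contiguous, gather, scatter) and the use of an int index array are assumptions chosen for clarity.

```c
#include <stddef.h>

/* Contiguous load and store: element i of src goes to element i of dst,
 * so consecutive iterations touch adjacent memory locations. */
void copy_contiguous(float *dst, const float *src, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Gather: a non-contiguous load. Each element is read from src at a
 * position given by the (hypothetical) index array idx. */
void gather(float *dst, const float *src, const int *idx, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[idx[i]];
}

/* Scatter: a non-contiguous store. Each element is written to dst at a
 * position given by idx. If idx contains duplicates, the last write wins
 * in this scalar version. */
void scatter(float *dst, const float *src, const int *idx, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[idx[i]] = src[i];
}
```

In the first loop the addresses accessed depend only on the loop counter, whereas in the gather and scatter loops they depend on the contents of idx, which is why their memory behavior can vary with how close together the indexed elements are.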
