6 Vectorization: FLOPs for free

This chapter covers

  • The importance of vectorization
  • The kind of parallelization provided by a vector unit
  • Different ways you can access vector parallelization
  • Performance benefits you can expect

Processors have special vector units that can load and operate on more than one data element at a time. If an application is limited by floating-point operations, vectorization is essential for reaching peak hardware capability. Vectorization is the process of grouping operations together so that more than one can be done at a time. Adding more flops to hardware capability has limited benefit, however, when an application is memory bound, and most applications are memory bound. Compilers can be powerful, but as ...
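To make the idea of grouping operations concrete, here is a minimal sketch in C. It is not taken from the chapter. The function names, the choice of the stream triad kernel, and the vector width of four doubles (a 256-bit register) are all assumptions made for illustration. The first function is an ordinary loop that a compiler may be able to vectorize on its own. The second spells out the same work in groups of four, roughly the shape of what a vector unit does, followed by a scalar remainder loop.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_WIDTH 4  /* assumption: four doubles fit in one 256-bit vector register */

/* Plain loop. The iterations are independent and the restrict qualifiers
   promise that the arrays do not overlap, which is the information a
   compiler needs before it will consider vectorizing the loop. */
void triad_simple(size_t n, double *restrict a, const double *restrict b,
                  const double *restrict c, double scalar)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* The same computation written in groups of VEC_WIDTH elements.
   Each pass of the outer loop corresponds to one vector operation;
   the inner "lane" loop is what the hardware would do all at once. */
void triad_grouped(size_t n, double *restrict a, const double *restrict b,
                   const double *restrict c, double scalar)
{
    size_t i = 0;
    size_t n_main = n - (n % VEC_WIDTH);  /* largest multiple of VEC_WIDTH <= n */

    for (; i < n_main; i += VEC_WIDTH) {
        for (size_t lane = 0; lane < VEC_WIDTH; lane++) {
            a[i + lane] = b[i + lane] + scalar * c[i + lane];
        }
    }
    /* Remainder loop: leftover elements that do not fill a whole vector. */
    for (; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Hypothetical helper: checks that both variants produce identical results. */
void triad_self_check(void)
{
    double b[10], c[10], a1[10], a2[10];
    size_t n = sizeof b / sizeof b[0];

    for (size_t i = 0; i < n; i++) {
        b[i] = (double)i;
        c[i] = 2.0 * (double)i;
    }
    triad_simple(n, a1, b, c, 3.0);
    triad_grouped(n, a2, b, c, 3.0);
    for (size_t i = 0; i < n; i++) {
        assert(a1[i] == a2[i]);
    }
}
```

This kernel also illustrates the memory-bound caveat: each iteration performs two floating-point operations but touches three doubles (24 bytes), so on most hardware the loop is likely to be limited by memory bandwidth rather than by the vector unit.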

