July 2021
Intermediate to advanced
704 pages
21h 6m
English
This chapter covers
Processors have special vector units that can load and operate on more than one data element at a time. If we’re limited by floating-point operations, vectorization is essential for reaching peak hardware capability. Vectorization is the process of grouping operations together so that more than one can be done at a time. But when an application is memory bound, adding more floating-point capability to the hardware has limited benefit. Take note: most applications are memory bound. Compilers can be powerful, but as ...
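To make the memory-bound point concrete, here is a minimal sketch in C of a loop that a compiler can typically vectorize, along with a rough estimate of its arithmetic intensity. The function names `stream_triad` and `triad_flops_per_byte` are illustrative assumptions, not taken from the chapter, and the estimate only counts the three array accesses per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel: a[i] = b[i] + scalar * c[i].
 * The restrict qualifiers promise the compiler that the arrays do not
 * overlap, which removes one common obstacle to auto-vectorization. */
void stream_triad(size_t n, double *restrict a,
                  const double *restrict b,
                  const double *restrict c, double scalar)
{
    /* Precondition: non-empty input requires valid pointers. */
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));

    /* Each iteration is independent of the others, so several
     * iterations can be grouped into one vector operation. */
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Rough arithmetic intensity of the loop above: 2 flops per iteration
 * (one multiply, one add) against 3 doubles moved (two loads, one store). */
double triad_flops_per_byte(void)
{
    return 2.0 / (3.0 * (double)sizeof(double));
}
```

With 8-byte doubles the estimate comes to 2 flops per 24 bytes, or about 0.083 flops per byte. A loop with intensity this low tends to be limited by memory bandwidth rather than by the vector units, so vectorizing it may yield little speedup. To check whether a compiler vectorized the loop, one option is to build with optimization enabled and request a vectorization report, for example with GCC's `-O3 -fopt-info-vec`.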