Chapter 11. Vectorization, SIMD Instructions, and Additional Parallel Libraries


  • Understanding SIMD and vectorization

  • Understanding extended instruction sets

  • Working with Intel Math Kernel Library

  • Working with multicore-ready, highly optimized software functions

  • Mixing task-based programming with external optimized libraries

  • Generating pseudo-random numbers in parallel

  • Working with the ThreadLocal<T> class

  • Using Intel Integrated Performance Primitives

In the previous 10 chapters, you learned to create and coordinate code that runs many tasks in parallel to improve performance. To improve throughput even further, you can take advantage of other parallelism-related capabilities of modern hardware. This chapter covers additional performance libraries and includes examples of integrating them with .NET Framework 4 and its new task-based programming model. It also provides examples of the new thread-local storage classes and the lazy-initialization capabilities they offer.


The "Parallel Programming and Multicore Programming" section of Chapter 1, "Task-Based Programming," introduced the different kinds of parallel architectures. That section also explained that most modern microprocessors can execute Single Instruction, Multiple Data (SIMD) instructions. Because the execution units for SIMD instructions usually belong to a physical core, it is possible ...

