Note from the Lead Developer of Intel Threading Building Blocks
Parallel computing has become personal, in both liberating and demanding ways.
I remember using an IBM 1130 mainframe in high school in the 1970s, and how frustrating it was because only one person could use the machine at a time, feeding it via a card reader. Then, in the 1980s, computers became personal, and I had all the time I wanted to run and debug sequential programs.
Parallel computing has undergone a similar shift. In the 1980s and 1990s, parallel computers were institutional. They were fascinating to program, but access was limited. I was fortunate at one point to share a 256-processor nCUBE with only a few other people. Now, with multi-core chips, every programmer can have cheap access to a parallel computer—perhaps not with 256 processors yet, but with a growing number every year.
The downside, of course, is that parallel programming is no longer optional because parallel computers have become personal for consumers, too. Now parallel programming is mandatory for performance-sensitive applications.
There is no one true way to do parallel programming. Many paradigms have been proposed and have been cast in the form of new languages, language extensions, and libraries. One such paradigm defines tasks that run in shared memory. This paradigm is well suited to achieving parallel speedup on multi-core chips. The key notion is to separate logical task patterns from physical threads, and to delegate task scheduling ...