Work Sharing and Domain Decomposition

In this chapter we consider how to distribute a workload across a number of UPC SPMD threads by decomposing the data across the executing threads. Distributing work across threads (or processes) in a parallel application requires that each thread be able to identify itself through an id value and to recognize the other threads, possibly remote, that are available to cooperate on the application task. This self-identification, together with knowledge of the other cooperating threads, allows a division of labor: it determines the partition of the data that each thread will manipulate. In UPC, variable declarations establish the affinity between data and threads, giving programmers locality information and control so that they can decompose their workload more intelligently. UPC programmers can take advantage of this knowledge by assigning each thread to work, as appropriate, on the data that has affinity to that thread. In this way, the majority of accesses can become thread local. On machines with physically distributed memory, compilers are expected to co-locate each thread and the data that has affinity to it on the same physical node, thereby reducing remote accesses and improving execution time. Moreover, as in other parallel programming paradigms, UPC offers the ability for each thread to identify itself and the rest of the threads available to help through the special constants ...
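
The idea of letting each thread work only on the data that has affinity to it can be sketched in plain C. This is an illustrative stand-in, not UPC code: the names `my_id`, `num_threads`, `owner_of`, `add_owned`, and `run_all_threads` are hypothetical, and the sketch assumes a round-robin layout in which element i belongs to thread i modulo the number of threads (the default layout of a UPC shared array). The threads are simulated by a sequential loop so that the partitioning logic can be checked on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: which thread "owns" element i when elements are
 * dealt out round-robin, one element at a time, across num_threads threads. */
static int owner_of(size_t i, int num_threads)
{
    assert(num_threads > 0);
    return (int)(i % (size_t)num_threads);
}

/* Work done by ONE thread: compute c[i] = a[i] + b[i] only for the elements
 * that this thread owns. my_id and num_threads stand in for the thread id
 * and thread count that a real SPMD runtime would provide.
 * Returns the number of elements this thread processed. */
static size_t add_owned(int my_id, int num_threads,
                        const double *a, const double *b, double *c, size_t n)
{
    size_t touched = 0;
    for (size_t i = 0; i < n; i++) {
        if (owner_of(i, num_threads) == my_id) {
            c[i] = a[i] + b[i];
            touched++;
        }
    }
    return touched;
}

/* Sequential stand-in for SPMD execution: every simulated thread runs the
 * same function with a different id. Returns the total number of elements
 * processed, which equals n when the partition covers each element once. */
static size_t run_all_threads(int num_threads,
                              const double *a, const double *b, double *c,
                              size_t n)
{
    size_t total = 0;
    for (int t = 0; t < num_threads; t++) {
        total += add_owned(t, num_threads, a, b, c, n);
    }
    return total;
}
```

Calling `run_all_threads(4, a, b, c, n)` visits every index exactly once, with each simulated thread touching only its own elements. In a real UPC program the ownership test would be implied by the array declaration and the runtime rather than computed by hand, and the threads would run concurrently.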
