188 Designing Scientific Applications on GPUs
9.4.1 Data placement on a hierarchical memory
During the execution of metaheuristics on GPU, the different threads may
access multiple data structures located in multiple memory spaces. These
memories have different sizes and access latencies: the faster memories
(registers, shared and constant memories) are generally very small, whereas
the larger memory (global memory) is relatively slow. However,
p-metaheuristics need to explore a large number of individuals to diversify
the search, and an efficient execution of s-metaheuristics requires
exploring large neighborhoods. Programmers therefore have to take these
constraints into account to efficiently place the different data structures of ...
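
As a rough illustration of the memory spaces listed above, the following CUDA sketch evaluates a toy population and shows one possible placement: small read-only problem data in constant memory, the large population in global memory, per-thread accumulators in registers, and a per-block scratch array in shared memory. The problem (a sum of item values selected by a 0/1 vector), the names `c_value`, `evaluate`, `N_ITEMS`, `POP_SIZE` and `BLOCK`, and all sizes are assumptions made for this example and are not taken from the chapter. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define N_ITEMS 32    // hypothetical problem size
#define POP_SIZE 1024 // hypothetical population size
#define BLOCK 128     // threads per block (power of two)

// Small read-only problem data: constant memory (cached; a value read by
// all threads of a warp at the same address is broadcast).
__constant__ int c_value[N_ITEMS];

// The population is too large for the fast memories, so it stays in global
// memory. It is stored item-major so that neighboring threads read
// neighboring bytes.
__global__ void evaluate(const unsigned char *pop, int *blockBest, int popSize)
{
    __shared__ int best[BLOCK];   // per-block scratch in shared memory

    int id = blockIdx.x * blockDim.x + threadIdx.x;
    int fitness = 0;              // per-thread accumulator, kept in a register
    if (id < popSize)
        for (int i = 0; i < N_ITEMS; ++i)
            fitness += pop[i * popSize + id] * c_value[i];

    best[threadIdx.x] = fitness;
    __syncthreads();

    // Tree reduction in shared memory: keep the maximum fitness of the block.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            best[threadIdx.x] = max(best[threadIdx.x], best[threadIdx.x + s]);
        __syncthreads();
    }
    if (threadIdx.x == 0)
        blockBest[blockIdx.x] = best[0];
}

int main()
{
    std::vector<int> value(N_ITEMS);
    for (int i = 0; i < N_ITEMS; ++i)
        value[i] = i + 1;

    // Deterministic toy population: entry i of solution id is a bit of id.
    std::vector<unsigned char> pop(N_ITEMS * POP_SIZE);
    for (int i = 0; i < N_ITEMS; ++i)
        for (int id = 0; id < POP_SIZE; ++id)
            pop[i * POP_SIZE + id] = (id >> (i % 10)) & 1;

    const int blocks = (POP_SIZE + BLOCK - 1) / BLOCK;
    unsigned char *d_pop = nullptr;
    int *d_best = nullptr;
    cudaMalloc(&d_pop, pop.size());
    cudaMalloc(&d_best, blocks * sizeof(int));
    cudaMemcpy(d_pop, pop.data(), pop.size(), cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(c_value, value.data(), N_ITEMS * sizeof(int));

    evaluate<<<blocks, BLOCK>>>(d_pop, d_best, POP_SIZE);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    std::vector<int> best(blocks);
    cudaMemcpy(best.data(), d_best, blocks * sizeof(int), cudaMemcpyDeviceToHost);

    int globalBest = 0;
    for (int b : best)
        globalBest = b > globalBest ? b : globalBest;
    printf("best fitness: %d\n", globalBest);

    cudaFree(d_pop);
    cudaFree(d_best);
    return 0;
}
```

In this sketch only the 128-byte value table and a 512-byte scratch array use the small fast memories, while the 32 KB population remains in global memory. With larger instances the population grows but the fast memories do not, which is the placement trade-off described above.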