
5. Non-separable 2D, 3D, and 4D Filtering with CUDA 489
[Two plots, upper and lower. Vertical axis: Megavoxels/Second (0 to 450). Horizontal axis: Volumes of Size 128 × 128 × 64 (0 to 120 in the upper plot, 0 to 140 in the lower plot). Series in both plots: Shared, Shared unrolled, FFT.]
Figure 5.11. Performance, measured in megavoxels per second, for the different
implementations of 4D filtering and data sizes ranging from 128 × 128 × 64 × 16 to
128 × 128 × 64 × 128. The FFT-based approach clearly outperforms the convolution
approaches. The results for a 7 × 7 × 7 × 7 filter are shown in the upper plot and the
results for an 11 × 11 × 11 × 11 ...