Chapter 3. From Loops to Grids

We are now ready to apply the basics presented in Chapter 2, “CUDA Essentials,” to parallelize the C code presented in Chapter 1, “First Steps,” and Appendix C, “Need-to-Know C Programming.” Recall that we created two versions of the distance app. In dist_v1 we used a for loop to compute an array of distance values. In dist_v2 we created an array of input values and then called the function distanceArray() to compute the entire array of distance values (again serially, using a for loop). In this chapter, we use CUDA to parallelize the distance apps by replacing serial passes through a loop with a computational grid of threads that can execute in parallel, with each thread computing one element of the array.
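To make the loop-to-grid idea concrete, here is a minimal sketch of what such a replacement can look like. It is an illustration under stated assumptions, not the listing developed later in the chapter: the names scale(), dist1d(), distanceKernel(), and distancesOnGpu(), the sizes N and TPB, and the reference point 0.5 are all hypothetical choices made for this example. The loop index i of the serial version becomes a per-thread index computed from the built-in variables blockIdx, blockDim, and threadIdx.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

#define N 64    // number of sample points (illustrative value)
#define TPB 32  // threads per block (illustrative value)

// Map index i in [0, n-1] to a coordinate in [0, 1] (hypothetical helper).
__device__ float scale(int i, int n)
{
  return static_cast<float>(i) / (n - 1);
}

// Distance between two points on a line (hypothetical helper).
__device__ float dist1d(float x1, float x2)
{
  return fabsf(x2 - x1);
}

// Each thread does the work of one pass through the serial for loop.
__global__ void distanceKernel(float *d_out, float ref, int len)
{
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= len) return;  // guard threads past the end of the array
  d_out[i] = dist1d(scale(i, len), ref);
}

// Host wrapper: allocate device memory, launch the grid, copy results back.
// Returns false if any CUDA runtime call reports an error.
bool distancesOnGpu(float *out, float ref, int len)
{
  float *d_out = nullptr;
  if (cudaMalloc(&d_out, len * sizeof(float)) != cudaSuccess) return false;

  // Round up so that blocks * TPB >= len.
  const int blocks = (len + TPB - 1) / TPB;
  distanceKernel<<<blocks, TPB>>>(d_out, ref, len);

  // cudaMemcpy on the default stream waits for the kernel to finish.
  const cudaError_t err = cudaMemcpy(out, d_out, len * sizeof(float),
                                     cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return err == cudaSuccess && cudaGetLastError() == cudaSuccess;
}

int main()
{
  float out[N];
  if (!distancesOnGpu(out, 0.5f, N)) {
    fprintf(stderr, "CUDA call failed\n");
    return 1;
  }
  for (int i = 0; i < N; ++i) {
    printf("i = %2d: distance to 0.5 is %f\n", i, out[i]);
  }
  return 0;
}
```

Assuming the file is saved as dist_sketch.cu (a hypothetical name), it can be built with nvcc dist_sketch.cu -o dist_sketch. The bounds check in the kernel is a defensive choice: it lets the launch round the block count up when the array length is not a multiple of TPB, at the cost of a few idle threads in the last block.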

We will be creating, building, and executing new apps, but ...
