Chapter 3. From Loops to Grids
We are now ready to apply the basics presented in Chapter 2, “CUDA Essentials,” to parallelize the C codes presented in Chapter 1, “First Steps,” and Appendix C, “Need-to-Know C Programming.” Recall that we created two versions of the distance app. In dist_v1 we used a for loop to compute an array of distance values. In dist_v2 we created an array of input values and then called the function distanceArray() to compute the entire array of distance values (again in serial using a for loop). In this chapter, we use CUDA to parallelize the distance apps by replacing serial passes through a loop with a computational grid of threads that can execute together.
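To make the loop-to-grid idea concrete, here is a minimal sketch of the transformation; it is not the book's listing. It assumes the distance is the absolute difference between each input value and a reference value. Apart from distanceArray(), which the text mentions, every name here (pointDistance, distanceArraySerial, distanceKernel, distanceArrayGrid, TPB) is hypothetical and chosen only for illustration. The serial version visits each index in turn. The CUDA version launches one thread per index, and each thread computes its own index from its block and thread coordinates.

```cuda
#include <math.h>
#include <cuda_runtime.h>

// Assumed distance measure: absolute difference from a reference value.
// Marked __host__ __device__ so the serial and parallel versions share it.
__host__ __device__ float pointDistance(float x, float ref) {
    return fabsf(x - ref);
}

// Serial version: one pass through the loop per array element.
void distanceArraySerial(float *out, const float *in, float ref, int n) {
    for (int i = 0; i < n; ++i) {
        out[i] = pointDistance(in[i], ref);
    }
}

// Parallel version: the loop is gone. Each thread handles the single
// index i that it derives from its position in the grid.
__global__ void distanceKernel(float *d_out, const float *d_in,
                               float ref, int n) {
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the grid may contain more threads than elements
        d_out[i] = pointDistance(d_in[i], ref);
    }
}

// Host wrapper (hypothetical): copies the input to the device, launches
// the grid, and copies the result back. Returns cudaSuccess on success.
cudaError_t distanceArrayGrid(float *out, const float *in, float ref, int n) {
    const int TPB = 32;  // threads per block, an arbitrary choice here
    const size_t bytes = n * sizeof(float);
    float *d_in = nullptr;
    float *d_out = nullptr;

    cudaError_t err = cudaMalloc(&d_in, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_out, bytes);
    if (err != cudaSuccess) { cudaFree(d_in); return err; }

    err = cudaMemcpy(d_in, in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Round the block count up so that every element gets a thread.
        const int blocks = (n + TPB - 1) / TPB;
        distanceKernel<<<blocks, TPB>>>(d_out, d_in, ref, n);
        err = cudaGetLastError();  // reports launch-configuration errors
    }
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish before transferring.
        err = cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return err;
}
```

Calling distanceArraySerial() and distanceArrayGrid() on the same input array should fill both output arrays with matching values, which gives a quick check that the kernel reproduces the loop. The bounds check inside the kernel matters whenever the array length is not a multiple of the block size, because the rounded-up grid then contains a few threads with no element to process.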
We will be creating, building, and executing new apps, but ...