Chapter 2

Heterogeneous data parallel computing

With special contribution from David Luebke

Abstract

This chapter introduces the concept of data parallelism and the essential CUDA C features for writing a simple CUDA C program. It starts with the concepts of threads, host, and device. It then introduces the application programming interface (API) functions for CUDA device memory management and data transfer. It goes on to cover the basic structure of a CUDA C kernel function, built-in variables, function declaration keywords, and kernel launch syntax. It concludes with a brief overview of the compilation process for CUDA C programs.

Keywords

Data parallelism; scalable parallel program; CUDA C; thread; kernel; application programming interface (API); host code; ...

From Programming Massively Parallel Processors, 4th Edition.