Chapter 10: Implementation Considerations
What's in this chapter?
- Understanding the CUDA development process
- Discovering optimization opportunities using profiling tools
- Using the right metrics and events to determine the most likely performance limiters
- Integrating NVTX to mark a critical section of code for profiling
- Using CUDA debugging tools to debug kernel and memory errors
- Porting a real-world application from legacy C to CUDA C
Modern heterogeneous and parallel systems are no longer used exclusively for high-performance computing; they also appear in embedded systems, mobile devices, tablets, notebooks, PCs, and workstations. As access to these systems becomes more common, this ubiquity is driving a paradigm shift in general-purpose software development toward heterogeneous parallel programming. Parallel programming has never been more convenient or beneficial, so understanding how to implement parallel and heterogeneous software efficiently and correctly has never been more important.
This chapter covers the following aspects of CUDA C project development:
- The CUDA C development process
- Profile-driven optimization
- CUDA development tools
A case study is provided at the end of this chapter to demonstrate porting a legacy C application ...