Chapter 8: Multi-GPU programming

Abstract

This chapter covers how to write code that utilizes multiple GPUs. There are many possible configurations of host processes and devices that can be used in multi-GPU code. In this chapter we focus on two: (1) a single host process driving multiple GPUs through CUDA's peer-to-peer capabilities, and (2) multiple MPI processes, each using a separate GPU. To illustrate these approaches, we implement peer-to-peer and MPI multi-GPU versions of the transpose example used in the previous chapter.

Keywords

Peer-to-peer; UVA (Unified Virtual Addressing); Direct transfer; Direct access; MPI (Message Passing Interface); Compute mode; nvidia-smi (NVIDIA System Management Interface)

There are many configurations ...
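
To give a concrete flavor of the first configuration, the following is a minimal sketch, not taken from the chapter's transpose code, in which a single host process checks for and enables peer-to-peer access between devices 0 and 1 and then performs a direct transfer between arrays allocated on each device. The array names and sizes are illustrative, and passing an element count to cudaMemcpyPeer is assumed here to follow the convention of CUDA Fortran's other typed memcpy interfaces.

program p2pSketch
  use cudafor
  implicit none
  integer, parameter :: n = 1024
  real, device, allocatable :: a0_d(:), a1_d(:)
  real :: a(n)
  integer :: nDevices, canAccess, istat

  istat = cudaGetDeviceCount(nDevices)
  if (nDevices < 2) stop 'This sketch requires at least two GPUs'

  ! can device 0 access device 1's memory directly?
  istat = cudaDeviceCanAccessPeer(canAccess, 0, 1)
  if (canAccess /= 1) stop 'Peer access not available between devices 0 and 1'

  ! allocate an array on each device
  istat = cudaSetDevice(0)
  allocate(a0_d(n))
  istat = cudaSetDevice(1)
  allocate(a1_d(n))

  ! enable peer access in both directions (the flags argument must be 0)
  istat = cudaSetDevice(0)
  istat = cudaDeviceEnablePeerAccess(1, 0)
  istat = cudaSetDevice(1)
  istat = cudaDeviceEnablePeerAccess(0, 0)

  ! initialize on device 0 and transfer directly to device 1
  istat = cudaSetDevice(0)
  a0_d = 1.0
  ! count given in elements (assumed, matching the typed memcpy interfaces)
  istat = cudaMemcpyPeer(a1_d, 1, a0_d, 0, n)

  ! bring the result back to the host from device 1 and check it
  istat = cudaSetDevice(1)
  a = a1_d
  print *, 'Max error: ', maxval(abs(a - 1.0))

  deallocate(a0_d, a1_d)
end program p2pSketch

For the second configuration, a similarly minimal sketch assigns one GPU to each MPI rank. The round-robin modulo mapping and the program name are assumptions rather than the chapter's code; production codes typically base the assignment on a rank local to each node.

program mpiDeviceMap
  use cudafor
  use mpi
  implicit none
  integer :: rank, nProcs, nDevices, istat, ierr

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nProcs, ierr)

  ! map each rank to a device before any CUDA allocations or kernel launches
  istat = cudaGetDeviceCount(nDevices)
  istat = cudaSetDevice(mod(rank, nDevices))

  print "('Rank ', i0, ' of ', i0, ' using device ', i0)", &
       rank, nProcs, mod(rank, nDevices)

  call MPI_Finalize(ierr)
end program mpiDeviceMap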
