Chapter 12

Accelerating CULA Linear Algebra Routines with Hybrid GPU and Multicore Computing

John Humphrey, Daniel Price, Kyle Spagnoli and Eric Kelmelis

The LU decomposition is a popular linear algebra technique with applications such as the solution of systems of linear equations and calculation of matrix inverses and determinants. Central processing unit (CPU) versions of this routine exhibit very high performance, making the port to a graphics processing unit (GPU) a challenging prospect. This chapter discusses the implementation of LU decomposition in our CULA library for linear algebra on the GPU, describing the steps necessary for achieving significant speed-ups over the CPU.
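To make the applications named above concrete, here is a minimal pure-Python sketch of LU decomposition with partial pivoting, followed by its use to solve a linear system and compute a determinant. It is an illustrative CPU reference only, not the CULA implementation or its API; the function names `lu_decompose`, `lu_solve`, and `lu_det` are hypothetical, and partial (row) pivoting is assumed as the pivoting strategy.

```python
def lu_decompose(a):
    """Factor a square matrix (list of rows) so that P*A = L*U.

    Returns (lu, perm, sign): lu holds U on and above the diagonal and the
    multipliers of L below it (L has an implicit unit diagonal); perm[i] is
    the original index of the row now in position i; sign is +1 or -1
    depending on the parity of the row swaps.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]  # work on a copy
    perm = list(range(n))
    sign = 1
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k at or below row k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
            sign = -sign
        # Eliminate column k below the pivot and update the trailing submatrix.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm, sign


def lu_solve(lu, perm, b):
    """Solve A*x = b using the factors returned by lu_decompose."""
    n = len(lu)
    # Forward substitution: L*y = P*b.
    y = [float(b[perm[i]]) for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    # Back substitution: U*x = y.
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def lu_det(lu, sign):
    """Determinant of A: product of U's diagonal times the swap parity."""
    det = float(sign)
    for i in range(len(lu)):
        det *= lu[i][i]
    return det


# Example data chosen for illustration.
A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
b = [4, 10, 24]
lu, perm, sign = lu_decompose(A)
x = lu_solve(lu, perm, b)
det = lu_det(lu, sign)
```

Once the factors are available, each additional right-hand side costs only two triangular solves, and a matrix inverse can be obtained by solving against the columns of the identity. This reuse is why the factorization itself is the part worth accelerating.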

12.1 Introduction, Problem Statement, and Context

The modern GPU ...
