Accelerating CULA Linear Algebra Routines with Hybrid GPU and Multicore Computing
John Humphrey, Daniel Price, Kyle Spagnoli and Eric Kelmelis
The LU decomposition is a popular linear algebra technique with applications such as the solution of systems of linear equations and the calculation of matrix inverses and determinants. Central processing unit (CPU) versions of this routine exhibit very high performance, making the port to a graphics processing unit (GPU) a challenging prospect. This chapter discusses the implementation of LU decomposition in our CULA library for linear algebra on the GPU and describes the steps necessary for achieving significant speed-ups over the CPU.
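To make the applications named above concrete, the following is a minimal CPU-only sketch of LU decomposition with partial pivoting, used to solve a linear system and to compute a determinant. It is a plain reference illustration in pure Python, not the CULA implementation and not GPU code; the function names `lu_decompose`, `lu_solve`, and `lu_det` are hypothetical and chosen only for this example.

```python
def lu_decompose(a):
    """Factor a square matrix so that P*A = L*U (partial pivoting).

    Returns (lu, perm, sign): lu stores the multipliers of the unit lower
    triangle L below the diagonal and U on and above it, perm records the
    row order, and sign is +1.0 or -1.0 depending on the number of row swaps.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]  # work on a copy
    perm = list(range(n))
    sign = 1.0
    for k in range(n):
        # Choose the largest-magnitude entry in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
            sign = -sign
        # Eliminate below the pivot and update the trailing submatrix.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm, sign


def lu_solve(lu, perm, b):
    """Solve A*x = b given the factors returned by lu_decompose."""
    n = len(lu)
    # Forward substitution: L*y = P*b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Back substitution: U*x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / lu[i][i]
    return x


def lu_det(lu, sign):
    """Determinant of A: the swap sign times the product of U's diagonal."""
    det = sign
    for i in range(len(lu)):
        det *= lu[i][i]
    return det
```

For example, factoring `[[2, 1], [1, 3]]` once lets the same factors serve both `lu_solve` for any right-hand side and `lu_det`. A matrix inverse can be obtained the same way by solving against each column of the identity.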
The modern GPU ...