Eduardo D’Azevedo*; Ki Sing Chan†; Shi-Quan Su‡; Kwai Wong‡
* Oak Ridge National Laboratory, United States
† Chinese University of Hong Kong, Hong Kong
‡ University of Tennessee, United States
This chapter documents the implementation of a parallel, distributed-memory, out-of-core (OOC) solver for performing LU and Cholesky factorizations of a large dense matrix on clusters equipped with Intel® Xeon Phi™ coprocessors. The OOC solver takes advantage of NVIDIA graphics processing units (GPUs) or Intel Xeon Phi coprocessors (MIC) and allows problems larger than device memory to be solved. The OOC solver is built to be compatible with the format of the ScaLAPACK software library, making ...
From High Performance Parallelism Pearls Volume One.