Designing Scientific Applications on GPUs

by Raphael Couturier
November 2013
Content level: Intermediate to advanced
498 pages
17h 6m
English
Chapman and Hall/CRC
Content preview from Designing Scientific Applications on GPUs
Solving large sparse linear systems for integer factorization on GPUs
Sliced COO subformats   Small        Medium       Large
Memory sharing          No sharing   Among warp   Among block
Access method           Direct       Atomic XOR   Atomic XOR
Bank conflict           No           No           Yes
# Rows per slice        12           192          6144

TABLE 20.2. Sliced COO subformat comparison (# rows per slice is based on n = 64).
is no bank accessed by more than one thread. Thus, there is no bank conflict.
A ⊕-reduction (XOR reduction) operation on shared memory is required to
combine the partial results from each thread.
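The reduction step can be illustrated with a small CPU-side sketch. It assumes the partial results are combined with bitwise XOR (addition over GF(2), consistent with the Atomic XOR access method in Table 20.2) and that each thread holds one integer word. The function name `xor_tree_reduce` and the list standing in for shared memory are hypothetical and are not taken from the book.

```python
from functools import reduce
from operator import xor


def xor_tree_reduce(partials):
    """Combine per-thread partial words with a pairwise (tree) XOR reduction.

    `partials` plays the role of a shared-memory buffer holding one
    partial result per thread (an assumption made for this sketch).
    """
    buf = list(partials)
    if not buf:
        return 0
    # Pad to a power of two with zeros; 0 is the identity element for XOR.
    size = 1
    while size < len(buf):
        size *= 2
    buf.extend([0] * (size - len(buf)))

    stride = size // 2
    while stride > 0:
        # On a GPU, the iterations of this inner loop would run in parallel,
        # one per thread, with a barrier between successive strides.
        for tid in range(stride):
            buf[tid] ^= buf[tid + stride]
        stride //= 2
    return buf[0]


partials = [0b1010, 0b0110, 0b1111, 0b0001]
result = xor_tree_reduce(partials)
# The tree reduction must agree with a plain sequential XOR fold.
assert result == reduce(xor, partials)
```

The tree form needs a number of steps logarithmic in the number of threads, which is why it is the usual choice for combining per-thread values held in shared memory.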
The maximum number of rows per slice is calculated as (size of shared
memory per SM in bits) / (number of threads per block × blocking factor).
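As a worked instance of this formula, the sketch below plugs in illustrative values. The 48 KiB of shared memory per SM and the 512 threads per block are assumptions, since this excerpt does not state the hardware parameters. The blocking factor of 64 is the n = 64 from the caption of Table 20.2. The function name `max_rows_per_slice` is hypothetical.

```python
def max_rows_per_slice(shared_mem_bytes, threads_per_block, blocking_factor):
    """Rows per slice = shared memory in bits / (threads per block * blocking factor)."""
    shared_mem_bits = shared_mem_bytes * 8
    return shared_mem_bits // (threads_per_block * blocking_factor)


SHARED_MEM_BYTES = 48 * 1024   # assumption: 48 KiB of shared memory per SM
THREADS_PER_BLOCK = 512        # assumption: not stated in the excerpt
BLOCKING_FACTOR = 64           # n = 64, from the caption of Table 20.2

rows = max_rows_per_slice(SHARED_MEM_BYTES, THREADS_PER_BLOCK, BLOCKING_FACTOR)
print(rows)  # 12
```

Under these assumed values the result equals the 12 rows per slice listed for the Small subformat in Table 20.2, although the excerpt does not confirm which parameters were used to build the table.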

Publisher Resources

ISBN: 9781466571648