
Linear programming on a GPU: a case study 235
E is merely the identity matrix having the `
th
column replaced by the vector
η. The update of the matrix B
−1
can be rewritten as
ˆ
B
−1
ij
= B
−1
ij
1 −
d
i
d
`
, ∀i 6= `,
ˆ
B
−1
`j
=
B
−1
`j
d
`
As shown in Listing 10.2, each block of the kernel processes a single column
while each thread may compute multiple elements of a column. This organi-
zation ensures that global memory accesses are coalescent since the matrix B
is stored column-wise.
Listing 10.2. basis update
extern s h a r e d v o l a t i l e double s da t a [ ] ;
g l o b a l void
u pd a t e B as i s K e r ne l ( in t m, u i n t l , double d l , double ∗B,
u i n t pi