
46 Designing Scientific Applications on GPUs
Listing 4.5. Kernel for a 5×5 median filter processing two output pixel values per
thread by a combined forgetful selection
__global__ void kernel_median5_2pix(short *output,
                                    int i_dim, int j_dim)
{
    int j = __mul24(__mul24(blockIdx.x, blockDim.x) + threadIdx.x, 2);
    int i = __mul24(blockIdx.y, blockDim.y) + threadIdx.y;
    int a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13; // left window
    int b7, b8, b9, b10, b11, b12, b13;                             // right window
    // first 14 common pixels
    a0 = tex2D(tex_img_in