Chapter 5. Data Parallel Programming with Repa
The techniques we’ve seen in the previous chapters are great for
parallelizing code that uses ordinary Haskell data structures like
lists and Maps, but they don’t work as well for data-parallel
algorithms over large arrays. That’s because large-scale array
computations demand very good raw sequential performance, which we can
get only by operating on arrays of unboxed data. We can’t use
Strategies to parallelize operations over unboxed arrays, because Strategies
rely on lazy data structures (boxed arrays would be suitable, but not
unboxed arrays). Par doesn’t work well here either,
because in Par the data is passed in IVars.
In this chapter, we’re going to see how to write efficient numerical array computations in Haskell and run them in parallel. The library we’re going to use is called Repa, which stands for REgular PArallel arrays.[19] The library provides a range of efficient operations for creating arrays and operating on arrays in parallel.
The Repa package is available on Hackage. If you followed the
instructions for installing the sample code dependencies earlier, then
you should already have it; if not, you can install it with cabal install:
$ cabal install repa
In this chapter, I’m going to use GHCi a lot to illustrate the behavior
of Repa; trying things out in GHCi is a great way to become familiar
with the types and operations that Repa provides. Because Repa
provides many operations with the same names as Prelude functions ...