book

Parallel and Concurrent Programming in Haskell

Name: Parallel and Concurrent Programming in Haskell
Author: Simon Marlow
ISBN: 9781449335908

by Simon Marlow

July 2013

Intermediate to advanced

322 pages

8h 43m

English

O'Reilly Media, Inc.

Read now

Unlock full access

Parallel and Concurrent Programming in Haskell
Preface
AudienceHow to Read This BookConventions Used in This BookUsing Sample CodeSafari® Books OnlineHow to Contact UsAcknowledgments
1. Introduction
Terminology: Parallelism and ConcurrencyTools and ResourcesSample Code
I. Parallel Haskell
2. Basic Parallelism: The Eval Monad
Lazy Evaluation and Weak Head Normal FormThe Eval Monad, rpar, and rseqExample: Parallelizing a Sudoku SolverDeepseq
3. Evaluation Strategies
Parameterized StrategiesA Strategy for Evaluating a List in ParallelExample: The K-Means ProblemParallelizing K-MeansPerformance and AnalysisVisualizing Spark ActivityGranularityGC’d Sparks and Speculative ParallelismParallelizing Lazy Streams with parBufferChunking StrategiesThe Identity Property
4. Dataflow Parallelism: The Par Monad
Example: Shortest Paths in a GraphPipeline ParallelismRate-Limiting the ProducerLimitations of Pipeline ParallelismExample: A Conference TimetableAdding ParallelismExample: A Parallel Type InferencerUsing Different SchedulersThe Par Monad Compared to Strategies
5. Data Parallel Programming with Repa
Arrays, Shapes, and IndicesOperations on ArraysExample: Computing Shortest PathsParallelizing the ProgramFolding and Shape-PolymorphismExample: Image RotationSummary
6. GPU Programming with Accelerate
OverviewArrays and IndicesRunning a Simple Accelerate ComputationScalar ArraysIndexing ArraysCreating Arrays Inside AccZipping Two ArraysConstantsExample: Shortest PathsRunning on the GPUDebugging the CUDA BackendExample: A Mandelbrot Set Generator
II. Concurrent Haskell

7. Basic Concurrency: Threads and MVars
A Simple Example: RemindersCommunication: MVarsMVar as a Simple Channel: A Logging ServiceMVar as a Container for Shared StateMVar as a Building Block: Unbounded ChannelsFairness
8. Overlapping Input/Output
Exceptions in HaskellError Handling with AsyncMerging
9. Cancellation and Timeouts
Asynchronous ExceptionsMasking Asynchronous ExceptionsThe bracket OperationAsynchronous Exception Safety for ChannelsTimeoutsCatching Asynchronous Exceptionsmask and forkIOAsynchronous Exceptions: Discussion
10. Software Transactional Memory
Running Example: Managing WindowsBlockingBlocking Until Something ChangesMerging with STMAsync RevisitedImplementing Channels with STMMore Operations Are PossibleComposition of Blocking OperationsAsynchronous Exception SafetyAn Alternative Channel ImplementationBounded ChannelsWhat Can We Not Do with STM?PerformanceSummary
11. Higher-Level Concurrency Abstractions
Avoiding Thread LeakageSymmetric Concurrency CombinatorsTimeouts Using raceAdding a Functor InstanceSummary: The Async API
12. Concurrent Network Servers
A Trivial ServerExtending the Simple Server with StateDesign One: One Giant LockDesign Two: One Chan Per Server ThreadDesign Three: Use a Broadcast ChanDesign Four: Use STMThe ImplementationA Chat ServerArchitectureClient DataServer DataThe ServerSetting Up a New ClientRunning the ClientRecap
13. Parallel Programming Using Threads
How to Achieve Parallelism with ConcurrencyExample: Searching for FilesSequential VersionParallel VersionPerformance and ScalingLimiting the Number of Threads with a SemaphoreThe ParIO monad
14. Distributed Programming
The Distributed-Process Family of PackagesDistributed Concurrency or Parallelism?A First Example: PingsProcesses and the Process MonadDefining a Message TypeThe Ping Server ProcessThe Master ProcessThe main FunctionSumming Up the Ping ExampleMulti-Node PingRunning with Multiple Nodes on One MachineRunning on Multiple MachinesTyped ChannelsMerging ChannelsHandling FailureThe Philosophy of Distributed FailureA Distributed Chat ServerData TypesSending MessagesBroadcastingDistributionTesting the ServerFailure and Adding/Removing NodesExercise: A Distributed Key-Value Store
15. Debugging, Tuning, and Interfacing with Foreign Code
Debugging Concurrent ProgramsInspecting the Status of a ThreadEvent Logging and ThreadScopeDetecting DeadlockTuning Concurrent (and Parallel) ProgramsThread Creation and MVar OperationsShared Concurrent Data StructuresRTS Options to TweakConcurrency and the Foreign Function InterfaceThreads and Foreign Out-CallsAsynchronous Exceptions and Foreign CallsThreads and Foreign In-Calls
Index
About the Author
Colophon
Copyright

Content preview from Parallel and Concurrent Programming in Haskell

Chapter 6. GPU Programming with Accelerate

The most powerful processor in your computer may not be the CPU. Modern graphics processing units (GPUs) usually have something on the order of 10 to 100 times more raw compute power than the general-purpose CPU. However, the GPU is a very different beast from the CPU, and we can’t just run ordinary Haskell programs on it. A GPU consists of a large number of parallel processing units, each of which is much less powerful than one core of your CPU, so to unlock the power of a GPU we need a highly parallel workload. Furthermore, the processors of a GPU all run exactly the same code in lockstep, so they are suitable only for data-parallel tasks where the operations to perform on each data item are identical.

In recent years GPUs have become less graphics-specific and more suitable for performing general-purpose parallel processing tasks. However, GPUs are still programmed in a different way from the CPU because they have a different instruction set architecture. A special-purpose compiler is needed to compile code for the GPU, and the source code is normally written in a language that resembles a restricted subset of C. Two such languages are in widespread use: NVidia’s CUDA and OpenCL. These languages are very low-level and expose lots of details about the workings of the GPU, such as how and when to move data between the CPU’s memory and the GPU’s memory.

Clearly, we would like to be able to make use of the vast computing power of the GPU from ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

O’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.

Julian F.

Head of Cybersecurity

I wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.

Addison B.

Field Engineer

I’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.

Amir M.

Data Platform Tech Lead

I'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.

Mark W.

Embedded Software Engineer

Publisher Resources

ISBN: 9781449335939Errata Page

Cloud Computing

Data Engineering

Data Science

AI & ML

Programming Languages

Software Architecture

IT/Ops

Security

Design

Business

Soft Skills

Parallel and Concurrent Programming in Haskell

by Simon Marlow

Chapter 6. GPU Programming with Accelerate

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.