Heterogeneous Computing with OpenCL 2.0

Book description

Heterogeneous Computing with OpenCL 2.0 teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs). This fully-revised edition includes the latest enhancements in OpenCL 2.0 including:

  • Shared virtual memory to increase programming flexibility and reduce data transfers that consume resources
  • Dynamic parallelism which reduces processor load and avoids bottlenecks
  • Improved imaging support and integration with OpenGL

Designed to work on multiple platforms, OpenCL will help you more effectively program for a heterogeneous future. Written by leaders in the parallel computing and OpenCL communities, this book explores memory spaces, optimization techniques, extensions, debugging, and profiling. Multiple case studies and examples illustrate high-performance algorithms, the distribution of work across heterogeneous systems, and embedded domain-specific languages, giving you hands-on OpenCL experience with a range of fundamental parallel algorithms.

  • Updated content to cover the latest developments in OpenCL 2.0, including improvements in memory handling, parallelism, and imaging support
  • Explanations of principles and strategies to learn parallel programming with OpenCL, from understanding the abstraction models to thoroughly testing and debugging complete applications
  • Example code covering image analytics, web plugins, particle simulations, video editing, performance optimization, and more

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. List of Figures
  6. List of Tables
  7. Foreword
  8. Acknowledgments
  9. Chapter 1: Introduction
    1. Abstract
    2. 1.1 Introduction to Heterogeneous Computing
    3. 1.2 The Goals of This Book
    4. 1.3 Thinking Parallel
    5. 1.4 Concurrency and Parallel Programming Models
    6. 1.5 Threads and Shared Memory
    7. 1.6 Message-Passing Communication
    8. 1.7 Different Grains of Parallelism
    9. 1.8 Heterogeneous Computing with OpenCL
    10. 1.9 Book Structure
  10. Chapter 2: Device architectures
    1. Abstract
    2. 2.1 Introduction
    3. 2.2 Hardware Trade-offs
    4. 2.3 The Architectural Design Space
    5. 2.4 Summary
  11. Chapter 3: Introduction to OpenCL
    1. Abstract
    2. 3.1 Introduction
    3. 3.2 The OpenCL Platform Model
    4. 3.3 The OpenCL Execution Model
    5. 3.4 Kernels and the OpenCL Programming Model
    6. 3.5 OpenCL Memory Model
    7. 3.6 The OpenCL Runtime with an Example
    8. 3.7 Vector Addition Using an OpenCL C++ Wrapper
    9. 3.8 OpenCL for CUDA Programmers
    10. 3.9 Summary
  12. Chapter 4: Examples
    1. Abstract
    2. 4.1 OpenCL Examples
    3. 4.2 Histogram
    4. 4.3 Image Rotation
    5. 4.4 Image Convolution
    6. 4.5 Producer-Consumer
    7. 4.6 Utility Functions
    8. 4.7 Summary
  13. Chapter 5: OpenCL runtime and concurrency model
    1. Abstract
    2. 5.1 Commands and the Queuing Model
    3. 5.2 Multiple Command-Queues
    4. 5.3 The Kernel Execution Domain: Work-Items, Work-Groups, and NDRanges
    5. 5.4 Native and Built-In Kernels
    6. 5.5 Device-Side Queuing
    7. 5.6 Summary
  14. Chapter 6: OpenCL host-side memory model
    1. Abstract
    2. 6.1 Memory Objects
    3. 6.2 Memory Management
    4. 6.3 Shared Virtual Memory
    5. 6.4 Summary
  15. Chapter 7: OpenCL device-side memory model
    1. Abstract
    2. 7.1 Synchronization and Communication
    3. 7.2 Global Memory
    4. 7.3 Constant Memory
    5. 7.4 Local Memory
    6. 7.5 Private Memory
    7. 7.6 Generic Address Space
    8. 7.7 Memory Ordering
    9. 7.8 Summary
  16. Chapter 8: Dissecting OpenCL on a heterogeneous system
    1. Abstract
    2. 8.1 OpenCL on an AMD FX-8350 CPU
    3. 8.2 OpenCL on the AMD Radeon R9 290X GPU
    4. 8.3 Memory Performance Considerations in OpenCL
    5. 8.4 Summary
  17. Chapter 9: Case study: Image clustering
    1. Abstract
    2. 9.1 Introduction
    3. 9.2 The Feature Histogram on the CPU
    4. 9.3 OpenCL Implementation
    5. 9.4 Performance Analysis
    6. 9.5 Conclusion
  18. Chapter 10: OpenCL profiling and debugging
    1. Abstract
    2. 10.1 Introduction
    3. 10.2 Profiling OpenCL Code Using Events
    4. 10.3 AMD CodeXL
    5. 10.4 Profiling Using CodeXL
    6. 10.5 Analyzing Kernels Using CodeXL
    7. 10.6 Debugging OpenCL Kernels Using CodeXL
    8. 10.7 Debugging Using printf
    9. 10.8 Summary
  19. Chapter 11: Mapping high-level programming languages to OpenCL 2.0: A compiler writer’s perspective
    1. Abstract
    2. 11.1 Introduction
    3. 11.2 A Brief Introduction to C++ AMP
    4. 11.3 OpenCL 2.0 as a Compiler Target
    5. 11.4 Mapping Key C++ AMP Constructs to OpenCL
    6. 11.5 C++ AMP Compilation Flow
    7. 11.6 Compiled C++ AMP Code
    8. 11.7 How Shared Virtual Memory in OpenCL 2.0 Fits in
    9. 11.8 Compiler Support for Tiling in C++ AMP
    10. 11.9 Address Space Deduction
    11. 11.10 Data Movement Optimization
    12. 11.11 Binomial Options: A Full Example
    13. 11.12 Preliminary Results
    14. 11.13 Conclusion
  20. Chapter 12: WebCL: Enabling OpenCL acceleration of Web applications
    1. Abstract
    2. 12.1 Introduction
    3. 12.2 Programming with WebCL
    4. 12.3 Synchronization
    5. 12.4 Interoperability with WebGL
    6. 12.5 Example Application
    7. 12.6 Security Enhancement
    8. 12.7 WebCL on the Server
    9. 12.8 Status and Future of WebCL
    10. Works Cited
  21. Chapter 13: Foreign lands: Plugging OpenCL in
    1. Abstract
    2. 13.1 Introduction
    3. 13.2 Beyond C and C++
    4. 13.3 Haskell OpenCL
    5. 13.4 Summary
  22. Index

Product information

  • Title: Heterogeneous Computing with OpenCL 2.0
  • Author(s): David R. Kaeli, Perhaad Mistry, Dana Schaa, Dong Ping Zhang
  • Release date: June 2015
  • Publisher(s): Morgan Kaufmann
  • ISBN: 9780128016497