August 2019
How do these differences manifest in the actual design of the processor? This diagram, taken from NVIDIA's own CUDA documentation, illustrates them:

Control and cache units are reduced, while the number of cores, or ALUs, increases significantly. For highly parallel workloads, this results in a performance improvement of an order of magnitude or more. The caveat is that GPU efficiency is far from perfect with respect to memory, compute, and power. This is why a number of companies are racing to design a processor for DNN workloads from the ground up, to optimize the ratio of cache units/ALUs, ...
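To make the idea of many simple ALUs concrete, here is a minimal CPU-side sketch in Go of the kind of data-parallel work that benefits from this design: the same addition applied independently to every element of two vectors. This is an analogy rather than GPU code. Goroutines stand in for hardware lanes, and `parallelAdd` and its `workers` parameter are hypothetical names chosen for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelAdd computes out[i] = a[i] + b[i] by splitting the index range
// into contiguous chunks and handing each chunk to its own goroutine.
// Every element is independent of the others, which is what lets a
// processor with many simple ALUs work on them at the same time.
// workers is a hypothetical tuning knob, not a hardware parameter.
func parallelAdd(a, b []float32, workers int) []float32 {
	n := len(a)
	if len(b) < n {
		n = len(b) // only add over the shared length
	}
	out := make([]float32, n)
	if workers < 1 {
		workers = 1
	}
	chunk := (n + workers - 1) / workers // ceiling division

	var wg sync.WaitGroup
	for start := 0; start < n; start += chunk {
		end := start + chunk
		if end > n {
			end = n
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			// Each goroutine writes a disjoint slice range, so no locking is needed.
			for i := lo; i < hi; i++ {
				out[i] = a[i] + b[i]
			}
		}(start, end)
	}
	wg.Wait()
	return out
}

func main() {
	a := []float32{1, 2, 3, 4, 5}
	b := []float32{10, 20, 30, 40, 50}
	fmt.Println(parallelAdd(a, b, 3))
}
```

On a CPU, the number of useful workers is limited to a handful of cores. A GPU applies the same pattern across thousands of lightweight threads, which is where the extra ALUs pay off.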