Single Conv Layer Computation
Let's first discuss what the conv layer computes intuitively. The conv layer's parameters consist of a set of learnable filters (also called kernels). Each filter is small spatially (along width and height), but extends through the full depth of the input volume (image). A typical filter on the first layer of a ConvNet might have a size of 5 x 5 x 3 (that is, five pixels in width and height, and three in depth, because color images have three channels). During the forward pass, each filter slides (or convolves) across the width and height of the input volume, and at each position we compute the dot product between the entries of the filter and the patch of the input it covers. As the filter slides over the width and height of the input volume, ...
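To make the sliding dot product concrete, here is a minimal NumPy sketch of one filter passing over one input volume. It is an illustration rather than a production implementation: the function name `conv_single_filter`, the 32 x 32 x 3 input size, the stride of 1, and the absence of padding and of a bias term are all assumptions made for this example.

```python
import numpy as np

def conv_single_filter(volume, filt):
    """Slide one filter over an (H, W, C) volume.

    Assumes stride 1, no padding, and no bias (hypothetical choices
    made to keep the example short).
    """
    H, W, C = volume.shape
    fh, fw, fc = filt.shape
    # The filter must extend through the full depth of the input.
    assert fc == C, "filter depth must match input depth"

    out = np.zeros((H - fh + 1, W - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Patch of the input currently covered by the filter.
            patch = volume[i:i + fh, j:j + fw, :]
            # Dot product between the filter entries and the patch.
            out[i, j] = np.sum(patch * filt)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))  # hypothetical 32 x 32 RGB input
filt = rng.standard_normal((5, 5, 3))     # one 5 x 5 x 3 filter
activation_map = conv_single_filter(image, filt)
print(activation_map.shape)  # (28, 28)
```

Each filter yields a single two-dimensional output, because the depth dimension is summed away inside the dot product. With a 5 x 5 filter, stride 1, and no padding, a 32 x 32 input gives 32 - 5 + 1 = 28 positions along each spatial axis.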