March 2020
Usually, a CNN has millions of parameters. Let's make a rough estimate to find out where all of those parameters come from.
Suppose we have a 10-layer network and each layer has 100 filters of size 3 x 3. These numbers are quite low; networks that perform well usually have dozens of layers and hundreds of filters in each layer. In our case, each filter has a depth of 100, because it spans all 100 feature maps produced by the previous layer.
Hence, each filter has 3 x 3 x 100 = 900 parameters (excluding biases, of which there are 100 per layer), which results in 900 x 100 = 90,000 parameters for each layer and, therefore, about 900,000 parameters for the complete network. Learning so many parameters from scratch without overfitting would require quite a large annotated dataset. A question ...
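The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper `conv_layer_params` and the constant names are hypothetical, and it assumes, as the estimate does, that every layer (including the first) receives 100 input channels.

```python
def conv_layer_params(kernel_h, kernel_w, in_channels, out_channels, bias=False):
    """Count the trainable parameters of one convolutional layer.

    Hypothetical helper for illustration: each of the out_channels filters
    holds kernel_h * kernel_w * in_channels weights, plus one optional bias.
    """
    weights = kernel_h * kernel_w * in_channels * out_channels
    biases = out_channels if bias else 0
    return weights + biases


# Values taken from the estimate in the text.
NUM_LAYERS = 10   # layers in the network
FILTERS = 100     # filters per layer (also the depth of each filter)
KERNEL = 3        # spatial filter size, 3 x 3

# Assumption: every layer, including the first, sees 100 input channels.
per_filter = KERNEL * KERNEL * FILTERS
per_layer = conv_layer_params(KERNEL, KERNEL, FILTERS, FILTERS)
total = NUM_LAYERS * per_layer
total_with_bias = NUM_LAYERS * conv_layer_params(
    KERNEL, KERNEL, FILTERS, FILTERS, bias=True
)

print(per_filter)       # 900
print(per_layer)        # 90000
print(total)            # 900000
print(total_with_bias)  # 901000
```

Including the biases adds only 1,000 parameters, which is why the estimate can safely leave them out.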