Reducing precision
Another simple approach to reducing the size of the network is to convert the weights from the double or float data type to one with a smaller memory footprint, or to a fixed-precision representation. This barely affects the quality of predictions, but can shrink the model by up to a factor of four.
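As an illustration, here is a minimal NumPy sketch of this idea (the weight matrix below is synthetic, not taken from an actual trained model): casting double-precision weights down to single and half precision halves and quarters the memory, respectively, while introducing only a small rounding error.

```python
import numpy as np

# Stand-in for a trained weight matrix; NumPy defaults to float64 (double).
rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256))

# Cast to single precision (float32): half the memory.
weights_f32 = weights.astype(np.float32)

# Cast to half precision (float16): a quarter of the memory.
weights_f16 = weights.astype(np.float16)

print(weights.nbytes, weights_f32.nbytes, weights_f16.nbytes)

# Rounding error introduced by each cast.
err_f32 = np.max(np.abs(weights - weights_f32))
err_f16 = np.max(np.abs(weights - weights_f16.astype(np.float64)))
print(err_f32, err_f16)
```

The float16 cast is the coarsest floating-point option here; whether the resulting error is acceptable depends on the network, so predictions should be re-validated after the conversion.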
Precision reduction is applied only after training is complete. Earlier attempts to train networks directly with lower-precision data types ran into difficulties with backpropagation and gradient computation.
Once the network is trained, we can immediately replace double with float, or better still, with fixed precision. For example, in the trained neural network, you have ...
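One common way to realize fixed precision is affine 8-bit quantization: map the observed weight range onto the 256 levels of an unsigned byte. The sketch below (with a synthetic weight matrix and a simple min/max scaling scheme, both assumptions for illustration) stores weights as `uint8` plus a per-tensor scale and offset, a four-fold reduction relative to float32.

```python
import numpy as np

rng = np.random.default_rng(1)
weights = rng.standard_normal((128, 128)).astype(np.float32)

# Map [min, max] of the weights onto the integer range [0, 255].
scale = (weights.max() - weights.min()) / 255.0
zero_point = weights.min()

# Quantize: one byte per weight instead of four.
q = np.round((weights - zero_point) / scale).astype(np.uint8)

# Dequantize on the fly at inference time.
deq = q.astype(np.float32) * scale + zero_point

print(weights.nbytes, q.nbytes)             # 4x smaller storage
print(np.max(np.abs(weights - deq)))        # error bounded by scale / 2
```

The maximum error per weight is half the quantization step (`scale / 2`), so the fidelity of this scheme depends directly on how wide the weight range is.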