Following are some general guidelines for fine-tuning:
- Replace the last fully connected layer. A common practice of fine tuning is to replace the last one or two fully connected layers (softmax) with a new structure of layers (softmax) that are specific to your own problem. For example, in AlexNet and ConvNet, the output has 1000 classes. One can replace the last layer, which has 1000 nodes, with the the number of desired classes.
- Use a smaller learning rate. As most parameters of the network have been trained and we are assuming these layers contain useful, universal pattern information from the data, the goal is not to over-tune these layers but rather train the layers—for example, the new, last few layers to accommodate ...