In the example used in the last section, we tried to train our network with one hidden layer of size 100. Let's play around with that and see what happens.
First, we will increase the size of the hidden layer from 100 to 200:
nn = MLPClassifier(hidden_layer_sizes=(200,), max_iter=20, solver='sgd', learning_rate_init=0.001, verbose=True)
The network performance is as follows:
Network Performance: 0.816800
We see that there is no significant improvement in the result. Now, let's try increasing the number of hidden layers instead. We will train our network with three hidden layers of size 100 each:
nn = MLPClassifier(hidden_layer_sizes=(100, 100, 100), max_iter=20, solver='sgd', learning_rate_init=0.001, verbose=True)
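To compare these configurations side by side, the experiments can be wrapped in a small loop. The following is only a sketch under stated assumptions: it uses the small digits dataset bundled with scikit-learn as a stand-in for the data from the last section, the helper name evaluate and the fixed random_state are introduced here for illustration, and the scores it prints will not match the figures quoted above.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data (assumption): the 8x8 digits set that ships with scikit-learn,
# not the dataset used in the previous section.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this dataset range from 0 to 16
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def evaluate(hidden_layer_sizes):
    """Hypothetical helper: train one network and return it with its test accuracy."""
    nn = MLPClassifier(
        hidden_layer_sizes=hidden_layer_sizes,
        max_iter=20,
        solver='sgd',
        learning_rate_init=0.001,
        random_state=0,  # assumption: fixed seed so repeated runs are comparable
    )
    # With only 20 iterations the optimizer usually stops before converging,
    # so the resulting ConvergenceWarning is silenced here.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        nn.fit(X_train, y_train)
    return nn, nn.score(X_test, y_test)


# One hidden layer of 100, one of 200, and three of 100 each.
architectures = [(100,), (200,), (100, 100, 100)]
results = {}
for sizes in architectures:
    nn, score = evaluate(sizes)
    results[sizes] = (nn, score)
    print(f"hidden_layer_sizes={sizes}  Network Performance: {score:f}")
```

Keeping max_iter, solver, and learning_rate_init fixed across runs means that any difference in score comes from the architecture alone, apart from random initialization.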