March 2018
Intermediate to advanced
272 pages
7h 53m
English
I'm going to use a first hidden layer with 512 neurons. That's somewhat smaller than the input vector's 784 elements, but there is no rule that says it has to be. Again, this architecture is just a starting point and isn't necessarily the best one. I'll then step the size down through the second and third hidden layers, as shown in the following code:
x = Dense(512, activation='relu', name="hidden1")(inputs)
x = Dense(256, activation='relu', name="hidden2")(x)
x = Dense(128, activation='relu', name="hidden3")(x)
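To get a feel for what stepping the widths down costs, here is a small plain-Python sketch that counts the trainable parameters in the three hidden layers above. It assumes the 784-element input described in the text and uses the standard count for a fully connected layer: one weight per input-unit pair plus one bias per unit. The helper names `dense_param_count` and `count_params` are made up for this illustration and are not part of Keras.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


def count_params(input_size, layers):
    """Return a dict of parameter counts for a stack of dense layers.

    `layers` is a list of (name, units) pairs; each layer's input size
    is the previous layer's unit count.
    """
    counts = {}
    fan_in = input_size
    for name, units in layers:
        counts[name] = dense_param_count(fan_in, units)
        fan_in = units
    return counts


# Sizes taken from the text: 784 inputs, then 512, 256, and 128 neurons.
input_size = 784
hidden_sizes = [("hidden1", 512), ("hidden2", 256), ("hidden3", 128)]

counts = count_params(input_size, hidden_sizes)
for name, n in counts.items():
    print(f"{name}: {n:,} parameters")
print(f"total: {sum(counts.values()):,} parameters")
```

Most of the roughly 566,000 hidden-layer parameters sit in the first layer (401,920 of them), because it connects to the widest input. That is one practical reason to treat the first layer's width as a tunable choice and not as a fixed rule.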