Network construction, training, and saving the model

As in the Titanic survival prediction section, everything starts with a MultiLayerConfiguration, which organizes the layers and their hyperparameters. Our LSTM network consists of five layers: the input layer is followed by three LSTM layers, and the last layer is an RNN layer, which also serves as the output layer.

More technically, the first layer is the input layer, followed by three LSTM layers. For the LSTM layers, we initialize the weights using Xavier, we use stochastic gradient descent (SGD) as the optimization algorithm with the Adam updater, and we use Tanh as the activation function. Finally, the RNN output layer has a softmax activation function that gives us a probability distribution over the output classes.
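The configuration described above can be sketched in DL4J roughly as follows. This is a minimal illustration, not the book's exact listing: the layer sizes, learning rate, seed, and file name are assumed placeholder values, and the loss function is assumed to be multi-class cross-entropy (the usual pairing with a softmax output).

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

import java.io.File;

public class LstmNetworkSketch {
    public static void main(String[] args) throws Exception {
        int numInputs = 20;       // assumed: number of input features per time step
        int lstmLayerSize = 128;  // assumed: hidden units per LSTM layer
        int numClasses = 2;       // assumed: number of output classes

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(12345)                       // assumed seed, for reproducibility
                .weightInit(WeightInit.XAVIER)     // Xavier weight initialization
                .updater(new Adam(1e-3))           // SGD with the Adam updater
                .list()
                // Three stacked LSTM layers with Tanh activations
                .layer(new LSTM.Builder().nIn(numInputs).nOut(lstmLayerSize)
                        .activation(Activation.TANH).build())
                .layer(new LSTM.Builder().nIn(lstmLayerSize).nOut(lstmLayerSize)
                        .activation(Activation.TANH).build())
                .layer(new LSTM.Builder().nIn(lstmLayerSize).nOut(lstmLayerSize)
                        .activation(Activation.TANH).build())
                // RNN output layer: softmax gives a probability distribution over classes
                .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX)
                        .nIn(lstmLayerSize).nOut(numClasses).build())
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();

        // Training would call model.fit(trainingIterator) here; after training,
        // the model can be persisted to disk (file name is a placeholder):
        ModelSerializer.writeModel(model, new File("lstm_model.zip"), true);
    }
}
```

After `init()`, the network holds four parameterized layers (the "input layer" in the prose corresponds to the `nIn` setting of the first LSTM, not a separate layer object in DL4J). The `true` flag to `writeModel` also saves the updater state so training can be resumed later.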
