January 2018 · Intermediate to advanced · 470 pages · 11h 9m · English
Now that we can recursively select the state with the highest reward under the best policy (see the nextState method in the following code), the Q-learning algorithm can be trained online, for example for options trading.
Once the Q-learning model has been trained on the supplied data, the next state can be predicted by overriding the data transformation method (PipeOperator, that is, |>) with a transformation from a state to a predicted goal state:
override def |> : PartialFunction[QLState[T], Try[QLState[T]]] = {
  case st: QLState[T] if isModel =>
    Try(if (st.isGoal) st else nextState(QLIndexedState[T](st, 0)).state)
}
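As a rough illustration of the idea (not the book's actual QLState/QLIndexedState implementation), the prediction step can be sketched with a simplified state type and a toy Q-value table; SimpleState, qValues, and predict below are hypothetical names introduced only for this sketch:

```scala
import scala.util.Try

object QLPredictSketch {
  // Hypothetical, simplified stand-in for the book's QLState[T]
  final case class SimpleState(id: Int, isGoal: Boolean)

  // Toy Q-value table Q(state, nextState), assumed already learned during training
  val qValues: Map[(Int, Int), Double] = Map(
    (0, 1) -> 0.2, (0, 2) -> 0.8,
    (1, 3) -> 0.5, (2, 3) -> 0.9
  )

  val states: Map[Int, SimpleState] = Map(
    0 -> SimpleState(0, isGoal = false),
    1 -> SimpleState(1, isGoal = false),
    2 -> SimpleState(2, isGoal = false),
    3 -> SimpleState(3, isGoal = true)
  )

  // Greedy transition: pick the successor state with the highest Q-value
  def nextState(st: SimpleState): SimpleState = {
    val successors = qValues.collect { case ((from, to), q) if from == st.id => (to, q) }
    if (successors.isEmpty) st
    else states(successors.maxBy(_._2)._1)
  }

  // Analogue of |>: walk greedily until a goal state (or a fixed point) is reached
  def predict(start: SimpleState, maxSteps: Int = 10): Try[SimpleState] = Try {
    var st = start
    var steps = 0
    while (!st.isGoal && steps < maxSteps) {
      val nxt = nextState(st)
      if (nxt == st) steps = maxSteps // no outgoing transition: stop
      else { st = nxt; steps += 1 }
    }
    st
  }

  def main(args: Array[String]): Unit = {
    // Greedy path 0 -> 2 -> 3 reaches the goal state
    println(QLPredictSketch.predict(states(0)).get.id)
  }
}
```

As in the book's |> operator, prediction only walks the already-trained value table; no learning happens at this stage, and a goal state is returned unchanged.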
I guess that's enough of ...