December 2018 · Beginner to intermediate · 684 pages · 21h 9m · English
To do this, we use the same DDQN agent and neural network architecture that successfully learned to navigate the Lunar Lander environment. We let exploration continue for 500,000 time steps (roughly 2,000 one-year trading periods), decaying ε linearly to 0.1 over that span and then exponentially by a factor of 0.9999 per step thereafter.
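One way to express this two-phase schedule is a small helper that returns ε for a given step; this is a sketch of the schedule described above, not the book's exact implementation (the function name and signature are illustrative):

```python
def epsilon(step, eps_start=1.0, eps_linear_min=0.1,
            linear_steps=500_000, exp_decay=0.9999):
    """Two-phase exploration schedule:
    linear decay from eps_start to eps_linear_min over linear_steps,
    then exponential decay by exp_decay per step."""
    if step <= linear_steps:
        # linear interpolation between eps_start and eps_linear_min
        return eps_start + (eps_linear_min - eps_start) * step / linear_steps
    # exponential decay after the linear phase ends
    return eps_linear_min * exp_decay ** (step - linear_steps)
```

At step 0 this returns 1.0, reaches 0.1 at step 500,000, and decays slowly toward 0 afterwards.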
We can instantiate the environment by using the desired trading costs and ticker:
import gym  # the custom trading-v0 environment must be registered beforehand

trading_environment = gym.make('trading-v0')
trading_environment.env.trading_cost_bps = 1e-3
trading_environment.env.time_cost_bps = 1e-4
trading_environment.env.ticker = 'AAPL'
trading_environment.seed(42)
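Once instantiated, the agent interacts with the environment through the classic Gym reset/step interface. The following sketch runs one episode with random actions; `StubTradingEnv` is a hypothetical stand-in used here only so the loop is self-contained, and the three actions (short/flat/long) mirror the setup described in this chapter:

```python
import random

class StubTradingEnv:
    """Hypothetical stand-in for trading-v0 with the classic Gym API."""
    def __init__(self, n_steps=252):  # one trading year of daily steps
        self.n_steps = n_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]  # placeholder observation

    def step(self, action):
        self.t += 1
        reward = random.gauss(0, 1e-3)         # placeholder per-step return
        done = self.t >= self.n_steps          # episode ends after n_steps
        return [0.0], reward, done, {}

env = StubTradingEnv()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = random.choice([0, 1, 2])          # short / flat / long
    obs, reward, done, info = env.step(action)
    total_reward += reward
```

In training, the random action choice would be replaced by the DDQN agent's ε-greedy policy.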
The following diagram shows the rolling average of agent and market returns over 100 periods on the ...
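The rolling averages shown in such a diagram can be computed with a simple windowed mean; this is an illustrative pure-Python sketch (in practice `pandas.Series.rolling(100).mean()` does the same job):

```python
def rolling_mean(returns, window=100):
    """Trailing mean over a fixed window; the first window-1
    periods have no value, matching a pandas rolling mean."""
    out = []
    for i in range(window - 1, len(returns)):
        out.append(sum(returns[i - window + 1:i + 1]) / window)
    return out
```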