Chapter 10. Concluding Remarks

Time and uncertainty are the central elements that influence financial economic behavior. It is the complexity of their interaction that provides intellectual challenge and excitement to the study of finance. To analyze the effects of this interaction properly often requires sophisticated analytical tools.

Merton (1990)

Reinforcement learning (RL) has undoubtedly become a central approach in machine learning (ML) and AI in general. There are many flavors of the basic algorithmic idea, an overview of which can be found in Sutton and Barto (2018). This book primarily focuses on deep Q-learning (DQL). The fundamental idea of DQL is that the agent learns an action-value function that assigns a value to each feasible state-action combination. The higher the value, the better the action in a given state, so the optimal policy is to choose the highest-valued action. The book also provides, in Chapter 9, an example of a simple actor-critic algorithm, in which the agent learns the action policy separately from the value function. At the core of these algorithms are deep neural networks (DNNs) that are used to approximate optimal action policies and, in the case of actor-critic algorithms, also value functions.
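As a minimal sketch of this idea (not the book's implementation), the following Python snippet builds a small Keras network that maps a state to one estimated value per action and selects the highest-valued action, and it contrasts this with the separate actor and critic networks of an actor-critic setup. The state size, number of actions, layer sizes, and names such as q_network, actor, and critic are assumptions chosen purely for illustration.

    import numpy as np
    import tensorflow as tf

    n_features = 4  # assumed size of the state representation
    n_actions = 2   # assumed number of feasible actions

    # DQL: a DNN maps a state to one estimated value per action
    q_network = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(24, activation='relu'),
        tf.keras.layers.Dense(24, activation='relu'),
        tf.keras.layers.Dense(n_actions, activation='linear')
    ])
    q_network.compile(optimizer='adam', loss='mse')

    state = np.random.standard_normal((1, n_features))  # illustrative state
    q_values = q_network.predict(state, verbose=0)       # one value per action
    action = int(np.argmax(q_values))                     # highest value = best action

    # Actor-critic: the policy (actor) is kept separate from the value function (critic)
    actor = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(24, activation='relu'),
        tf.keras.layers.Dense(n_actions, activation='softmax')  # action probabilities
    ])
    critic = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(24, activation='relu'),
        tf.keras.layers.Dense(1, activation='linear')  # estimated state value
    ])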

Part I introduces the basics of DQL and provides first, simple applications. Finance as a domain is characterized by limited data availability. A historical time series, say for the price of a stock, is given and fixed at any point in time. This ...
