Reinforcement Learning for Finance

by Yves Hilpisch

October 2024

Intermediate to advanced

214 pages

5h 4m

English

O'Reilly Media, Inc.

Target Audience; Overview of the Book; About the Code in This Book; Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
Bayesian Learning; Tossing a Biased Coin; Rolling a Biased Die; Bayesian Updating; Reinforcement Learning; Major Breakthroughs; Major Building Blocks; Deep Q-Learning; Conclusions; References
Decision Problems; Dynamic Programming; Q-Learning; CartPole as an Example; The Game Environment; A Random Agent; The DQL Agent; Q-Learning Versus Supervised Learning; Conclusions; References
Finance Environment; DQL Agent; Where the Analogy Fails; Limited Data; No Impact; Conclusions; References
Noisy Time Series Data; Simulated Time Series Data; Conclusions; References; DQLAgent Python Class
Simple Example; Financial Example; Kolmogorov-Smirnov Test; Conclusions; References
Prediction Game Revisited; Trading Environment; Trading Agent; Conclusions; References; Finance Environment; DQLAgent Class; Simulation Environment

Delta Hedging; Hedging Environment; Hedging Agent; Conclusions; References; BSM (1973) Formula
Two-Fund Separation; Two-Asset Case; Three-Asset Case; Equally Weighted Portfolio; Conclusions; References; Three-Asset Code
The Model; Model Implementation; Execution Environment; Random Agent; Execution Agent; Conclusions; References
References

Content preview from Reinforcement Learning for Finance

Chapter 2. Deep Q-Learning

Like a human, our agents learn for themselves to achieve successful strategies that lead to the greatest long-term rewards. This paradigm of learning by trial and error, solely from rewards or punishments, is known as reinforcement learning (RL).¹

DeepMind (2016)

The previous chapter introduces deep Q-learning (DQL) as a major algorithm in AI that learns through interaction with an environment. This chapter describes the DQL algorithm in more detail. It uses the CartPole environment from the Gymnasium Python package to illustrate the API-based interaction with gaming environments. It also implements a DQL agent as a self-contained Python class that serves as a blueprint for later DQL agents applied to financial environments.

Before turning to DQL, however, the chapter discusses general decision problems in economics and finance. Dynamic programming is introduced as a solution mechanism for dynamic decision problems. This provides the background for the application of DQL algorithms, which can be viewed as producing approximate solutions to dynamic programming problems.
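To make the idea of dynamic programming as a solution mechanism concrete, the following is a minimal sketch of backward induction for a finite horizon problem. The two-state, two-action setup, the reward and transition tables, the discount factor, and the function name `backward_induction` are all hypothetical illustrations and are not taken from the book.

```python
# Hypothetical toy problem (an assumption, not from the book):
# two states, two actions, a short finite horizon.
STATES = (0, 1)
ACTIONS = (0, 1)
HORIZON = 3
BETA = 0.9  # assumed discount factor

# REWARD[s][a]: immediate reward for taking action a in state s
REWARD = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.0}}

# TRANS[s][a][s_next]: probability of moving from s to s_next under a
TRANS = {
    0: {0: {0: 0.2, 1: 0.8}, 1: {0: 1.0, 1: 0.0}},
    1: {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}},
}


def backward_induction(horizon=HORIZON):
    """Solve the toy problem by stepping backward from the final period."""
    value = {s: 0.0 for s in STATES}  # terminal values are set to zero
    policy = []  # policy[t][s] is the optimal action in period t, state s
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s in STATES:
            # Immediate reward plus discounted expected continuation value
            q = {
                a: REWARD[s][a]
                + BETA * sum(TRANS[s][a][n] * value[n] for n in STATES)
                for a in ACTIONS
            }
            best = max(q, key=q.get)
            new_value[s], decision[s] = q[best], best
        value = new_value
        policy.insert(0, decision)  # earlier periods go to the front
    return value, policy


values, plan = backward_induction()
print(values)  # optimal value per state at the start of the horizon
print(plan)    # optimal action per state for each period
```

Backward induction needs the full reward and transition tables. Q-learning, by contrast, estimates the same kind of action values from sampled interaction alone, which is why it can be read as an approximate solution method for problems of this type.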

“Decision Problems” classifies decision problems in economics and finance according to different characteristics. “Dynamic Programming” focuses on a special type of decision problem: so-called finite horizon Markovian dynamic programming problems. “Q-Learning” outlines the major elements of Q-learning and explains the role of deep neural networks in this context. ...
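As a rough illustration of what the major elements of Q-learning look like in code, the sketch below applies the standard tabular Q-learning update to a lookup table. It is a simplified stand-in rather than the book's DQL agent: in DQL a deep neural network takes the place of the table. The state labels, the action set, and the values for `alpha` and `gamma` are assumptions made for the example.

```python
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical action set


def q_update(q, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update and return the new estimate."""
    # Value of the best action in the next state (zero if the episode ended)
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)  # unseen state-action pairs start at 0.0
first = q_update(q_table, "s0", 1, 1.0, "s1", False)
second = q_update(q_table, "s0", 1, 1.0, "s1", False)
print(first, second)
```

Each call nudges the stored estimate toward the observed reward plus the discounted value of the best next action, so repeated experience of the same transition moves the estimate step by step toward its target.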