How to be an adaptive thinker
Reinforcement learning, one of the foundations of machine learning, supposes learning through trial and error by interacting with an environment. This sounds familiar, right? That is what we humans do all our lives—in pain! Try things, evaluate, and then continue; or try something else.
In real life, you are the agent of your thought process. In a machine learning model, the agent is the function calculating through this trial-and-error process. This thought process in machine learning is the MDP. This form of action-value learning is sometimes called Q.
To master the outcomes of MDP in theory and practice, a three-dimensional method is a prerequisite.
The three-dimensional approach that will make you an artificial ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access