Deep Reinforcement Learning with Python: RLHF for Chatbots and Large Language Models

Index

Action

Action space

Action value functions

Activation function

Actor-critic (AC) approach

A2C

pseudocode

advantage

asynchronous advantage

implementing A2C

Adaptability

9.a-ddpg.ipynb file

Advantage actor-critic (A2C)

actor

implementation

MC approach

pseudocode

Advantage functions

Agent

Agent Environment Cycle (AEC)

Agent learning network

Agent modeling

AlphaGo

branching factor

general approaches

MCTS

neural network training

policies

RL network

SL policy network

standard search tree

Antithetic sampling

Arcade Learning Environment (ALE)

Aroon Indicator

Artificial intelligence (AI)

definition

Generative AI

Artificial neural networks

Asynchronous advantage actor-critic (A3C)

Asynchronous Reinforcement Learning

Asynchronous version

Atari games

actions

Breakout

frameskipping

observation ...

Deep Reinforcement Learning with Python: RLHF for Chatbots and Large Language Models by Nimish Sanghi
