Mãos à Obra: Aprendizado de Máquina com Scikit-Learn e TensorFlow

by Aurélien Géron
March 2019
Intermediate to advanced
576 pages
20h 51m
Portuguese (Portugal, Brazil)
Alta Books
Temporal Difference Learning and Q-Learning
Now let's run the Q-Value Iteration algorithm:
import numpy as np

# T, R and possible_actions are assumed to be defined before this point
# (they are not shown in this excerpt).
Q = np.full((3, 3), -np.inf)  # -inf for impossible actions
for state, actions in enumerate(possible_actions):
    Q[state, actions] = 0.0  # Initial value = 0.0, for all possible actions

discount_rate = 0.95
n_iterations = 100

for iteration in range(n_iterations):
    Q_prev = Q.copy()  # update from the previous iteration's estimates
    for s in range(3):
        for a in possible_actions[s]:
            Q[s, a] = np.sum([
                T[s, a, sp] * (R[s, a, sp] + discount_rate * np.max(Q_prev[sp]))
                for sp in range(3)
            ])
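Read as an equation, the assignment in the inner loop corresponds to the usual Q-Value Iteration update. The notation here is an assumption chosen to mirror the code: \gamma stands for discount_rate, Q_k for Q_prev, and s' for sp.

```latex
Q_{k+1}(s, a) \leftarrow \sum_{s'} T(s, a, s') \left[ R(s, a, s') + \gamma \, \max_{a'} Q_k(s', a') \right]
\quad \text{for all possible pairs } (s, a)
```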
The resulting Q-Values look like this:
>>> Q
array([[ 21.89498982, 20.80024033, 16.86353093],
[ 1.11669335, ...
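Because T, R and possible_actions are not shown in this excerpt, the following is a minimal self-contained sketch of the same loop on a hypothetical two-state MDP. The transition probabilities, the rewards and the helper name q_value_iteration are all invented for illustration and are not the book's example. It also shows one common way to read a greedy policy off the Q-Values with np.argmax.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (illustrative numbers, not from the book).
# T[s, a, sp]: probability of landing in state sp after taking action a in state s.
T = np.array([
    [[0.5, 0.5], [0.0, 1.0]],  # state 0: actions 0 and 1 are possible
    [[1.0, 0.0], [0.0, 0.0]],  # state 1: only action 0 is possible
])
# R[s, a, sp]: reward received for the transition (s, a, sp).
R = np.array([
    [[1.0, 0.0], [0.0, 2.0]],
    [[5.0, 0.0], [0.0, 0.0]],
])
possible_actions = [[0, 1], [0]]


def q_value_iteration(T, R, possible_actions, discount_rate=0.95, n_iterations=100):
    """Same update loop as in the text, wrapped in a (hypothetical) helper."""
    n_states, n_actions = T.shape[0], T.shape[1]
    Q = np.full((n_states, n_actions), -np.inf)  # -inf for impossible actions
    for state, actions in enumerate(possible_actions):
        Q[state, actions] = 0.0  # initial value for all possible actions
    for _ in range(n_iterations):
        Q_prev = Q.copy()
        for s in range(n_states):
            for a in possible_actions[s]:
                Q[s, a] = np.sum([
                    T[s, a, sp] * (R[s, a, sp] + discount_rate * np.max(Q_prev[sp]))
                    for sp in range(n_states)
                ])
    return Q


Q = q_value_iteration(T, R, possible_actions)
# Greedy policy: in each state, pick the action with the highest Q-Value.
# Impossible actions stay at -inf, so argmax never selects them.
policy = np.argmax(Q, axis=1)
print(Q)
print(policy)
```

Keeping impossible actions at -inf is what lets both np.max inside the update and np.argmax afterwards ignore them without any extra masking.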
Publisher Resources

ISBN: 9788550803814