精通機器學習

by Aurélien Géron
April 2020
Intermediate to advanced
816 pages
18h 32m
Chinese
GoTop Information, Inc.
Equation 18-7. Target Q-value
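With this target Q-value, we can run a training step using any gradient descent algorithm. We usually try to minimize the squared error between the estimated Q-value Q(s, a) and the target Q-value (or the Huber loss, to reduce the algorithm's sensitivity to large errors).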
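The training target and the two losses mentioned above can be sketched in plain Python. This is a minimal illustration, not the book's implementation: it assumes the usual form of the target (reward plus the discounted maximum of the next state's estimated Q-values). The function names, the `discount` and `delta` defaults, and the `done` flag for terminal states are all assumptions introduced here.

```python
def target_q_value(reward, next_q_values, discount=0.95, done=False):
    """Assumed target: reward + discount * max over next-state Q-value estimates."""
    if done:
        # Assumption: no future reward is counted once the episode has ended.
        return reward
    return reward + discount * max(next_q_values)


def squared_error(estimate, target):
    """Squared difference between the estimated and the target Q-value."""
    return (estimate - target) ** 2


def huber_loss(estimate, target, delta=1.0):
    """Quadratic for small errors, linear for large ones (less sensitive to outliers)."""
    error = abs(estimate - target)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)


# Made-up numbers, for illustration only.
target = target_q_value(reward=1.0, next_q_values=[0.5, 2.0], discount=0.5)
```

For an error of 3, the squared error is 9 while the Huber loss (with `delta=1.0`) is 2.5, which is why the Huber loss reacts less strongly to occasional large errors.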
That is all there is to the basic Deep Q-Learning algorithm. Next, let's see how to implement it to handle the CartPole environment.
Implementing Deep Q-Learning
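First, we need a Deep Q-Network. In theory, you need a neural network that takes a state-action pair as input and outputs an approximate Q-value. In practice, it is more efficient to use a neural network that takes a state as input and outputs one approximate Q-value for each possible action. The CartPole environment does not require a very complicated neural network; a few hidden layers are enough: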
import gym
from tensorflow import keras

env = gym.make("CartPole-v0")
input_shape = [4]  # == env.observation_space.shape
n_outputs = 2  # == env.action_space.n

# The network maps a state to one approximate Q-value per possible action.
model = keras.models.Sequential([
    keras.layers.Dense(32, activation="elu", input_shape=input_shape),
    keras.layers.Dense(32, activation="elu"),
    keras.layers.Dense(n_outputs)
])
To select an action using this DQN, we pick the action with the largest predicted ...
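The selection step can be sketched as picking the index of the largest network output. The sentence above is cut off in this preview, so treating the predicted quantity as the per-action Q-value estimate is an assumption, as are the helper name `greedy_action` and the made-up numbers. Plain Python lists stand in for the model's output so the example runs without TensorFlow.

```python
def greedy_action(q_values):
    """Return the index of the largest value (the first index wins on ties)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical outputs for the two CartPole actions (made-up numbers).
q_values = [0.12, 0.87]
action = greedy_action(q_values)  # index of the action with the largest estimate
```

With the Keras model defined earlier, the values would presumably come from a call such as `model.predict(state[np.newaxis])[0]`, where `state` is an observation returned by the environment.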

ISBN: 9789865024345