深度學習|內行人的做法

by Josh Patterson, Adam Gibson
January 2019
Beginner to intermediate
576 pages
14h 31m
Chinese
GoTop Information, Inc.
Graph, visualization, and average Q
Figure B-8. Screenshot of webapp-rl4j
Appendix B: RL4J and Reinforcement Learning
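The most important thing is to track the cumulative reward, as shown in Figure B-9. This is one way to check whether the agent is actually getting better. Note carefully that what is shown here is the ε-greedy policy, not the policy derived directly from the Q approximation.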
Figure B-9. Cumulative reward plot
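To make the distinction between the ε-greedy policy and the purely greedy policy concrete, here is a minimal Python sketch. It is not RL4J code: the toy environment `toy_step`, the fixed `q_table`, and the helper names are all hypothetical and chosen only for illustration. It tracks the cumulative reward per episode under both policies.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def run_episode(q_table, step_fn, start_state, epsilon, rng, max_steps=100):
    """Run one episode and return its cumulative (undiscounted) reward."""
    state, total = start_state, 0.0
    for _ in range(max_steps):
        action = epsilon_greedy(q_table[state], epsilon, rng)
        state, reward, done = step_fn(state, action)
        total += reward
        if done:
            break
    return total

# Hypothetical toy chain: action 1 moves right (+1 reward, ends at state 3),
# action 0 ends the episode immediately with no reward.
def toy_step(state, action):
    if action == 0:
        return state, 0.0, True
    next_state = state + 1
    return next_state, 1.0, next_state == 3

rng = random.Random(0)
q_table = {s: [0.0, 1.0] for s in range(4)}  # assumed, already-learned Q-values

# Cumulative reward under the exploring (epsilon-greedy) policy ...
behaviour_returns = [run_episode(q_table, toy_step, 0, 0.3, rng) for _ in range(200)]
# ... and under the policy derived directly from Q (epsilon = 0).
greedy_returns = [run_episode(q_table, toy_step, 0, 0.0, rng) for _ in range(200)]

mean_behaviour = sum(behaviour_returns) / len(behaviour_returns)
mean_greedy = sum(greedy_returns) / len(greedy_returns)
```

In this toy setting the exploring policy collects less reward on average than the greedy one, which is why a cumulative reward curve recorded during training can understate how good the learned Q-values already are.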
You may also want to track the loss (the neural network's score) and the average Q-value, as shown in Figure B-10.
Figure B-10. Score and average Q-value plot
Unlike classical supervised learning, the loss does not necessarily keep decreasing, because learning itself changes the labels!
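One way to see why the labels move is to write out the loss. This is a sketch that assumes the standard one-step Q-learning formulation; the symbols θ, γ, and y are not taken from the text above.

```latex
L(\theta) = \mathbb{E}_{(s,a,r,s')}\Big[\big(y - Q(s,a;\theta)\big)^2\Big],
\qquad
y = r + \gamma \max_{a'} Q(s',a';\theta)
```

The label y depends on the same parameters θ that are being trained, so every update to the network also shifts the target it is regressing toward.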
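When a target network is used, you should see some discontinuities caused by the non-continuous evaluations that come from successive target networks. The loss should decrease relative to a single target network. The average Q-value should converge smoothly to a value proportional to the average expected reward.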
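The effect of a target network on the loss curve can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RL4J's implementation: it uses a small Q table instead of a neural network, and `GAMMA`, `sync_every`, `lr`, and the two-transition dataset are hypothetical values chosen for the example.

```python
import copy

GAMMA = 0.9  # assumed discount factor

def td_target(reward, next_state, done, target_q):
    """Label for one transition, computed from the frozen target table."""
    if done:
        return reward
    return reward + GAMMA * max(target_q[next_state])

def average_q(q):
    """Mean of max_a Q(s, a) over all states in the table."""
    return sum(max(row) for row in q) / len(q)

def train(transitions, n_states, n_actions, steps, sync_every, lr=0.5):
    q = [[0.0] * n_actions for _ in range(n_states)]
    target_q = copy.deepcopy(q)  # frozen copy used only to build labels
    losses, avg_qs = [], []
    for t in range(steps):
        s, a, r, s2, done = transitions[t % len(transitions)]
        error = td_target(r, s2, done, target_q) - q[s][a]
        losses.append(error ** 2)
        q[s][a] += lr * error
        avg_qs.append(average_q(q))
        if (t + 1) % sync_every == 0:
            # Swapping in a new target changes the labels, so the loss can jump.
            target_q = copy.deepcopy(q)
    return q, losses, avg_qs

# Hypothetical chain: state 0 -> state 1 (reward 0), state 1 -> terminal (reward 1).
transitions = [(0, 0, 0.0, 1, False), (1, 0, 1.0, None, True)]
q, losses, avg_qs = train(transitions, n_states=2, n_actions=1,
                          steps=200, sync_every=20)
```

Between synchronizations the labels are fixed and the loss shrinks. Right after each synchronization the labels change and the loss can jump, while the average Q-value settles toward a stable value.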
RL4J
RL4J is available directly on GitHub (https://github.com/deeplearning4j/rl4j). What is implemented so far includes, with experience replay, …

ISBN: 9789865020262