핸즈온 머신러닝(3판)

Name: 핸즈온 머신러닝(3판)
Author: 오렐리앙 제롱(Aurélien Géron)
ISBN: 9791169211475

by 오렐리앙 제롱(Aurélien Géron), 박해선

September 2023

Beginner to intermediate

1044 pages

27h 52m

Korean

Hanbit Media, Inc.

Content preview from 핸즈온 머신러닝(3판)

Chapter 18: Reinforcement Learning

18.5 Evaluating Actions: The Credit Assignment Problem

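If we knew the best action at each step, we could train the neural network as usual, by minimizing the cross entropy between the estimated probabilities and the target probabilities. That would be ordinary supervised learning. In reinforcement learning, however, the only guidance the agent gets comes from rewards, and rewards are typically sparse and delayed. For example, if the agent keeps the pole balanced for 100 steps, how can it know which of those 100 actions were good and which were bad? All we know is that the pole fell after the last action, but that last action is certainly not entirely to blame. This is called the credit assignment problem: when the agent receives a reward, it is hard to know which actions deserve the credit (or the blame) for it. Think of a dog that is rewarded hours after it obeyed its owner. Will it understand what it is being rewarded for?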
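To make the supervised case at the start of this section concrete, here is a minimal sketch of the cross-entropy loss for a single step. It assumes a policy with only two actions (left and right) that outputs the probability of going left. The function name `binary_cross_entropy` and the probability values are hypothetical and chosen only for illustration.

```python
import math

def binary_cross_entropy(prob_left, target_left):
    """Cross entropy between the estimated probability of going left
    and the target probability (1.0 if left is the best action, else 0.0)."""
    eps = 1e-12  # clip to avoid log(0)
    p = min(max(prob_left, eps), 1.0 - eps)
    return -(target_left * math.log(p) + (1.0 - target_left) * math.log(1.0 - p))

# Assumption for illustration: the best action at this step is "left".
loss_good_estimate = binary_cross_entropy(prob_left=0.9, target_left=1.0)
loss_bad_estimate = binary_cross_entropy(prob_left=0.1, target_left=1.0)

print(loss_good_estimate, loss_bad_estimate)
```

The loss is small when the policy already favors the known best action and large when it does not. In reinforcement learning no such per-step target is available, which is exactly why the credit assignment problem arises.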

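A common strategy for tackling this problem is to evaluate an action by summing all the rewards that come after it, applying a discount factor γ (gamma) at each step. This sum of discounted rewards is the action's return ...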
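The discounted sum described above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the preview: the function name `discounted_returns`, the three reward values, and the discount factor of 0.8 are assumptions made for the example.

```python
def discounted_returns(rewards, discount_factor):
    """For each step, return that step's reward plus the discounted
    sum of all rewards that follow it."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each step can reuse the return of the step after it.
    for step in reversed(range(len(rewards))):
        running = rewards[step] + discount_factor * running
        returns[step] = running
    return returns

# Hypothetical episode: three actions followed by rewards +10, 0, and -50.
example_rewards = [10, 0, -50]
example_returns = discounted_returns(example_rewards, discount_factor=0.8)
print(example_returns)
```

With these assumed numbers, the first action's return is 10 + 0.8 × 0 + 0.8² × (−50) = −22, so an action followed by an immediate reward can still be evaluated negatively if it leads to a large penalty later. A discount factor close to 0 makes later rewards count for little, while a value close to 1 makes them count almost as much as immediate ones.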
