Ray Distributed Machine Learning: Using Ray for Data Processing, Training, Inference, and Deployment of Large Models

by Max Pumperla, Edward Oakes, Richard Liaw
May 2024
Intermediate
252 pages
5h 31m
Chinese
China Machine Press
Reinforcement Learning with Ray RLlib
… after outlining all the available options (see Figure 4.1), we will walk through two concrete examples of advanced RLlib environments.
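Figure 4.1: All available RLlib environments (diagram labels: single-agent, multi-agent, regular agents, external agents; classes shown: BaseEnv, gym.Env, VectorEnv, ExternalEnv, MultiAgentEnv, MultiAgentExternalEnv)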
4.4.1 Overview of RLlib Environments
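All available RLlib environments extend a common BaseEnv class. If you want to work with multiple copies of the same gym.Env environment, you can use RLlib's VectorEnv wrapper. Vectorized environments are useful, but they are essentially an extension of the environments we have already covered. RLlib also provides two other types of environments that deserve closer attention.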
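To make the idea of vectorization concrete, here is a minimal plain-Python sketch of stepping several copies of one environment as a batch. It does not use RLlib's VectorEnv; `ToyCorridorEnv` and `SimpleVectorEnv` are hypothetical names invented for this illustration, and the corridor is only a stand-in for a gym.Env-style environment.

```python
class ToyCorridorEnv:
    """Hypothetical 1-D corridor, a stand-in for a gym.Env-style environment."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action: 0 = move left, 1 = move right (clipped to the corridor)
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done, {}


class SimpleVectorEnv:
    """Hypothetical wrapper that steps independent copies of one environment."""

    def __init__(self, make_env, num_envs):
        self.envs = [make_env() for _ in range(num_envs)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        observations, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done, _ = env.step(action)
            if done:
                # Restart a finished copy so the batch always stays full.
                obs = env.reset()
            observations.append(obs)
            rewards.append(reward)
            dones.append(done)
        return observations, rewards, dones


vec_env = SimpleVectorEnv(ToyCorridorEnv, num_envs=3)
first_obs = vec_env.reset()
batch_obs, batch_rewards, batch_dones = vec_env.step([1, 1, 0])
```

Each call to `step` takes one action per copy and returns one observation, reward, and done flag per copy, which is what makes a vectorized environment an extension of the single-environment case rather than a new kind of environment.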
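The first is MultiAgentEnv, which supports training models with multiple agents. Working with multiple agents can be tricky, because you have to define the agents within the environment carefully and account for the fact that each agent may interact with the environment in a completely different way.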
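What's more, agents may interact with each other and have to take each other's actions into account. In more advanced settings, there may even be a hierarchy of agents that explicitly depend on one another. In short, running multi-agent RL experiments is hard, and in the next example we will see how RLlib handles this.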
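The sketch below illustrates, in plain Python, the kind of bookkeeping a multi-agent environment needs. It follows the dictionary-per-agent convention used by RLlib's MultiAgentEnv, including the special "__all__" key that signals the end of the episode for everyone, but it does not import RLlib, and the exact return signature of `step` differs between RLlib versions. The class `TwoAgentCorridor`, its reward scheme, and the `max_steps` parameter are assumptions made up for this example.

```python
class TwoAgentCorridor:
    """Hypothetical two-agent corridor that exchanges per-agent dictionaries."""

    def __init__(self, length=3, max_steps=2):
        self.length = length
        self.max_steps = max_steps
        self.agent_ids = ["agent_0", "agent_1"]
        self.positions = {}
        self.steps = 0

    def reset(self):
        # agent_0 starts at the left end, agent_1 at the right end.
        self.positions = {"agent_0": 0, "agent_1": self.length - 1}
        self.steps = 0
        return dict(self.positions)

    def step(self, action_dict):
        rewards = {}
        # Agents act in a fixed order; action 0 = left, 1 = right.
        for agent_id in self.agent_ids:
            move = 1 if action_dict[agent_id] == 1 else -1
            target = min(max(self.positions[agent_id] + move, 0), self.length - 1)
            # Agents interact: a cell held by the other agent is blocked.
            occupied = {
                pos for other, pos in self.positions.items() if other != agent_id
            }
            if target in occupied:
                rewards[agent_id] = -1.0  # penalty for bumping into the other agent
            else:
                self.positions[agent_id] = target
                rewards[agent_id] = 0.0
        self.steps += 1
        finished = self.steps >= self.max_steps
        dones = {agent_id: finished for agent_id in self.agent_ids}
        dones["__all__"] = finished
        return dict(self.positions), rewards, dones, {}


env = TwoAgentCorridor(length=3, max_steps=2)
start_obs = env.reset()
obs, rewards, dones, infos = env.step({"agent_0": 1, "agent_1": 0})
```

In this step both agents head for the middle cell. agent_0 acts first and takes it, so agent_1 is blocked and penalized, which is a small instance of agents having to account for each other's actions.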
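The other type of environment is ExternalEnv, which can be used to connect external simulators to RLlib. For example, suppose the maze problem from earlier were a simulation of a robot navigating a maze. In that case, it may not be suitable to co-locate the robot (or its simulation, implemented in a different technology stack) with RLlib's learning agents. To address this, RLlib provides a simple client ...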
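One way to picture connecting an external simulator is as an inversion of control: the simulator owns the loop and only asks the learner for actions and reports rewards back. The plain-Python sketch below illustrates that pattern under stated assumptions. `ToyPolicyServer` and `run_external_simulation` are hypothetical; the method names are loosely modeled on those of RLlib's ExternalEnv (`start_episode`, `get_action`, `log_returns`, `end_episode`), but the signatures here are simplified, and the fixed "always move right" policy is a placeholder rather than anything learned.

```python
class ToyPolicyServer:
    """Hypothetical stand-in for the learner side of an external setup."""

    def __init__(self):
        self.episodes = {}
        self._next_id = 0

    def start_episode(self):
        episode_id = f"episode_{self._next_id}"
        self._next_id += 1
        self.episodes[episode_id] = {
            "observations": [],
            "reward": 0.0,
            "finished": False,
        }
        return episode_id

    def get_action(self, episode_id, observation):
        self.episodes[episode_id]["observations"].append(observation)
        return 1  # placeholder policy: always move right

    def log_returns(self, episode_id, reward):
        self.episodes[episode_id]["reward"] += reward

    def end_episode(self, episode_id, observation):
        self.episodes[episode_id]["observations"].append(observation)
        self.episodes[episode_id]["finished"] = True


def run_external_simulation(server, corridor_length=4):
    """The simulator drives the loop and queries the server for each action."""
    episode_id = server.start_episode()
    position = 0
    while position < corridor_length - 1:
        action = server.get_action(episode_id, position)
        position = max(position + (1 if action == 1 else -1), 0)
        reward = 1.0 if position == corridor_length - 1 else 0.0
        server.log_returns(episode_id, reward)
    server.end_episode(episode_id, position)
    return episode_id


server = ToyPolicyServer()
episode_id = run_external_simulation(server)
episode = server.episodes[episode_id]
```

The point of the design is that the simulator never needs to be wrapped in a `step` function: it could run in another process or technology stack, as long as it can exchange observations, actions, and rewards with the learner.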
Publisher Resources

ISBN: 9787111753384