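Ray 分布式机器学习:利用Ray 进行大模型的数据处理、训练、推理和部署 (Distributed Machine Learning with Ray: Data Processing, Training, Inference, and Deployment of Large Models with Ray)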
by Max Pumperla, Edward Oakes, Richard Liaw
May 2024
Intermediate
252 pages
5h 31m
Chinese
China Machine Press
Distributed Training with Ray Train

7.2 An Introduction to Ray Train by Example
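Ray Train is a library for distributed data-parallel training on Ray. It provides key tools for the different stages of a training workflow: feature processing, scalable training, and integrations with machine learning tracking tools and model export mechanisms.

In a basic machine learning training pipeline, you will use the following key components of Ray Train: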
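Trainers
Ray Train has several trainer classes for distributed training. Trainers are wrapper classes around third-party training frameworks (such as XGBoost, PyTorch, and TensorFlow) that provide integration with core Ray actors (for distribution), Ray Tune, and Ray Datasets.

Predictors
Once you have a trained model, you can use it to make predictions. For batches of input data, you can use a batch predictor, which can also evaluate the model's performance on a validation set.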
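To make the trainer and predictor roles concrete, here is a minimal plain-Python sketch of the pattern. It does not use Ray Train's classes. The functions `fit_threshold`, `batch_predict`, and `accuracy`, the toy `(feature, label)` rows, and the threshold "model" are all hypothetical stand-ins chosen for illustration: the "trainer" produces a model from training rows, and the "batch predictor" applies that model batch by batch and scores it on a validation set.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[float, int]  # (feature, label); a made-up toy schema


def iter_batches(rows: Sequence[Row], batch_size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def fit_threshold(rows: Sequence[Row]) -> float:
    """Toy 'trainer': the model is the midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def batch_predict(threshold: float, rows: Sequence[Row], batch_size: int) -> List[int]:
    """Toy 'batch predictor': score the input one batch at a time."""
    preds: List[int] = []
    for batch in iter_batches(rows, batch_size):
        preds.extend(int(x > threshold) for x, _ in batch)
    return preds


def accuracy(preds: Sequence[int], rows: Sequence[Row]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, (_, y) in zip(preds, rows)) / len(rows)


train_rows = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
valid_rows = [(0.5, 0), (3.0, 0), (7.0, 1), (4.0, 1)]

threshold = fit_threshold(train_rows)
predictions = batch_predict(threshold, valid_rows, batch_size=2)
validation_accuracy = accuracy(predictions, valid_rows)
```

In Ray Train the same two steps are handled by library classes that also distribute the work across a cluster. The sketch only shows how training hands a model to batch prediction and evaluation.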
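
In addition, Ray Train provides common preprocessor objects and utilities for turning dataset objects into features that trainers can consume. Finally, Ray Train provides a Checkpoint class that supports saving and restoring the state of a training run. We won't use any preprocessors in the first example; they are introduced in later examples.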
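
The idea behind checkpointing can be sketched without Ray: persist the training state, then resume from it and arrive at the same result as an uninterrupted run. The helpers `save_checkpoint` and `load_checkpoint`, the `state.json` file name, and the toy training loop below are assumptions made for illustration. They are not Ray Train's Checkpoint API.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_checkpoint(state: Dict[str, Any], directory: str) -> str:
    """Write the training state to <directory>/state.json and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "state.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    return path


def load_checkpoint(directory: str) -> Dict[str, Any]:
    """Read back the state written by save_checkpoint."""
    with open(os.path.join(directory, "state.json"), encoding="utf-8") as f:
        return json.load(f)


def train(state: Dict[str, Any], num_epochs: int) -> Dict[str, Any]:
    """Toy loop: each epoch moves the weight halfway toward 1.0."""
    for _ in range(state["epoch"], num_epochs):
        state = {
            "epoch": state["epoch"] + 1,
            "weight": state["weight"] + 0.5 * (1.0 - state["weight"]),
        }
    return state


with tempfile.TemporaryDirectory() as ckpt_dir:
    # Train for two epochs, save, then resume from disk up to epoch four.
    interrupted = train({"epoch": 0, "weight": 0.0}, num_epochs=2)
    save_checkpoint(interrupted, ckpt_dir)
    resumed = train(load_checkpoint(ckpt_dir), num_epochs=4)

# Reference run without any interruption.
uninterrupted = train({"epoch": 0, "weight": 0.0}, num_epochs=4)
```

Because the checkpoint stores both the epoch counter and the model weight, the resumed run continues where the first one stopped, and `resumed` matches `uninterrupted`.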
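
Ray Train handles large datasets very well. In keeping with the idea that users should not have to think about how to parallelize their code, you simply "connect" a large dataset to Ray Train, without worrying about how the data is fed to the different parallel workers.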
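
As a rough picture of what a library does for you in data-parallel training, the following plain-Python sketch splits a dataset into shards, computes a partial gradient per shard, and combines the partial results into one update. Everything here is an illustrative assumption: the `shard`, `grad_sum`, and `data_parallel_step` helpers, the round-robin split, and the one-parameter model `y = w * x`. The "workers" run sequentially in one process, and this is not how Ray Train is implemented.

```python
from typing import List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pairs for the toy model y = w * x


def shard(rows: Sequence[Example], num_workers: int) -> List[List[Example]]:
    """Round-robin split so each worker gets a near-equal share of the rows."""
    return [list(rows[i::num_workers]) for i in range(num_workers)]


def grad_sum(w: float, rows: Sequence[Example]) -> float:
    """Sum over rows of d/dw (w*x - y)^2 = 2*x*(w*x - y)."""
    return sum(2 * x * (w * x - y) for x, y in rows)


def data_parallel_step(
    w: float, rows: Sequence[Example], num_workers: int, lr: float
) -> float:
    """One gradient step whose gradient is assembled from per-shard pieces."""
    shards = shard(rows, num_workers)
    # In a real system each of these calls would run on a separate worker.
    partial_sums = [grad_sum(w, s) for s in shards]
    mean_grad = sum(partial_sums) / len(rows)
    return w - lr * mean_grad


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated by y = 2x

w = 0.0
for _ in range(50):
    w = data_parallel_step(w, data, num_workers=2, lr=0.05)
```

Summing the per-shard gradients and dividing by the total number of rows yields the same update as a single worker would compute on the full dataset, which is why the number of workers does not change the result in this sketch.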

We bring these components together in a first Ray Train example. To load the training data, we build on what we learned in Chapter 6 and make heavy use of Ray Datasets.

7.2.1 Predicting Big Tips in New York City Taxi Rides ...
Publisher Resources

ISBN: 9787111753384