Ray 分布式机器学习:利用Ray 进行大模型的数据处理、训练、推理和部署 (Distributed Machine Learning with Ray: Data Processing, Training, Inference, and Deployment for Large Models)

by Max Pumperla, Edward Oakes, Richard Liaw
May 2024
Intermediate
252 pages
5h 31m
Chinese
China Machine Press
Getting Started with Ray AIR
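Before moving on to the training step of the AIR workflow, let's look at the different types of AIR preprocessors that are available (see Table 10.1). If you want to learn more about all available preprocessors, consult the user guide (https://oreil.ly/WcV6W). In this book we use preprocessors only for feature scaling, but the other types of Ray AIR preprocessors are very useful as well.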
Table 10.1: Ray AIR preprocessors

Preprocessor type        Examples
Feature scalers          MaxAbsScaler, MinMaxScaler, Normalizer, PowerTransformer, StandardScaler
Generic preprocessors    BatchMapper, Chain, Concatenator, SimpleImputer
Categorical encoders     Categorizer, LabelEncoder, OneHotEncoder
Text encoders            Tokenizer, FeatureHasher
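
To make the idea behind the feature scalers in the table more concrete, here is a minimal plain-Python sketch of the fit/transform pattern that scaling preprocessors generally follow. It does not use Ray: `MinMaxScalerSketch`, `StandardScalerSketch`, the column names, and the sample rows are hypothetical stand-ins invented for illustration, and the choice of the population standard deviation is an assumption of this sketch, not a statement about Ray's implementation.

```python
from statistics import fmean, pstdev


class MinMaxScalerSketch:
    """Hypothetical stand-in (not Ray's class): rescales columns to [0, 1]."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.stats_ = {}

    def fit(self, rows):
        # Learn the per-column minimum and maximum from the fitting data.
        for col in self.columns:
            values = [row[col] for row in rows]
            self.stats_[col] = (min(values), max(values))
        return self

    def transform(self, rows):
        # Apply the learned statistics; other columns are left untouched.
        out = []
        for row in rows:
            new_row = dict(row)
            for col in self.columns:
                lo, hi = self.stats_[col]
                span = hi - lo
                new_row[col] = (row[col] - lo) / span if span else 0.0
            out.append(new_row)
        return out


class StandardScalerSketch:
    """Hypothetical stand-in (not Ray's class): zero mean, unit variance."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.stats_ = {}

    def fit(self, rows):
        # Assumption of this sketch: use the population standard deviation.
        for col in self.columns:
            values = [row[col] for row in rows]
            self.stats_[col] = (fmean(values), pstdev(values))
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            new_row = dict(row)
            for col in self.columns:
                mean, std = self.stats_[col]
                new_row[col] = (row[col] - mean) / std if std else 0.0
            out.append(new_row)
        return out


# Made-up sample data: one feature column "x" and a label column.
train_rows = [{"x": 1.0, "label": 0}, {"x": 2.0, "label": 1}, {"x": 3.0, "label": 0}]
test_rows = [{"x": 4.0, "label": 1}]

minmax = MinMaxScalerSketch(["x"]).fit(train_rows)
minmax_train = minmax.transform(train_rows)
# Test data is scaled with statistics learned from the training data only.
minmax_test = minmax.transform(test_rows)

standard_train = StandardScalerSketch(["x"]).fit(train_rows).transform(train_rows)
```

The point of the sketch is the separation between `fit` and `transform`: statistics are learned once from the training data and then reused for any other data, so a test value outside the training range can land outside [0, 1].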
10.2.2 Trainers
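Once the training and test datasets are ready and the preprocessor is defined, you can specify a Trainer that runs a machine learning algorithm on the data. The trainers in the Ray Train package were introduced in Chapter 7; they provide a consistent wrapper around training frameworks such as TensorFlow, PyTorch, and XGBoost. This example focuses on XGBoost, but as far as the Ray AIR API is concerned, all other framework integrations work in exactly the same way.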
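We define an XGBoostTrainer, one of the many framework-specific trainer implementations that ship with Ray AIR. Defining such a trainer requires specifying the following arguments:
AIR ScalingConfig ...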
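
The argument list is cut off in this preview, so the following is only a sketch of how an `XGBoostTrainer` might be constructed. It assumes a Ray 2.x release from the AIR era in which trainers still accept a `preprocessor` argument. The helper `build_xgboost_trainer`, the label column name `"target"`, the worker count, the number of boosting rounds, and the XGBoost parameters are all hypothetical choices made for illustration, not values taken from the book.

```python
# Hypothetical XGBoost parameters for a binary classification task.
XGBOOST_PARAMS = {
    "objective": "binary:logistic",
    "eval_metric": ["logloss", "error"],
}


def build_xgboost_trainer(train_ds, valid_ds, preprocessor=None, num_workers=2):
    """Sketch: build an XGBoostTrainer from two Ray Datasets (assumed inputs)."""
    # Imports are deferred so this module can be loaded without Ray installed.
    from ray.air.config import ScalingConfig
    from ray.train.xgboost import XGBoostTrainer

    return XGBoostTrainer(
        # How training is scaled out across the Ray cluster.
        scaling_config=ScalingConfig(num_workers=num_workers, use_gpu=False),
        # Assumption: the label lives in a column called "target".
        label_column="target",
        params=XGBOOST_PARAMS,
        datasets={"train": train_ds, "valid": valid_ds},
        # Assumption: this Ray version still accepts a preprocessor here.
        preprocessor=preprocessor,
        num_boost_round=20,
    )
```

With Ray and XGBoost installed and two Ray Datasets at hand, calling `build_xgboost_trainer(train_ds, valid_ds).fit()` would start training and return a result object. Newer Ray releases have changed this interface, so check the documentation for the version you run.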

ISBN: 9787111753384