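Ray 分布式机器学习:利用Ray 进行大模型的数据处理、训练、推理和部署 (Distributed Machine Learning with Ray: Data Processing, Training, Inference, and Deployment of Large Models with Ray)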

by Max Pumperla, Edward Oakes, Richard Liaw
May 2024
Intermediate
252 pages
5h 31m
Chinese
China Machine Press
The Ray Ecosystem and Beyond
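import numpy as np

# train_dataset and test_dataset are defined before this excerpt.
def to_labeled_image(batch):
    # Convert a batch of (image, label) pairs into a dict of NumPy arrays.
    return {
        "image": np.array([image.numpy() for image, _ in batch]),
        "label": np.array([label for _, label in batch]),
    }

train_dataset = train_dataset.map_batches(to_labeled_image)
test_dataset = test_dataset.map_batches(to_labeled_image)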
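The function to_labeled_image transforms each batch of data by returning NumPy arrays for image and label.
Applying map_batches with this function transforms the initial datasets.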
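To make the batch conversion concrete, the following is a minimal sketch that runs to_labeled_image on a single hand-built batch, without Ray. FakeTensor is a hypothetical stand-in for any tensor type that exposes a .numpy() method, and the 3 x 2 x 2 image shape and the alternating labels are arbitrary values chosen only for illustration.

```python
import numpy as np


class FakeTensor:
    """Hypothetical stand-in for a tensor object that exposes .numpy()."""

    def __init__(self, values):
        self._values = np.asarray(values, dtype=np.float32)

    def numpy(self):
        return self._values


def to_labeled_image(batch):
    # Stack the per-item images and labels into one NumPy array per column.
    return {
        "image": np.array([image.numpy() for image, _ in batch]),
        "label": np.array([label for _, label in batch]),
    }


# A batch of four (image, label) pairs; the shape and labels are illustrative only.
batch = [(FakeTensor(np.full((3, 2, 2), i)), i % 2) for i in range(4)]
out = to_labeled_image(batch)

print(out["image"].shape)     # leading dimension is the batch size
print(out["label"].tolist())  # one label per item in the batch
```

The result is a dictionary with one array per column, which is the shape of output that the map_batches call above receives back for each batch.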
Before moving on to model training, Table 11.1 shows the input formats supported by the Ray Dataset library.
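Table 11.1: The Ray Dataset ecosystem

| Integration | Type | Description |
| --- | --- | --- |
| Text, binary, image files, CSV, JSON | Basic data formats | Ray Dataset supports loading and storing these data formats, but strictly speaking they are not integrations. |
| NumPy, Pandas, Arrow, Parquet, Python objects | Advanced data formats | Ray Dataset supports common ML data libraries such as NumPy and Pandas, and can also read custom Python objects or Parquet files. |
| Spark, Dask, MARS, Modin | Advanced third-party integrations | Ray supports more complex data processing systems through community integrations, such as Spark on Ray (RayDP), Dask on Ray ... |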

Publisher Resources

ISBN: 9787111753384