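Ray 分布式机器学习:利用Ray 进行大模型的数据处理、训练、推理和部署 (Distributed Machine Learning with Ray: Data Processing, Training, Inference, and Deployment for Large Models)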

by Max Pumperla, Edward Oakes, Richard Liaw
May 2024
Intermediate
252 pages
5h 31m
Chinese
China Machine Press
Content preview
Chapter 1. An Overview of Ray
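People are collecting vast amounts of data at unprecedented speed and breadth, which is one reason distributed computing has become so prevalent. Over the past decade, a large number of storage systems and data processing and analytics engines have emerged, and these tools have been critical to the success of many companies. Interestingly, most "big data" technologies are built for, and used by, the data engineers responsible for data collection and processing, so that data scientists can focus on what they do best. As a data science practitioner, you probably want to concentrate on training complex machine learning models, running efficient hyperparameter tuning, building entirely new custom models or simulations, or deploying models to serve them.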
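At the same time, scaling workloads out to a compute cluster is becoming the norm. To scale, a distributed system needs to support all of these fine-grained "big compute" tasks, possibly on specialized hardware. Ideally, the hardware matches the big data toolchain already in use and is fast enough to meet latency requirements. In other words, distributed computing must be both powerful and flexible to handle complex data science workloads, and Ray meets all of these requirements.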
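Python is the most popular language for data science today and the one most commonly used for day-to-day data science work. Although Python is more than 30 years old, its community is still active and growing. The rich PyData ecosystem (https://pydata.org) is an essential part of a data scientist's toolbox. How can you scale out your workloads while still using these tools? That is a hard problem, especially because the Python community cannot be forced to give up its existing tools or its programming language. It follows that distributed computing tools must be built for the Python community.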
  ...

Publisher Resources

ISBN: 9787111753384