数据工程之道:设计和构建健壮的数据系统

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Content preview from 数据工程之道:设计和构建健壮的数据系统
Chapter 6
Storage
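Storage is the cornerstone of the data engineering lifecycle (see Figure 6-1) and underlies its major stages: ingestion, transformation, and serving. Data is stored many times as it moves through the lifecycle. To borrow an old saying, storage is everywhere. Whether data is needed seconds, minutes, days, months, or years later, it must persist in storage until a system is ready to consume it for further processing and transmission. Understanding how the data is used and how you will retrieve it in the future is the first step in choosing the right storage solution for your data architecture.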
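[Figure 6-1. Storage plays a critical role in the data engineering lifecycle. Diagram labels: the data engineering lifecycle (generation, ingestion, transformation, serving, storage); downstream uses (analytics, machine learning, reverse ETL); undercurrents (security, data management, DataOps, data architecture, orchestration, software engineering).]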
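We also discussed storage in Chapter 5, but there the emphasis was on the domain of control: source systems are generally not maintained or controlled by data engineers. The storage that data engineers work with directly, which is the focus of this chapter, spans the stages of the data engineering lifecycle, from ingesting data out of source systems to serving data that delivers value through analytics, data science, and similar uses. Many forms of storage run through the entire data engineering lifecycle in one way or another.
To understand storage, we first examine the raw ingredients that make up storage systems, including hard disk drives, solid-state drives, and system memory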
 

Publisher Resources

ISBN: 9787111745273