Fundamentals of Data Engineering: Plan and Build Robust Data Systems (Chinese edition: 数据工程之道:设计和构建健壮的数据系统)

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Content preview from 数据工程之道 (Fundamentals of Data Engineering)
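... remains unchanged: reducing complexity and increasing modularity. Note that the concept of the modern data stack fits well with the converged data platform concept from Section 3.4.3.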
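Figure 3-13. Basic components of the modern data stack (data sources; cloud-based data connectors and integrations; cloud data warehouse; BI and visualization)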
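The key outcomes of the modern data stack are self-service (analytics and pipelines), agile data management, and the use of open source tools or simple proprietary tools with clear pricing structures. Community is also a central aspect of the modern data stack. Unlike products of the past that hid releases and roadmaps from users, projects and companies operating in the modern data stack space typically have strong user bases and active communities that participate in development by using the product early, suggesting features, and submitting pull requests to improve the code.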
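Regardless of where "modern" goes (we share our ideas in Chapter 11), we think the key concept of plug-and-play modularity with easy-to-understand pricing and implementation is the way of the future. Especially in analytics engineering, the modern data stack is and will continue to be the default choice of data architecture. Throughout the book, the architectures we reference contain pieces of the modern data stack, such as cloud-based and plug-and-play modular components.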
3.4.5 Lambda Architecture
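In the "old days" (the early to mid-2010s), the popularity of working with streaming data exploded with the emergence of Kafka as a highly scalable message queue and of frameworks such as Apache Storm and Samza for streaming/real-time analytics. These technologies allowed companies to perform new types of analytics and modeling on large amounts of data, user aggregation and ranking, and product recommendations. Data engineers needed to figure out how to reconcile batch and streaming data into a single architecture. The Lambda architecture was one of the early popular responses to this problem.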
In a Lambda architecture (see Figure 3-14), you have systems operating independently of each other: batch, streaming, and serv ...
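To make the idea of independently operating systems concrete, here is a minimal in-memory sketch of the commonly described Lambda layout. It is an illustration under assumptions, not the book's implementation: the preview text is cut off, so the query-time merge of batch and real-time results reflects the usual description of the pattern, and every name here (`LambdaPipeline`, `ingest`, `run_batch`, `query`) is hypothetical. A real deployment would use separate systems, for example a message queue feeding a stream processor and a batch engine, rather than a single Python object.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LambdaPipeline:
    """Toy Lambda architecture counting events per user (hypothetical names)."""

    # Append-only log of every event ever received (source of truth for batch).
    master_log: list = field(default_factory=list)
    # Batch view: recomputed from the full log on each batch run.
    batch_view: Counter = field(default_factory=Counter)
    # Real-time view: incremental counts for events since the last batch run.
    realtime_view: Counter = field(default_factory=Counter)

    def ingest(self, user_id: str) -> None:
        """Send each event to both paths: the durable log and the stream path."""
        self.master_log.append(user_id)
        self.realtime_view[user_id] += 1

    def run_batch(self) -> None:
        """Batch path: recompute the view from all history, then reset the
        real-time view, since the batch view now covers those events."""
        self.batch_view = Counter(self.master_log)
        self.realtime_view.clear()

    def query(self, user_id: str) -> int:
        """Serving path: combine the batch view with the real-time view."""
        return self.batch_view[user_id] + self.realtime_view[user_id]


if __name__ == "__main__":
    pipeline = LambdaPipeline()
    for user in ["alice", "alice", "bob"]:
        pipeline.ingest(user)
    print(pipeline.query("alice"))  # served from the real-time view only
    pipeline.run_batch()
    pipeline.ingest("alice")
    print(pipeline.query("alice"))  # batch view plus one newer event
```

The sketch keeps two code paths for the same aggregation (a full recomputation and an incremental update), which is the reconciliation problem between batch and streaming that the text says data engineers had to solve.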

Publisher Resources

ISBN: 9787111745273