数据工程之道:设计和构建健壮的数据系统

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Content preview from 数据工程之道:设计和构建健壮的数据系统
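Most of the streaming data experts we talked with recommend anticipating changes in the source data and keeping a flexible schema. This means there is no fixed data model in the analytical database. Otherwise, we can only assume that the source systems provide the correct data, with the correct business definitions and logic, exactly as they exist today. Because storage is cheap, recent incremental data and stored historical data can be kept together and queried together. We need to optimize for comprehensive analytics against datasets with flexible schemas. Furthermore, instead of responding to anomalies only after they surface in reports, why not react automatically to anomalies and changes in the streaming data itself?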
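One way to picture an automated reaction to changes in streaming data is a check that compares each record's fields against the previous record's and reports any difference. The following is only a minimal sketch under stated assumptions: the function name `schema_changes`, the sample `events`, and the field names are hypothetical and do not come from the book, and a real pipeline would more likely rely on a schema registry or the streaming platform's own tooling.

```python
from typing import Any, Dict, Iterable, Iterator, Optional, Set, Tuple

Record = Dict[str, Any]


def schema_changes(
    records: Iterable[Record],
) -> Iterator[Tuple[int, Set[str], Set[str]]]:
    """Yield (position, added_fields, removed_fields) whenever a record's
    set of field names differs from that of the previous record."""
    current: Optional[Set[str]] = None
    for position, record in enumerate(records):
        fields = set(record)
        if current is not None and fields != current:
            yield position, fields - current, current - fields
        # Accept the new field set as the baseline (flexible schema).
        current = fields


# Hypothetical event stream: a field appears, then another disappears.
events = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 12.5},
    {"order_id": 3, "amount": 9.0, "currency": "USD"},
    {"order_id": 4, "currency": "USD"},
]

changes = list(schema_changes(events))
for position, added, removed in changes:
    print(f"record {position}: added={sorted(added)} removed={sorted(removed)}")
```

The sketch accepts every new field set as the new baseline instead of rejecting the record, which mirrors the idea of a flexible schema: the change is surfaced for someone (or something) to act on, but ingestion does not stop.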
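The field of data modeling is changing, and we believe data model paradigms will soon shift dramatically. These new approaches may fold metrics and semantic layers, data pipelines, and traditional analytics workflows into a streaming layer that sits directly on top of the source system. Because data is generated in real time, artificially splitting source systems and analytics systems into two separate parts may not make as much sense as it did when data moved more slowly and more predictably. Time will tell.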
We say more about the future of streaming data in Chapter 11.
8.3 Transformations
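The net benefit of transforming data is the ability to unify and integrate data. Once data is transformed, it can be viewed as a single entity. But without transforming data, you cannot have a unified view of data across the organization.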
Bill Inmon
13
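Now that we have covered queries and data modeling, you might be wondering: if I can model data, query it, and get results, why do I still need to think about transformations? Transformations modify, enhance, and persist data for downstream use, increasing its value in a scalable, reliable, and cost-effective way.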
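To make the idea of persisting data for downstream use more concrete, here is a small sketch using Python's built-in `sqlite3` module. The tables `orders` and `customers`, the derived table `revenue_by_region`, and all values are hypothetical illustrations, not examples from the book. The sketch runs a join and aggregation once and saves the result, so later consumers read the saved table instead of repeating the work.

```python
import sqlite3

# In-memory database with two small, hypothetical source tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    INSERT INTO customers VALUES (1, 'east'), (2, 'west');
    """
)

# Transformation: join and aggregate once, then persist the result as a table.
conn.execute(
    """
    CREATE TABLE revenue_by_region AS
    SELECT c.region AS region, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    """
)

# Downstream consumers query the persisted result, not the source tables.
rows = conn.execute(
    "SELECT region, revenue FROM revenue_by_region ORDER BY region"
).fetchall()
print(rows)
```

A query returns its result once and discards it; the persisted table keeps the result available for reuse. In a production setting the same role would typically be played by a scheduled job or a materialized view, which is an assumption here and not something this excerpt specifies.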
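Imagine having to run a query every time you want to see the results for a particular dataset. You would run the same query dozens or hundreds of times a day. Suppose this query involves parsing 20 datasets ...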
Publisher Resources

ISBN: 9787111745273