数据工程之道:设计和构建健壮的数据系统

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Queries, Modeling, and Transformation
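... rows. Finally, fact tables do not reference other fact tables; they reference only dimensions.
Let's look at an example of a basic fact table (Table 8-9). A common question in your company might be: show the gross sales for each order, by customer and by day. A fact table should be at the finest grain possible; in this example that means the order ID of the sale, the customer, the date, and the gross sales amount. Note that every data type in the fact table is numeric (integers and floats); there are no strings. In this example, CustomerKey 7 has two orders on the same day, which reflects the grain of the table. Rather than storing descriptive attributes itself, the fact table holds keys that reference separate dimension tables, such as customer and date information. The gross sales amount is the total sale for the sales event.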
Table 8-9: A basic fact table
OrderID  CustomerKey  DateKey   GrossSalesAmt
100      5            20220301  100.00
101      7            20220301   75.00
102      7            20220301   50.00
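
The grain described above can be checked with a small sketch. It loads the three rows of Table 8-9 into an in-memory SQLite database and rolls the order-level facts up to one row per customer per day. The table name `fact_sales` and the result aliases `orders` and `total` are hypothetical names chosen for illustration; only the four column names and the data come from the table.

```python
import sqlite3

# Rows copied from Table 8-9: (OrderID, CustomerKey, DateKey, GrossSalesAmt).
FACT_ROWS = [
    (100, 5, 20220301, 100.00),
    (101, 7, 20220301, 75.00),
    (102, 7, 20220301, 50.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fact_sales (               -- hypothetical table name
        OrderID       INTEGER PRIMARY KEY,  -- one row per order: the grain
        CustomerKey   INTEGER NOT NULL,     -- key into a customer dimension
        DateKey       INTEGER NOT NULL,     -- key into a date dimension
        GrossSalesAmt REAL    NOT NULL      -- numeric measure, no strings
    )
    """
)
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", FACT_ROWS)

# Roll the order-level facts up to gross sales per customer per day.
sales_by_customer_day = conn.execute(
    """
    SELECT CustomerKey, DateKey, COUNT(*) AS orders, SUM(GrossSalesAmt) AS total
    FROM fact_sales
    GROUP BY CustomerKey, DateKey
    ORDER BY CustomerKey
    """
).fetchall()

for row in sales_by_customer_day:
    print(row)
# (5, 20220301, 1, 100.0)
# (7, 20220301, 2, 125.0)
```

Because the fact table keeps one row per order, the two orders for CustomerKey 7 stay separate in storage and are combined only at query time.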
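Dimension tables
The second type of table in a Kimball data model is the dimension table. Dimension tables provide the reference data, attributes, and relational context for the events stored in fact tables. Dimension tables are smaller than fact tables and take the opposite shape: typically wide and short. When joined to a fact table, dimensions can describe the what, where, and when of an event. Dimensions are denormalized and may contain duplicate data; this is acceptable in a Kimball data model. Let's look at the two dimensions referenced in the earlier fact table example.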
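In a Kimball data model, dates are typically stored in a date dimension, which lets you reference the date key (DateKey) between the fact table and the date dimension table. With a date dimension table, you can easily answer questions such as, “What were my total sales in the first quarter of 2022 ...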
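
A first-quarter question of this kind could be answered roughly as follows. This is a minimal sketch, not the book's own date dimension: the `dim_date` layout (`DateISO`, `Year`, `Quarter`, `Month`, `DayOfWeek`), the `fact_sales` table name, and the helper `date_dim_row` are all assumptions made for illustration. Only `DateKey` and the fact rows from Table 8-9 come from the text.

```python
import sqlite3
from datetime import datetime

# Fact rows from Table 8-9: (OrderID, CustomerKey, DateKey, GrossSalesAmt).
FACT_ROWS = [
    (100, 5, 20220301, 100.00),
    (101, 7, 20220301, 75.00),
    (102, 7, 20220301, 50.00),
]


def date_dim_row(date_key: int) -> tuple:
    """Derive one (hypothetical) date-dimension row from a YYYYMMDD key."""
    d = datetime.strptime(str(date_key), "%Y%m%d").date()
    quarter = (d.month - 1) // 3 + 1
    return (date_key, d.isoformat(), d.year, quarter, d.month, d.strftime("%A"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales (OrderID INTEGER PRIMARY KEY, CustomerKey INTEGER, "
    "DateKey INTEGER, GrossSalesAmt REAL)"
)
conn.execute(
    "CREATE TABLE dim_date (DateKey INTEGER PRIMARY KEY, DateISO TEXT, "
    "Year INTEGER, Quarter INTEGER, Month INTEGER, DayOfWeek TEXT)"
)
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", FACT_ROWS)

# One dimension row per distinct date key that appears in the fact table.
date_keys = sorted({row[2] for row in FACT_ROWS})
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?, ?, ?)",
    [date_dim_row(key) for key in date_keys],
)

# Join facts to the date dimension and filter on its attributes.
q1_2022_total = conn.execute(
    """
    SELECT SUM(f.GrossSalesAmt)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.DateKey = f.DateKey
    WHERE d.Year = 2022 AND d.Quarter = 1
    """
).fetchone()[0]

print(q1_2022_total)  # 225.0
```

The filter touches only dimension attributes (`Year`, `Quarter`), so the fact table never needs date logic of its own; it only carries the integer key.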

Publisher Resources

ISBN: 9787111745273