数据工程之道:设计和构建健壮的数据系统

by Joe Reis, Matt Housley
February 2024
Intermediate to advanced
370 pages
7h
Chinese
China Machine Press
Queries, Modeling, and Transformations
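OLAP databases also provide additional tools to optimize your tables for query performance. For example, if you have a very large table (several TB or larger), Snowflake and BigQuery let you define a cluster key on the table, which sorts the data so that very large datasets can be accessed more efficiently. BigQuery also lets you partition a table into smaller segments, so that a query can read only specific segments rather than the entire table. (Note that inappropriate clustering and key distribution strategies can degrade performance.)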
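The idea behind partitioning can be sketched in plain Python. This is a toy model, not how Snowflake or BigQuery implement it: the `build_partitions` and `total_amount` helpers, the `event_date` and `amount` fields, and the sample rows are all hypothetical. It only illustrates that a filter on the partition column lets the engine skip whole partitions.

```python
from collections import defaultdict
from datetime import date


def build_partitions(rows):
    """Group rows by event date, mimicking a date-partitioned table."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["event_date"]].append(row)
    return partitions


def total_amount(partitions, start, end):
    """Sum amounts for dates in [start, end], reading only matching partitions."""
    total = 0
    scanned = 0
    for day, rows in partitions.items():
        if start <= day <= end:  # pruning: partitions outside the filter are skipped
            scanned += len(rows)
            total += sum(r["amount"] for r in rows)
    return total, scanned


# Hypothetical sample data: four rows spread over three daily partitions.
rows = [
    {"event_date": date(2024, 1, 1), "amount": 10},
    {"event_date": date(2024, 1, 1), "amount": 5},
    {"event_date": date(2024, 1, 2), "amount": 7},
    {"event_date": date(2024, 1, 3), "amount": 3},
]
partitions = build_partitions(rows)
total, scanned = total_amount(partitions, date(2024, 1, 2), date(2024, 1, 3))
print(total, scanned)
```

Here only two of the four rows are read, because the partition for January 1 is never opened. A poorly chosen partition column (one that queries rarely filter on) would give no such saving.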
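In row-oriented databases, pruning usually centers on table indexes, which you learned about in Chapter 6. The general strategy is to create table indexes that speed up your most performance-sensitive queries, without adding so many indexes to a table that they degrade performance.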
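As a small illustration, the sketch below uses SQLite (through Python's standard `sqlite3` module) as a stand-in for a row-oriented database. The `orders` table, the `idx_orders_customer` index, and the `query_plan` helper are hypothetical names chosen for this example. It compares the query plan before and after an index is created on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)
conn.commit()

sql = "SELECT amount FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print(plan_before)
print(plan_after)
conn.close()
```

The exact plan wording varies between SQLite versions, but the first plan reports a table scan and the second names the index. Every extra index also has to be maintained on each write, which is the cost the text warns about.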
Know how your database handles commits
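A database commit is a change made within a database, such as creating, updating, or deleting a record, a table, or another database object. Many databases support transactions, the notion of committing several operations together in a way that maintains a consistent state. Note that the term transaction is somewhat overloaded; see Chapter 5. The purpose of a transaction is to keep the database in a consistent state both while it is healthy and in the event of a failure. Transactions also handle isolation when multiple concurrent events are reading, writing, and deleting in the same database objects. Without transactions, users could get potentially inconsistent information when querying a database.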
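A minimal sketch of both properties, using SQLite via Python's `sqlite3` module with its default settings. The `accounts` table and the `balances` and `transfer` helpers are hypothetical, and other databases may behave differently. A second connection reads while a transfer is half finished, and a failing transfer is rolled back as a unit.

```python
import os
import sqlite3
import tempfile


def balances(conn):
    """Read all balances; fetchall() finishes the read so no lock lingers."""
    return dict(conn.execute("SELECT name, balance FROM accounts").fetchall())


def transfer(conn, src, dst, amount):
    """Apply both updates in one transaction; roll back if src is overdrawn."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            if balances(conn)[src] < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    writer.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    writer.commit()
    reader = sqlite3.connect(path)  # a second, independent connection

    # Half-finished transfer: the first UPDATE implicitly opens a transaction.
    writer.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
    seen_mid_transfer = balances(reader)  # the reader sees only committed data
    writer.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
    writer.commit()
    seen_after_commit = balances(reader)

    overdraft_ok = transfer(writer, "alice", "bob", 500)  # fails and rolls back
    seen_after_rollback = balances(reader)

    writer.close()
    reader.close()

print(seen_mid_transfer, seen_after_commit, overdraft_ok, seen_after_rollback)
```

In this setup the reader never observes the state in which money has left one account but not yet arrived in the other, and the failed transfer leaves no trace. Running a similar experiment against your own database is one way to learn how it actually behaves.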
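You should be intimately familiar with how your database handles commits and transactions, and you should confirm the consistency you can expect from query results. Does your database handle writes and updates in an ACID-compliant manner? Without ACID compliance, your queries might return unexpected results. This can be caused by a dirty read, which occurs when a row is read while another, uncommitted transaction has changed that row. Are dirty reads expected behavior in your database? If so, how do you handle them? Also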

ISBN: 9787111745273