Kafka权威指南(第2版)

by Gwen Shapira, Todd Palino, Rajini Sivaram, Krit Petty
November 2022
Beginner to intermediate
346 pages
11h
Chinese
Posts & Telecom Press
Content preview from Kafka权威指南(第2版)
Kafka Internals
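Once the right brokers have been chosen for the partitions and replicas, the next decision is which directory each new partition goes into. Directories are assigned per partition, and the rule is simple: count the partitions in each directory and place the new partition in the directory that holds the fewest. This means that if you add a new disk, all new partitions will be created on that disk, because until the assignment evens out, the new disk always has the fewest partitions.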
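The placement rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the broker's actual code: the function names, the directory paths, and the tie-breaking behavior (the first listed directory wins a tie) are all hypothetical.

```python
from typing import Dict, List


def pick_log_dir(partition_counts: Dict[str, int]) -> str:
    """Return the directory that currently holds the fewest partitions."""
    if not partition_counts:
        raise ValueError("at least one log directory is required")
    # min() returns the first minimal key in insertion order, so a tie goes
    # to the directory listed first. That tie-break is an assumption here.
    return min(partition_counts, key=partition_counts.get)


def place_partitions(
    partition_counts: Dict[str, int], new_partitions: List[str]
) -> Dict[str, str]:
    """Assign each new partition to the least-populated directory."""
    counts = dict(partition_counts)  # work on a copy, leave the input alone
    placement = {}
    for partition in new_partitions:
        target = pick_log_dir(counts)
        placement[partition] = target
        counts[target] += 1  # the next decision sees the updated count
    return placement


if __name__ == "__main__":
    # Two existing disks with 10 partitions each, plus a newly added empty disk.
    dirs = {"/data/disk1": 10, "/data/disk2": 10, "/data/disk3": 0}
    for partition, directory in place_partitions(
        dirs, [f"orders-{i}" for i in range(5)]
    ).items():
        print(partition, "->", directory)
```

With the sample counts, every new partition lands on the empty disk until its count catches up with the others, which matches the behavior described above.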
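Mind the disk space
Note that the allocation of partitions to brokers does not take available space or existing load into account, and that the allocation of partitions to disks takes the number of partitions into account but not their size. If some brokers have more disk space than others (perhaps because the cluster mixes older and newer servers), if some partitions are unusually large, or if a single broker has disks of different sizes, take care when assigning partitions.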
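To see why a count-based rule can still leave disks unevenly filled, the following sketch summarizes both the partition count and the total bytes per directory. The helper name, the directory paths, and the partition sizes are made up for illustration.

```python
from typing import Dict

GB = 1024 ** 3


def summarize_dirs(
    sizes_by_dir: Dict[str, Dict[str, int]]
) -> Dict[str, Dict[str, int]]:
    """Report partition count and total bytes for each directory."""
    return {
        directory: {
            "partitions": len(partitions),
            "bytes": sum(partitions.values()),
        }
        for directory, partitions in sizes_by_dir.items()
    }


if __name__ == "__main__":
    # Hypothetical layout: both disks hold two partitions, so a rule that
    # only counts partitions treats them as equally loaded.
    layout = {
        "/data/disk1": {"clicks-0": 500 * GB, "clicks-1": 400 * GB},
        "/data/disk2": {"audit-0": 5 * GB, "audit-1": 3 * GB},
    }
    for directory, stats in summarize_dirs(layout).items():
        print(directory, stats["partitions"], "partitions,",
              stats["bytes"] // GB, "GB")
```

Both directories report the same partition count while their byte totals differ by two orders of magnitude, which is the situation the note warns about.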
6.5.3 File Management
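Retention is an important concept in Kafka. Kafka does not keep data forever, nor does it wait for all consumers to read a message before deleting it. Instead, the Kafka administrator configures a retention period for each topic: a topic's data is purged either after a specified amount of time has passed or once a specified amount of data has accumulated.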
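Finding and deleting messages in one large file is both time-consuming and error-prone, so each partition is split into segments. By default, each segment holds either 1 GB of data or one week of data, whichever is smaller. As the broker writes to a partition, it closes the current file and opens a new one as soon as either limit is reached.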
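The rolling behavior just described can be modeled with a small in-memory sketch. It is a simplification under stated assumptions: the class and field names are hypothetical, segment age is measured from the moment the segment was created, and no files are touched. In Apache Kafka the two limits are, to the best of our knowledge, controlled by the broker settings `log.segment.bytes` and `log.roll.hours`; the sketch does not read them.

```python
from dataclasses import dataclass, field
from typing import List

ONE_GB = 1024 ** 3
ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000


@dataclass
class Segment:
    created_ms: int
    size_bytes: int = 0
    closed: bool = False


@dataclass
class PartitionLog:
    max_bytes: int = ONE_GB          # size limit per segment
    max_age_ms: int = ONE_WEEK_MS    # age limit per segment
    segments: List[Segment] = field(default_factory=list)

    def _should_roll(self, active: Segment, record_bytes: int, now_ms: int) -> bool:
        too_big = active.size_bytes + record_bytes > self.max_bytes
        too_old = now_ms - active.created_ms >= self.max_age_ms
        return too_big or too_old    # whichever limit is hit first

    def append(self, record_bytes: int, now_ms: int) -> Segment:
        """Append a record, rolling to a new segment when a limit is reached."""
        active = self.segments[-1] if self.segments else None
        if active is None or self._should_roll(active, record_bytes, now_ms):
            if active is not None:
                active.closed = True  # close the current file
            active = Segment(created_ms=now_ms)  # open a new one
            self.segments.append(active)
        active.size_bytes += record_bytes
        return active


if __name__ == "__main__":
    # Tiny limits so that both kinds of roll are easy to trigger.
    log = PartitionLog(max_bytes=100, max_age_ms=1000)
    log.append(60, now_ms=0)      # first segment
    log.append(60, now_ms=10)     # would exceed 100 bytes: roll on size
    log.append(10, now_ms=1010)   # active segment is 1000 ms old: roll on age
    print(len(log.segments), "segments")
```

The last element of `segments` plays the role of the file currently being written; every earlier element has been closed.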
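The segment currently being written to is called the active segment. The active segment is never deleted, so if you configure a retention period of 1 day but the segment contains 5 days of data, that data will be retained ...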
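The rule that the active segment is never deleted can be illustrated with a short retention check. This is a hedged sketch rather than Kafka's implementation: the names are hypothetical, the last segment in the list is assumed to be the active one, and a closed segment is assumed to expire once its newest record is older than the retention period.

```python
from dataclasses import dataclass
from typing import List

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class Segment:
    name: str
    newest_record_ms: int  # timestamp of the most recent record it holds


def deletable_segments(
    segments: List[Segment], now_ms: int, retention_ms: int
) -> List[Segment]:
    """Return closed segments whose newest record is past the retention period."""
    closed = segments[:-1]  # assumption: the last segment is the active one
    return [s for s in closed if now_ms - s.newest_record_ms > retention_ms]


if __name__ == "__main__":
    now = 10 * DAY_MS
    # A single active segment is never eligible, however old its contents are.
    only_active = [Segment("segment-0", newest_record_ms=now - 5 * DAY_MS)]
    print(deletable_segments(only_active, now, retention_ms=DAY_MS))

    # Closed segments are eligible once they age past the retention period.
    segments = [
        Segment("segment-0", newest_record_ms=now - 3 * DAY_MS),
        Segment("segment-1", newest_record_ms=now - DAY_MS // 2),
        Segment("segment-2", newest_record_ms=now),
    ]
    print([s.name for s in deletable_segments(segments, now, retention_ms=DAY_MS)])
```

Because eligibility is only ever evaluated for closed segments, a short retention period has no effect on data that still sits in the active segment.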
ISBN: 9787115601421