HBASE技術手冊

Name: HBASE技術手冊
Author: Lars George
ISBN: 9789862765968

December 2012

Intermediate to advanced

572 pages

13h 8m

Chinese

GoTop Information, Inc.

Content preview from HBASE技術手冊

MapReduce Integration

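... is sent to another server on the same rack, and the others are sent to a remote rack, assuming your HDFS is deployed across multiple racks. If it is not, the additional replicas are placed on the least-loaded data node in the cluster.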

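If the replication factor is configured higher, more replicas are placed on different hosts. The important point, though, is that you now have a local replica of the block available. For HBase, this means that if the region server runs long enough (which is what you want), then after a major compaction of all tables, which can be run manually or triggered by configuration, the files are stored locally on the same host. The data node that shares the same physical host holds a copy of all the data the region server needs. Whether you run a scan, a get, or any other operation, you can be sure of getting the best performance.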
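The effect of a major compaction on locality can be illustrated with a toy model. The sketch below is not HBase or HDFS code: the class name LocalitySketch, the method localityIndex, and the host names are all hypothetical, and each block is reduced to the set of hosts holding one of its replicas. It assumes, as the text describes, that files rewritten by the region server get a replica on the region server's own host.

```java
import java.util.List;
import java.util.Set;

// Simplified model, not HBase or HDFS code: each store-file block is
// represented only by the set of hosts that hold one of its replicas.
class LocalitySketch {

    // Fraction of blocks that have a replica on the given host (0.0 to 1.0).
    static double localityIndex(List<Set<String>> blockReplicaHosts, String regionServerHost) {
        if (blockReplicaHosts.isEmpty()) {
            return 0.0;
        }
        long local = blockReplicaHosts.stream()
                .filter(hosts -> hosts.contains(regionServerHost))
                .count();
        return (double) local / blockReplicaHosts.size();
    }

    public static void main(String[] args) {
        // Before compaction: one of the two blocks was written elsewhere
        // and has no replica on host-a.
        List<Set<String>> before = List.of(
                Set.of("host-b", "host-c", "host-d"),
                Set.of("host-a", "host-b", "host-d"));
        // After a major compaction run by the region server on host-a:
        // every rewritten block has a replica on host-a.
        List<Set<String>> after = List.of(
                Set.of("host-a", "host-c", "host-d"),
                Set.of("host-a", "host-b", "host-e"));
        System.out.println(localityIndex(before, "host-a")); // prints 0.5
        System.out.println(localityIndex(after, "host-a"));  // prints 1.0
    }
}
```

A value of 1.0 corresponds to the situation the text calls fully local: every block the region server reads is available from the data node on the same host.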

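One issue to be aware of is region movement during load balancing or server failures. In these cases the data is no longer local, but over time it becomes local again. The master also takes this into account when the cluster is restarted: it assigns all regions to their original region servers, and if one of them cannot be matched, it falls back to a random region assignment strategy.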
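The restart behavior described above (keep a region on its original server when possible, otherwise pick a server at random) can be sketched as follows. This is a minimal illustration, not the master's actual implementation: RetainAssignmentSketch, assign, and the region and server names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of "retain, else random" assignment; not HBase code.
class RetainAssignmentSketch {

    // previousAssignment maps region name -> server that hosted it before the restart.
    static Map<String, String> assign(Map<String, String> previousAssignment,
                                      Set<String> liveServers,
                                      Random random) {
        if (liveServers.isEmpty()) {
            throw new IllegalArgumentException("no live region servers");
        }
        // Sorted copy so the random fallback picks from a stable ordering.
        List<String> candidates = new ArrayList<>(liveServers);
        Collections.sort(candidates);

        Map<String, String> plan = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : previousAssignment.entrySet()) {
            String previousServer = entry.getValue();
            String target = liveServers.contains(previousServer)
                    ? previousServer                                      // data is still local there
                    : candidates.get(random.nextInt(candidates.size())); // random fallback
            plan.put(entry.getKey(), target);
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new LinkedHashMap<>();
        previous.put("region-1", "rs-1");
        previous.put("region-2", "rs-2"); // rs-2 did not come back after the restart
        Map<String, String> plan = assign(previous, Set.of("rs-1", "rs-3"), new Random());
        // region-1 stays on rs-1; region-2 lands on rs-1 or rs-3 at random.
        System.out.println(plan);
    }
}
```

Regions that fall back to a random server lose locality until their files are rewritten on the new host, which is the "over time" recovery the paragraph mentions.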

Table Splits

When a MapReduce job runs, it typically uses TableInputFormat to read data from the table. The class plugs into the framework by overriding the required public methods getSplits() and createRecordReader().

Before the job is executed, the framework calls getSplits() to determine how the data is divided into chunks; it sets ...
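The preview breaks off at this point. As a rough illustration of what a getSplits() implementation for a table might compute, the sketch below assumes one split per region that overlaps the scan range, with each split clipped to that range. It is a simplified, self-contained model and not the TableInputFormat source: TableSplitSketch, Range, and the string row keys are hypothetical stand-ins for the real byte-array keys.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of region-aligned splits; not the TableInputFormat source.
class TableSplitSketch {

    // Half-open row-key range [start, end); "" means unbounded on that side.
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    // Returns one split per region that overlaps the scan range [scanStart, scanStop).
    static List<Range> getSplits(List<Range> regions, String scanStart, String scanStop) {
        List<Range> splits = new ArrayList<>();
        for (Range region : regions) {
            boolean scanStartsBeforeRegionEnd = scanStart.isEmpty()
                    || region.end.isEmpty()
                    || scanStart.compareTo(region.end) < 0;
            boolean scanStopsAfterRegionStart = scanStop.isEmpty()
                    || scanStop.compareTo(region.start) > 0;
            if (!scanStartsBeforeRegionEnd || !scanStopsAfterRegionStart) {
                continue; // region lies entirely outside the scan range
            }
            // Clip the region boundaries to the scan range.
            String splitStart = (scanStart.isEmpty() || region.start.compareTo(scanStart) >= 0)
                    ? region.start : scanStart;
            String splitEnd = ((scanStop.isEmpty() || region.end.compareTo(scanStop) <= 0)
                    && !region.end.isEmpty())
                    ? region.end : scanStop;
            splits.add(new Range(splitStart, splitEnd));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Range> regions = List.of(
                new Range("", "g"), new Range("g", "p"), new Range("p", ""));
        // A scan from "c" to "k" touches the first two regions only.
        System.out.println(getSplits(regions, "c", "k")); // prints [[c, g), [g, k)]
        // An unbounded scan yields one split per region.
        System.out.println(getSplits(regions, "", "").size()); // prints 3
    }
}
```

Aligning splits with region boundaries ties this section to the locality discussion earlier: a map task that reads exactly one region can be scheduled on the host whose region server, and local data node, already serve that region.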
