HBASE技術手冊

by Lars George
December 2012
Intermediate to advanced
572 pages
13h 8m
Chinese
GoTop Information, Inc.
MapReduce Integration
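... is sent to another server in the same rack, and the others are sent to a remote rack, assuming your HDFS is deployed across multiple racks. If it is not, the additional replicas are placed on the least-loaded data node in the cluster.

If the replication factor is configured higher, more replicas are placed on distinct hosts. The important point here, however, is that a local replica of the block is now available. For HBase, this means that if a region server stays up long enough (which is what you want), then after a major compaction of all tables, which can be run manually or triggered by a configuration setting, its files are stored locally on the same host. The data node that shares the physical host with the region server holds a replica of all the data that region server needs. Whether you run a scan, a get, or any other operation, you can be sure of getting the best performance.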
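
The effect of a major compaction on locality can be illustrated with a small calculation. The following is a minimal sketch in plain Java, not HBase or HDFS code: the class `LocalityIndex`, its method, and the block and host names are all hypothetical, and the replica layout is supplied by hand rather than queried from a cluster. It computes the fraction of a region's blocks that have a replica on the region server's own host.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of HBase or HDFS. */
final class LocalityIndex {

    /**
     * Returns the fraction of blocks that have at least one replica on the
     * given host. An empty block map yields 0.0.
     */
    static double localityIndex(Map<String, Set<String>> replicaHostsByBlock,
                                String regionServerHost) {
        if (replicaHostsByBlock.isEmpty()) {
            return 0.0;
        }
        long localBlocks = replicaHostsByBlock.values().stream()
                .filter(hosts -> hosts.contains(regionServerHost))
                .count();
        return (double) localBlocks / replicaHostsByBlock.size();
    }

    public static void main(String[] args) {
        // Assumed layout before compaction: only one of two blocks has a replica on rs1.
        Map<String, Set<String>> before = Map.of(
                "blk_1", Set.of("rs1", "rs2", "rs3"),
                "blk_2", Set.of("rs2", "rs3", "rs4"));
        // Assumed layout after a major compaction: rs1 rewrote the files,
        // so every new block has its first replica on rs1.
        Map<String, Set<String>> after = Map.of(
                "blk_3", Set.of("rs1", "rs2", "rs5"),
                "blk_4", Set.of("rs1", "rs3", "rs4"));
        System.out.println("before: " + localityIndex(before, "rs1"));
        System.out.println("after:  " + localityIndex(after, "rs1"));
    }
}
```

A value close to 1.0 means the region server can read nearly all of its data from the data node on the same host.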
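One issue to be aware of is region movement during load balancing or after server failures. In these cases the data is no longer local, but over time it becomes local again as compactions rewrite the files on the new host. The master also takes this into account when the cluster is restarted: it assigns every region to its original region server, and if one of them cannot be matched, it falls back to random region assignment.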
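
The restart behaviour described above can be sketched as follows. This is an illustrative model in plain Java, not the master's actual implementation: the class `RetainAssignment`, its method signature, and the region and server names are hypothetical, and applying the random fallback per region is an assumption about how the fallback works.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of "retain previous assignment, else pick randomly". */
final class RetainAssignment {

    /**
     * Builds an assignment plan: each region goes back to the server that
     * hosted it before the restart if that server is alive; otherwise a
     * live server is chosen at random.
     */
    static Map<String, String> assign(Map<String, String> previousServerByRegion,
                                      List<String> liveServers,
                                      Random random) {
        if (liveServers.isEmpty()) {
            throw new IllegalArgumentException("no live region servers");
        }
        Map<String, String> plan = new HashMap<>();
        for (Map.Entry<String, String> entry : previousServerByRegion.entrySet()) {
            String previous = entry.getValue();
            String target = liveServers.contains(previous)
                    ? previous                                              // keeps data local
                    : liveServers.get(random.nextInt(liveServers.size())); // locality is lost
            plan.put(entry.getKey(), target);
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put("region-a", "rs1");
        previous.put("region-b", "rs2");
        previous.put("region-c", "rs9"); // rs9 did not come back after the restart
        System.out.println(assign(previous, List.of("rs1", "rs2", "rs3"), new Random()));
    }
}
```

Regions that land on a randomly chosen server start with non-local data and regain locality only after their files are rewritten there.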
Table Splits
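A MapReduce job that reads from a table typically uses TableInputFormat, which fits into the framework by overriding the required public methods getSplits() and createRecordReader(). Before the job runs, the framework calls getSplits() to determine how the data is divided into chunks; it sets the ...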
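
To make the idea of splitting a table into chunks concrete, here is a minimal sketch in plain Java. It does not use the HBase API: the class `RegionSplitSketch`, the `Range` type, and the sample keys and hosts are hypothetical. It assumes one split per region, clipped to the scan's row range, which is how TableInputFormat is generally understood to behave by default but is not stated in the excerpt above. Keys are compared as Java strings here, whereas HBase compares raw bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of deriving input splits from region boundaries. */
final class RegionSplitSketch {

    /** A key range [startKey, endKey) served by a host; "" means unbounded. */
    static final class Range {
        final String startKey;
        final String endKey;
        final String host;

        Range(String startKey, String endKey, String host) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.host = host;
        }

        @Override
        public String toString() {
            return "[" + startKey + ", " + endKey + ") @ " + host;
        }
    }

    /** Returns one split per region that overlaps the scan range [scanStart, scanStop). */
    static List<Range> getSplits(List<Range> regions, String scanStart, String scanStop) {
        List<Range> splits = new ArrayList<>();
        for (Range region : regions) {
            boolean startsBeforeStop =
                    scanStop.isEmpty() || region.startKey.compareTo(scanStop) < 0;
            boolean endsAfterStart =
                    region.endKey.isEmpty() || scanStart.compareTo(region.endKey) < 0;
            if (!startsBeforeStop || !endsAfterStart) {
                continue; // region lies entirely outside the scan range
            }
            // Clip the region to the scan range.
            String start = region.startKey.compareTo(scanStart) >= 0
                    ? region.startKey : scanStart;
            String end;
            if (scanStop.isEmpty()) {
                end = region.endKey;
            } else if (region.endKey.isEmpty()) {
                end = scanStop;
            } else {
                end = region.endKey.compareTo(scanStop) <= 0 ? region.endKey : scanStop;
            }
            // The host is kept so a scheduler could place the map task near the data.
            splits.add(new Range(start, end, region.host));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Range> regions = List.of(
                new Range("", "g", "rs1"),
                new Range("g", "p", "rs2"),
                new Range("p", "", "rs3"));
        for (Range split : getSplits(regions, "c", "k")) {
            System.out.println(split);
        }
    }
}
```

In this model the number of splits, and therefore the number of map tasks, follows from how many regions the scan touches.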
Publisher Resources

ISBN: 9789862765968