HBASE技術手冊
by Lars George
December 2012
Intermediate to advanced
572 pages
13h 8m
Chinese
GoTop Information, Inc.
MapReduce Integration
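The map() method in the same class does the actual work. As mentioned, it is called once for every line of the input text file, and each line contains one JSON record. The code builds an HBase row key from an MD5 hash of the line content and then stores the line in a column named data:json.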
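The row key derivation described above can be sketched with the Java standard library alone. This is a minimal illustration, not the book's listing: the class name RowKeySketch, its helper methods, and the sample JSON line are hypothetical, and java.security.MessageDigest stands in for whatever hashing utility the original example uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a fixed-length row key from one input line.
final class RowKeySketch {

    /** Returns the 16-byte MD5 digest of the line, used here as the row key. */
    static byte[] rowKeyFor(String line) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return md5.digest(line.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to ship MD5, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    /** Renders a byte array as lowercase hex, for printing only. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample record; the real input is one JSON document per line.
        String line = "{\"id\": 1, \"name\": \"example\"}";
        byte[] rowKey = rowKeyFor(line);
        System.out.println(rowKey.length + " bytes: " + toHex(rowKey));
    }
}
```

Hashing the whole line gives every distinct record its own fixed-length key, and identical lines map to the same row, so re-importing a file overwrites rows rather than duplicating them.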
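The example relies on the implicit write buffer that the TableOutputFormat class sets up. Each call to context.write() issues an internal table.put() with the given Put instance. When the job completes, TableOutputFormat takes care of calling flushCommits(), which saves whatever data remains in the write buffer.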
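To make the buffering behavior concrete, here is a toy model of a client-side write buffer. It is not HBase code: WriteBufferSketch, its entry-count threshold, and the in-memory list that stands in for the table are all assumptions made for illustration (the real client sizes its buffer in bytes, not entries).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a buffered table client; every name here is hypothetical.
final class WriteBufferSketch {
    private final int flushThreshold;                       // assumed: max buffered entries
    private final List<String> buffer = new ArrayList<>();  // pending mutations
    private final List<String> stored = new ArrayList<>();  // stands in for the table

    WriteBufferSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Buffers one mutation and flushes automatically once the buffer is full. */
    void put(String mutation) {
        buffer.add(mutation);
        if (buffer.size() >= flushThreshold) {
            flushCommits();
        }
    }

    /** Sends everything still buffered to the "table". */
    void flushCommits() {
        stored.addAll(buffer);
        buffer.clear();
    }

    int buffered() { return buffer.size(); }

    int stored() { return stored.size(); }

    public static void main(String[] args) {
        WriteBufferSketch table = new WriteBufferSketch(3);
        for (int i = 0; i < 5; i++) {
            table.put("row-" + i);
        }
        // Two entries are still waiting in the buffer at this point.
        System.out.println("stored=" + table.stored() + " buffered=" + table.buffered());
        table.flushCommits(); // the step the output format performs when the job ends
        System.out.println("stored=" + table.stored() + " buffered=" + table.buffered());
    }
}
```

The point of the model is the final explicit flush: without it, the last partial batch would never reach the table.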
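The map() method writes Put instances to store the input data. You can also write Delete instances to remove data from the target table. This is also why the job's output value class is set to the Writable type instead of the explicit Put class.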
TableOutputFormat can (currently) handle only Put and Delete instances. Passing anything else raises an IOException with the message set to "Pass a Delete or a Put".
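The type check described above can be modeled in a few lines. This sketch does not use the HBase classes: the nested Put and Delete types are stand-ins, and OutputDispatchSketch with its log list is a hypothetical name chosen for illustration. Only the error message is taken from the text.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model of a record writer that accepts only two mutation types.
final class OutputDispatchSketch {
    // Stand-ins for the HBase mutation types; not the real classes.
    static final class Put {
        final String row;
        Put(String row) { this.row = row; }
    }

    static final class Delete {
        final String row;
        Delete(String row) { this.row = row; }
    }

    // Records what would have been sent to the table.
    final List<String> log = new ArrayList<>();

    /** Applies a Put or Delete; rejects every other value type. */
    void write(Object value) throws IOException {
        if (value instanceof Put) {
            log.add("put " + ((Put) value).row);
        } else if (value instanceof Delete) {
            log.add("delete " + ((Delete) value).row);
        } else {
            throw new IOException("Pass a Delete or a Put");
        }
    }

    public static void main(String[] args) {
        OutputDispatchSketch writer = new OutputDispatchSketch();
        try {
            writer.write(new Put("row-1"));
            writer.write(new Delete("row-1"));
            writer.write("not a mutation"); // rejected by the type check
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(writer.log);
    }
}
```

Declaring the accepted value as a broad type and checking it at write time is what lets one job emit both stores and deletes.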
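Finally, note that the job uses only the map phase; no reduce is needed. This is fairly typical of MapReduce jobs that work with HBase: because the data is already stored in sorted tables, or the raw data already has unique keys, you can avoid the more costly sort, shuffle, and reduce phases of the process.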
Data Source
After the raw data has been imported into the table ...
ISBN: 9789862765968