
HBASE技術手冊

Name: HBASE技術手冊
Author: Lars George
ISBN: 9789862765968

by Lars George

December 2012

Intermediate to advanced

572 pages

13h 8m

Chinese

GoTop Information, Inc.


MapReduce Integration


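The map() method in the same class does the actual work. As described, it is called for every line read from the input text file, and each line contains one JSON record. The code builds an HBase row key from the MD5 hash of the line content, then stores the line as-is in a column named data:json.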
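The row key derivation described above can be sketched with the JDK alone. This is a minimal illustration, not the book's listing: the class name RowKeySketch and the helper toHex() are hypothetical, and UTF-8 is assumed as the encoding of the input line.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named here only for illustration.
final class RowKeySketch {

    /** Returns the 16-byte MD5 digest of the line, suitable as a row key. */
    static byte[] md5RowKey(String line) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // Assumption: the input line is encoded as UTF-8.
            return md5.digest(line.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Renders a byte array as lowercase hex, for logging or debugging. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Because the key depends only on the line content, importing the same record twice produces the same row key, so the second write lands on the same row instead of creating a duplicate.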

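In the example, the TableOutputFormat class is configured to use an explicit write buffer. Calling context.write() issues an internal table.put() with the given Put instance. When the job completes, TableOutputFormat takes care of calling flushCommits() to save whatever data remains in the write buffer.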
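The buffering behavior can be pictured with a small toy model. The sketch below is not the HBase client API: BufferedWriterSketch is a hypothetical stand-in, rows are plain strings, and the buffer is limited by entry count, whereas the real write buffer is sized in bytes. It only shows why a final flush is needed once writing ends.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a client-side write buffer; NOT the HBase client API.
final class BufferedWriterSketch implements AutoCloseable {
    private final int maxBuffered;                            // assumption: limit by count, not bytes
    private final List<String> buffer = new ArrayList<>();    // rows not yet sent
    private final List<String> stored = new ArrayList<>();    // stands in for the table

    BufferedWriterSketch(int maxBuffered) {
        this.maxBuffered = maxBuffered;
    }

    /** Buffers a row and flushes automatically once the buffer is full. */
    void put(String row) {
        buffer.add(row);
        if (buffer.size() >= maxBuffered) {
            flushCommits();
        }
    }

    /** Moves everything still in the buffer to the table stand-in. */
    void flushCommits() {
        stored.addAll(buffer);
        buffer.clear();
    }

    /** Final flush at the end of the job, so no buffered rows are lost. */
    @Override
    public void close() {
        flushCommits();
    }

    List<String> stored() {
        return stored;
    }

    int pending() {
        return buffer.size();
    }
}
```

With a limit of two, writing three rows leaves one row pending until close() runs, which mirrors the final flushCommits() call that TableOutputFormat makes when the job completes.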

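The map() method uses Put instances to store the input data; you can also use Delete instances to remove data from the target table. This is also the reason why the job's output value class is set to Writable rather than the explicit Put class.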

TableOutputFormat can (currently) handle only Put and Delete instances. Passing anything else raises an IOException whose message is set to Pass a Delete or a Put.
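The type restriction amounts to a runtime check on the value passed to the writer. The following sketch models that check with hypothetical stand-in classes; Put and Delete here are empty placeholders rather than the HBase types, and the returned strings exist only to make the branches visible.

```java
import java.io.IOException;

// Toy model of the writer's type check; NOT the HBase implementation.
final class MutationCheckSketch {

    // Hypothetical placeholders for the HBase Put and Delete classes.
    static final class Put {}
    static final class Delete {}

    /** Accepts only Put or Delete values and rejects everything else. */
    static String write(Object value) throws IOException {
        if (value instanceof Put) {
            return "put";      // the real writer would store the row here
        }
        if (value instanceof Delete) {
            return "delete";   // the real writer would remove the row here
        }
        throw new IOException("Pass a Delete or a Put");
    }
}
```

Declaring the output type loosely and checking at write time is what allows one job to emit both stores and deletes.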

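Finally, note that the job uses only the map phase and needs no reduce. This is fairly typical for MapReduce jobs combined with HBase: because the data is already stored in sorted tables, or the raw data already has unique keys, you can avoid the more costly sort, shuffle, and reduce phases of the process.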

Data Source

After the raw data has been imported into the table ...
