HBase Essentials

Name: HBase Essentials
Author: Nishant Garg
ISBN: 9781783987245

November 2014

Beginner to intermediate

164 pages

3h 32m

English

Packt Publishing

Table of Contents
HBase Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
    Support files, eBooks, discount offers, and more
    Why subscribe?
    Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for

Conventions
Reader feedback
Customer support
    Downloading the example code
    Errata
    Piracy
    Questions
1. Introducing HBase
The world of Big Data
The origin of HBase
Use cases of HBase
Installing HBase
    Installing Java 1.7
    The local mode
    The pseudo-distributed mode
    The fully distributed mode
Understanding HBase cluster components
Start playing
Summary
2. Defining the Schema
Data modeling in HBase
Designing tables
Accessing HBase
    Establishing a connection
    CRUD operations
    Writing data
    Reading data
    Updating data
    Deleting data
Summary
3. Advanced Data Modeling
Understanding keys
HBase table scans
Implementing filters
    Utility filters
    Comparison filters
    Custom filters
Summary
4. The HBase Architecture
Data storage
    HLog (the write-ahead log – WAL)
    HFile (the real data storage file)
Data replication
Securing HBase
    Enabling authentication
    Enabling authorization
    Configuring REST clients
HBase and MapReduce
    Hadoop MapReduce
    Running MapReduce over HBase
    HBase as a data source
    HBase as a data sink
    HBase as a data source and sink
Summary
5. The HBase Advanced API
Counters
    Single counters
    Multiple counters
Coprocessors
    The observer coprocessor
    The endpoint coprocessor
The administrative API
    The data definition API
    Table name methods
    Column family methods
    Other methods
    The HBaseAdmin API
Summary
6. HBase Clients
The HBase shell
    Data definition commands
    Data manipulation commands
    Data-handling tools
Kundera – object mapper
    CRUD using Kundera
    Query HBase using Kundera
    Using filters within query
REST clients
    Getting started
    The plain format
    The XML format
    The JSON format (defined as a key-value pair)
    The REST Java client
The Thrift client
Getting started
The Hadoop ecosystem client
Hive
Summary
7. HBase Administration
Cluster management
    The Start/stop HBase cluster
    Adding nodes
    Decommissioning a node
    Upgrading a cluster
    HBase cluster consistency
    HBase data import/export tools
    Copy table
Cluster monitoring
    The HBase metrics framework
    Master server metrics
    Region server metrics
    JVM metrics
    Info metrics
    Ganglia
    Nagios
    JMX
    File-based monitoring
Performance tuning
    Compression
    Available codecs
    Load balancing
    Splitting regions
    Merging regions
    MemStore-local allocation buffers
    JVM tuning
    Other recommendations
Troubleshooting
Summary
Index

Content preview from HBase Essentials

Cluster monitoring

In a large distributed system, the administrator has the difficult task of tracking the overall status of the system as well as the state of each individual server. When a failure occurs, it is difficult to work out when and how the problem started just by looking at a handful of raw logfiles.

Administrators of an HBase cluster (another distributed system running on top of Hadoop) need to continuously ensure that the cluster is up and operating as expected. To help with this task, HBase exposes a large number of metrics that describe the current status of the cluster and its servers.
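
Metrics like these only help once something reads them and flags unusual values. The following is a minimal Python sketch of that idea, not code from the book. It parses a JSON document shaped like the output of a JMX-over-HTTP endpoint and turns it into a few warnings. The sample payload, the bean and attribute names, the `heap_ratio_limit` threshold, and the helper functions `find_bean` and `cluster_warnings` are all assumptions made for illustration.

```python
import json

# Illustrative payload only: the "beans" list layout mirrors what a
# JMX-over-HTTP JSON endpoint typically returns, but the bean names,
# attribute names, and values here are assumptions, not real cluster output.
SAMPLE_METRICS_JSON = """
{
  "beans": [
    {"name": "Hadoop:service=HBase,name=Master,sub=Server",
     "numRegionServers": 3,
     "numDeadRegionServers": 1},
    {"name": "Hadoop:service=HBase,name=JvmMetrics",
     "MemHeapUsedM": 512.5,
     "MemHeapMaxM": 1024.0}
  ]
}
"""


def find_bean(payload, name_fragment):
    """Return the first bean whose name contains name_fragment, else None."""
    for bean in payload.get("beans", []):
        if name_fragment in bean.get("name", ""):
            return bean
    return None


def cluster_warnings(payload, heap_ratio_limit=0.9):
    """Turn raw metric values into a short list of human-readable warnings."""
    warnings = []

    # Hypothetical check: any dead region server is worth reporting.
    master = find_bean(payload, "name=Master")
    if master is None:
        warnings.append("no master metrics found")
    elif master.get("numDeadRegionServers", 0) > 0:
        warnings.append(
            "%d dead region server(s)" % master["numDeadRegionServers"]
        )

    # Hypothetical check: heap usage above the given fraction of the maximum.
    jvm = find_bean(payload, "name=JvmMetrics")
    if jvm is not None and jvm.get("MemHeapMaxM", 0) > 0:
        ratio = jvm.get("MemHeapUsedM", 0) / jvm["MemHeapMaxM"]
        if ratio > heap_ratio_limit:
            warnings.append("heap usage at %.0f%%" % (ratio * 100))

    return warnings


if __name__ == "__main__":
    metrics = json.loads(SAMPLE_METRICS_JSON)
    for line in cluster_warnings(metrics) or ["cluster looks healthy"]:
        print(line)
```

In a real deployment the payload would come from the cluster rather than a string constant, and the thresholds would be tuned to the workload. Graphing and monitoring tools automate this kind of check.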

The available solutions can be grouped into graphing solutions, monitoring solutions, or tools that do both. Here, graphing solutions, such as Ganglia ...
