HBase High Performance Cookbook

Name: HBase High Performance Cookbook
Author: Ruchir Choudhry
ISBN: 9781783983063

January 2017

Intermediate to advanced

350 pages

7h 8m

English

Packt Publishing

HBase High Performance Cookbook
Table of Contents
HBase High Performance Cookbook
Credits
About the Author
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more · Why Subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready · How to do it… · How it works… · There's more… · See also
Conventions
Reader feedback
Customer support
Downloading the example code · Errata · Piracy · Questions
1. Configuring HBase
Introduction
Configuring and deploying HBase
Getting ready · How to do it… · How it works… · There's more… · See Also
Using the filesystem
Getting ready · How to do it… · The HBase setup · Starting the cluster · Validating the cluster · How it works… · There is more… · See also
Administering clusters
Getting ready · How to do it… · Log dump · Metrics dump · How it works… · See also
Managing clusters
Getting ready · gmond · gmetad · gweb · How to do it… · Ganglia setup · How it works… · There is more… · See also
2. Loading Data from Various DBs
Introduction
Extracting data from Oracle
Getting ready · How to do it… · How it works… · There's more… · See also…
Loading data using Oracle Big data connector
Getting Ready · How to do it… · How it works… · There's more… · See also…
Bulk utilities
Getting ready... · How to do it… · How it works… · See also…
Using Hive with Apache HBase
Getting ready · How to do it… · How it works… · See also…
Using Sqoop
Getting ready · How to do it… · How it works… · There's more… · Data compression · Parallelism · See also…
3. Working with Large Distributed Systems Part I
Introduction
Scaling elastically or Auto Scaling with built-in fault tolerance
How to do it… · How it works… · There's more… · See also
Auto Scaling HBase using AWS
Getting Ready · How to do it… · There's more… · See also
Works on different VM/physical, cloud hardware
Getting ready · How to do it… · There's more… · See also
4. Working with Large Distributed Systems Part II
Introduction · Seek versus transfer · The log-structured merge-tree · Date Read · Data Delete · Storage
Read path
How to do it… · There's more…
Write Path
How to do it… · How it works… · There's more… · Transactions (ACID) and multiversion concurrency control (MVCC)
Snappy
How to do it… · How it works… · There's more…
LZO compression
How to do it… · How it works... · There's more…
LZ4 compressor
How to do it… · There's more…
Replication
How to do it… · Deploying Master-Master or Cyclic Replication · How it works... · There's more… · Disabling Replication at the Peer Level
5. Working with Scalable Structure of tables
Introduction
HBase data model part 1
How to do it… · How it works… · There's more…
HBase data model part 2
How to do it… · How it works… · There's more…
How HBase truly scales on key and schema design
How to do it… · See also
6. HBase Clients
Introduction
HBase REST and Java Client
How to do it… · How it works… · There's more…
Working with Apache Thrift
How to do it… · How it works… · There's more…
Working with Apache Avro
How to do it… · How it works… · There's more…
Working with Protocol buffer
How to do it… · There's More…
Working with Pig and using Shell
How to do it… · How it works… · There's more…
7. Large-Scale MapReduce
Introduction · Getting Ready… · How to do it… · How it works… · There's more… · When not to use MapReduce · See also…
8. HBase Performance Tuning
Introduction
Working with infrastructure/operating systems
Getting ready… · How to do it…
Working with Java virtual machines
Getting ready… · How to do it… · See also
Changing the configuration of components
Getting ready… · How to do it… · See also
Working with HDFS
How to do it… · See also…
9. Performing Advanced Tasks on HBase
Machine learning using Hbase
Getting ready · How to do it… · RDBMS · A plain Java program (static) · There's more…
Real-time data analysis using Hbase and Mahout
How to do it… · How it works... · There's More…
Full text indexing using Hbase
Getting ready · How to do it… · How it works… · There's more…
10. Optimizing Hbase for Cloud
Introduction
Configuring Hbase for the Cloud
How to do it… · How it works…
Connecting to an Hbase cluster using the command line
How to do it… · How it works…
Backing up and restoring Hbase
How to do it… · How it works…
Terminating an HBase cluster
How to do it…
Accessing HBase data with hive
How to do it…
Viewing the Hbase user interface
How to do it…
Monitoring HBase with CloudWatch
Monitoring Hbase with Ganglia
How it works… · There is more…
11. Case Study
Introduction
Configuring Lily Platform
How to do it… · There's more…
Integrating elastic search with Hbase
Configuring
How to do it… · There's more…
Index

Overview

"HBase High Performance Cookbook" is a guide to optimizing, scaling, and tuning HBase systems. It covers configuring HBase clusters, designing scalable table structures, and performance tuning, and it offers practical advice and strategies for each. The material is organized as recipes that you can follow to build your HBase expertise.

What this Book will help me do

Understand how to configure HBase for optimal performance, improving your data system's efficiency.
Learn to design table structures to maximize scalability and functionality in HBase.
Gain skills in performing CRUD operations and using advanced features like MapReduce within HBase.
Discover practices for integrating HBase with other technologies such as Elasticsearch.
Master the steps involved in setting up and optimizing HBase in cloud environments for enhanced performance.

Author(s)

Ruchir Choudhry is a seasoned data management professional with extensive experience in distributed database systems. He possesses deep expertise in HBase, Hadoop, and other big data technologies. His practical and engaging writing style aims to demystify complex technical topics, making them accessible to developers and architects alike.

Who is it for?

This book is tailored for developers and system architects looking to deepen their understanding of HBase. Whether you are experienced with other NoSQL databases or are new to HBase, this book provides extensive practical knowledge. It is ideal for professionals who work on big data applications or who want to optimize and scale their database systems effectively.
