Data Lake for Enterprises

by Vivek Mishra, Tomcy John, Pankaj Misra
May 2017
Beginner to intermediate
596 pages
15h 2m
English
Packt Publishing

Apache HBase

Apache HBase is a data storage component that runs on top of Hadoop and uses HDFS as its underlying storage. HBase is a non-relational (NoSQL), distributed database that belongs to the column-family-oriented category. It is well suited to random reads and batch operations, and it can handle large datasets with millions of rows and columns.

Apache HBase is modeled after Google’s Bigtable and is considered one of the best implementations of it in the industry. Internally, it is implemented as a sorted map.
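
To make the sorted map idea concrete, the following is a minimal in-memory sketch, not HBase code. The class `SortedMapSketch` and its methods are hypothetical names chosen for illustration. It assumes string keys and values for readability (HBase itself stores and compares raw bytes) and models a table as nested sorted maps: row key, then column (written as `family:qualifier`), then timestamp with the newest version first.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration class; it is not part of the HBase API.
class SortedMapSketch {

    // rowKey -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Store one version of a cell; the nested maps are created on demand.
    void put(String rowKey, String column, long timestamp, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(column,
                    k -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    // Random read: return the newest version of a cell, or null if it is absent.
    String get(String rowKey, String column) {
        NavigableMap<String, NavigableMap<Long, String>> row = rows.get(rowKey);
        if (row == null) {
            return null;
        }
        NavigableMap<Long, String> versions = row.get(column);
        if (versions == null || versions.isEmpty()) {
            return null;
        }
        return versions.firstEntry().getValue();
    }

    // Range scan over sorted row keys: startRow inclusive, stopRow exclusive.
    NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> scan(
            String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false);
    }

    public static void main(String[] args) {
        SortedMapSketch table = new SortedMapSketch();
        table.put("user#010", "info:name", 1L, "Carol");
        table.put("user#001", "info:name", 1L, "Alice");
        table.put("user#001", "info:name", 2L, "Alicia"); // newer version of the same cell
        table.put("user#002", "info:name", 1L, "Bob");

        System.out.println(table.get("user#001", "info:name"));
        // Rows come back in key order regardless of insertion order.
        System.out.println(table.scan("user#001", "user#010").keySet());
    }
}
```

Because the rows are kept in key order, a lookup by row key and a scan over a contiguous key range are both cheap, which is one way to see why the sorted map design suits random reads and batch-style scans.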

HBase has multiple APIs, the main one being the Java API. In addition, it offers REST (for HTTP access) and Thrift (for access from other programming languages) APIs.
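
As a rough illustration of the HTTP access path, the sketch below builds (but does not send) a request that reads a single row through an HBase REST gateway, using only the JDK's `java.net.http` classes (Java 11 or later). The class `HBaseRestSketch`, the base URL `http://localhost:8080`, the table `users`, and the row key `user#001` are all assumptions made for the example; adjust them to wherever a REST server is actually running.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; it is not part of any HBase client library.
class HBaseRestSketch {

    // Build a GET request for one row: <baseUrl>/<table>/<rowKey>.
    // The Accept header asks the gateway for a JSON response.
    static HttpRequest buildGetRow(String baseUrl, String table, String rowKey) {
        // URLEncoder targets form encoding, so spaces become '+';
        // swap them for %20 so the key is valid inside a URL path.
        String encodedKey = URLEncoder.encode(rowKey, StandardCharsets.UTF_8)
                .replace("+", "%20");
        URI uri = URI.create(baseUrl + "/" + table + "/" + encodedKey);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Assumed values: a local REST gateway and a table named "users".
        HttpRequest request = buildGetRow("http://localhost:8080", "users", "user#001");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute the request against a running gateway, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. In the JSON representation the row keys, column names, and cell values are Base64-encoded, so they need decoding before use.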

HBase is quite useful for handling use cases dealing with real-time ...



Publisher Resources

ISBN: 9781787281349