Apart from the client API used for data manipulation, HBase also exposes a data definition-like API. This is similar to the separation of DDL and DML found in RDBMSes. First we will look at the classes this DDL-like API uses to define data schemas, and then at the API calls that make use of these classes, for example, to create new HBase tables. These calls, together with other operator functions, comprise the HBase administration API and are described below.
Creating a table in HBase implicitly involves the definition of a table schema, as well as the schemas for all contained column families. These schemas define the pertinent characteristics of how, and when, the data inside the table and its columns is ultimately stored. On a higher level, every table is part of a namespace, so we will start with the data structures that define namespaces.
Namespaces were introduced into HBase to solve the problem of organizing many tables. Before this feature, you had a flat list of all tables, including the system catalog tables. At scale, with hundreds of tables, this flat list became difficult to manage. With namespaces you can organize your tables into groups, where related tables can be handled together. On top of this, namespaces allow the further abstraction of generic concepts, such as security. You can define access control on the namespace level to quickly apply the rules to all contained tables.
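To make the grouping idea concrete, here is a minimal sketch in plain Java. It is a toy model, not the HBase client API: the class `NamespaceSketch` and all of its methods are hypothetical names made up for this illustration. It only assumes the `namespace:table` notation for qualified table names and shows how a rule granted once on a namespace applies to every table inside it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of namespaces (hypothetical, NOT the HBase client API). */
final class NamespaceSketch {
    // Table names grouped by the namespace they belong to.
    private final Map<String, Set<String>> tablesByNamespace = new HashMap<>();
    // Users that were granted access, keyed by namespace.
    private final Map<String, Set<String>> grantsByNamespace = new HashMap<>();

    /** Returns the namespace part of a "namespace:table" name. */
    static String namespaceOf(String qualifiedName) {
        int sep = qualifiedName.indexOf(':');
        if (sep <= 0 || sep == qualifiedName.length() - 1) {
            throw new IllegalArgumentException(
                "expected namespace:table, got " + qualifiedName);
        }
        return qualifiedName.substring(0, sep);
    }

    /** Registers a table under the namespace encoded in its name. */
    void addTable(String qualifiedName) {
        String namespace = namespaceOf(qualifiedName);
        String table = qualifiedName.substring(namespace.length() + 1);
        tablesByNamespace
            .computeIfAbsent(namespace, k -> new TreeSet<>())
            .add(table);
    }

    /** Lists the tables of one namespace, sorted by name. */
    Set<String> tablesIn(String namespace) {
        return tablesByNamespace.getOrDefault(
            namespace, Collections.<String>emptySet());
    }

    /** Grants a user access to a whole namespace in one step. */
    void grant(String namespace, String user) {
        grantsByNamespace
            .computeIfAbsent(namespace, k -> new HashSet<>())
            .add(user);
    }

    /** A table is accessible if its namespace was granted to the user. */
    boolean canAccess(String user, String qualifiedName) {
        return grantsByNamespace
            .getOrDefault(namespaceOf(qualifiedName),
                          Collections.<String>emptySet())
            .contains(user);
    }

    public static void main(String[] args) {
        NamespaceSketch model = new NamespaceSketch();
        model.addTable("sales:orders");
        model.addTable("sales:customers");
        model.addTable("logs:events");

        // One grant on the namespace covers both "sales" tables.
        model.grant("sales", "alice");

        System.out.println(model.tablesIn("sales"));
        System.out.println(model.canAccess("alice", "sales:customers"));
        System.out.println(model.canAccess("alice", "logs:events"));
    }
}
```

The point of the sketch is the lookup in `canAccess`: the rule is stored once per namespace rather than once per table, which is what makes namespace-level access control quick to apply to many related tables.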
HBase creates two ...