Chapter 4. Client API: Advanced Features
Now that you understand the basic client API, we will discuss the advanced features that HBase offers to clients.
Filters
HBase filters are a powerful feature that can greatly enhance your effectiveness when working with data stored in tables. You will find predefined filters, already provided by HBase for your use, as well as a framework you can use to implement your own. You will now be introduced to both.
Introduction to Filters
The two prominent read functions for HBase are get() and scan(), which support
direct access to data and the use of a start and end key, respectively. You can limit
the data retrieved by progressively adding more limiting selectors to
the query. These include column families, column qualifiers, timestamps
or time ranges, as well as the number of versions.
While this gives you control over what is
included, it lacks more fine-grained features, such as the selection of
keys or values based on regular expressions. Both the Get and Scan classes support
filters for exactly this reason: what cannot be
solved with the provided API functionality for narrowing down row keys, column keys,
or values can be achieved with filters.
The base interface is aptly named Filter, and HBase supplies a list of concrete
classes that you can use without doing any
programming.
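To make the idea concrete, the following is a minimal, self-contained sketch of a key range scan combined with a regular expression filter on row keys. It deliberately does not use the HBase client library: ToyFilter, ToyServer, rowKeyMatches(), and demo() are hypothetical names invented for this illustration, and the in-memory sorted map only stands in for a table. In HBase itself you would typically reach for one of the supplied classes, for example RowFilter combined with RegexStringComparator, and attach it to a Scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

public class ToyFilterDemo {

    /** Hypothetical stand-in for a filter: decides whether a row is returned. */
    public interface ToyFilter {
        boolean include(String rowKey, String value);
    }

    /** Toy "server" holding rows sorted by row key (not an HBase class). */
    public static class ToyServer {
        private final TreeMap<String, String> rows = new TreeMap<>();

        public void put(String rowKey, String value) {
            rows.put(rowKey, value);
        }

        /**
         * Scans the key range [startKey, stopKey) and applies the filter
         * to every row before anything is handed back to the caller.
         */
        public List<String> scan(String startKey, String stopKey, ToyFilter filter) {
            List<String> result = new ArrayList<>();
            for (Map.Entry<String, String> e : rows.subMap(startKey, stopKey).entrySet()) {
                if (filter.include(e.getKey(), e.getValue())) {
                    result.add(e.getKey());
                }
            }
            return result;
        }
    }

    /** Builds a filter that keeps only rows whose key fully matches the regex. */
    public static ToyFilter rowKeyMatches(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return (rowKey, value) -> pattern.matcher(rowKey).matches();
    }

    /** Loads four sample rows and runs a filtered range scan over them. */
    public static List<String> demo() {
        ToyServer server = new ToyServer();
        server.put("row-01", "a");
        server.put("row-05", "b");
        server.put("row-15", "c");
        server.put("row-25", "d");
        // The start/stop keys narrow the range; the filter narrows it further.
        return server.scan("row-00", "row-20", rowKeyMatches("row-.5"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [row-05, row-15]
    }
}
```

The start and stop keys exclude row-25 before the filter is ever consulted, while the regular expression removes row-01 from the remaining range. This is the kind of selection that key ranges and column selectors alone cannot express.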
You can, on the other hand, extend the
Filter classes to implement your own
requirements. All filters are applied on the server side,
a technique also called predicate pushdown. This ...