Appendix F. HBase Versus Bigtable

Overall, HBase implements close to all of the features described in Chapter 1. Where it differs, it may have to because either the Bigtable paper was not very clear to begin with, or it relies on other open source projects to provide various services and those simply work differently.

HBase stores timestamps in milliseconds—as opposed to Bigtable, which uses microseconds. This is not much of an issue and can possibly be attributed to C and Java having different preferred timer resolutions.

While we have not yet addressed the specific details, it should be pointed out that both also use different compression algorithms. HBase uses those supplied in Java, but can also use LZO (with a bit of work; we will look into this later).[132] Bigtable has a two-phase compression using BMDiff and Zippy.

HBase has coprocessors that are different from what Sawzall, the scripting language used in Bigtable to filter or aggregate data, or the Bigtable Coprocessor framework,[133] provides. The details on Google’s coprocessor implementation are rather sketchy, so if there are more differences, they are unknown. On the other hand, HBase has support for server-side filters that help reduce the amount of data being moved from the server to the client.

HBase does primarily work with the Hadoop Distributed File System (HDFS), while Bigtable uses GFS. But HBase can also work on other filesystems thanks to the pluggable FileSystem class provided by Hadoop. There are implementations ...

Get HBase: The Definitive Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.