HBase is the NoSQL datastore of the Hadoop ecosystem, and integration with such a datastore is essential for Spark. Spark can read data from an HBase table or write to one. In fact, Spark supports HBase very well via the
If you want to experiment with HBase, you can install a standalone local version, as described in http://hbase.apache.org/book.html#quickstart. It works from the local filesystem, so there is no need for Hadoop or HDFS. However, this setup is not suitable for production.
Before working through the examples, let's create a table with three records in HBase.
I created a test table with three records via the HBase shell, as shown in the following ...