Replication

As a system grows and becomes more distributed, the need for data replication grows with it. HBase replication works on the core principle of moving transactional data from one cluster to another. Usually, the master cluster initiates the push to the slave cluster. The transfer is usually asynchronous, which minimizes the overhead on the master system. The data is usually shipped in batches, and the size of each batch can be controlled through configuration.
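As a rough illustration of how the batch size might be controlled, the fragment below shows replication-related properties that can be placed in hbase-site.xml on the master cluster. This is a minimal sketch, not the recipe's own configuration: the values shown are the commonly documented defaults rather than tuned recommendations, and the assumption that `hbase.replication` must be set explicitly applies only to older HBase releases (newer ones enable replication by default).

```xml
<!-- Sketch only: replication settings for hbase-site.xml. -->
<!-- Values are illustrative defaults, not recommendations. -->
<configuration>
  <!-- Assumption: an older HBase release where replication must be enabled explicitly. -->
  <property>
    <name>hbase.replication</name>
    <value>true</value>
  </property>
  <!-- Upper bound, in bytes, on the size of one shipped batch (64 MB here). -->
  <property>
    <name>replication.source.size.capacity</name>
    <value>67108864</value>
  </property>
  <!-- Upper bound on the number of entries in one shipped batch. -->
  <property>
    <name>replication.source.nb.capacity</name>
    <value>25000</value>
  </property>
</configuration>
```

A batch is shipped once either limit is reached, so lowering these values trades throughput for smaller, more frequent transfers to the slave cluster.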

The benefits of HBase replication are as follows:

  • Data aggregation
  • Online data ingestion combined with offline data analysis
  • Geographic data distribution across multiple data centres
  • Backup and disaster recovery

How to do it…

  1. Let's edit hbase-site.xml:
