Data replication copies data from one cluster to another by replicating writes as the first cluster receives them. HBase achieves intercluster replication, including between geographically distant clusters, by shipping logs asynchronously. Data replication serves as a disaster recovery solution and also provides higher availability at the HBase layer.
HBase replication uses a master-push pattern, which makes it easy to keep track of what is currently being replicated because each region server has its own write-ahead log (WAL). One master cluster can replicate to any number of slave clusters. Each region server takes part by replicating its own batches (the default size is 64 MB) of edit records from its WAL.
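The batching idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HBase's actual implementation: the names WalEdit, batch_edits, and replicate are hypothetical, slave clusters are modeled as plain Python lists, and the only detail taken from the text is the 64 MB default batch size. It shows one region server reading its own log in order, cutting the edits into size-capped batches, and pushing each batch to every slave cluster.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# 64 MB, the default batch size mentioned in the text.
BATCH_CAPACITY_BYTES = 64 * 1024 * 1024


@dataclass
class WalEdit:
    """Hypothetical stand-in for one write-ahead log edit record."""
    row_key: str
    size_bytes: int


def batch_edits(
    edits: Iterable[WalEdit], capacity: int = BATCH_CAPACITY_BYTES
) -> Iterator[List[WalEdit]]:
    """Group edits, in log order, into batches of at most `capacity` bytes.

    An edit larger than `capacity` is emitted alone in its own batch.
    """
    batch: List[WalEdit] = []
    batch_size = 0
    for edit in edits:
        # Close the current batch if this edit would push it over capacity.
        if batch and batch_size + edit.size_bytes > capacity:
            yield batch
            batch, batch_size = [], 0
        batch.append(edit)
        batch_size += edit.size_bytes
    if batch:
        yield batch


def replicate(
    edits: Iterable[WalEdit],
    peers: List[List[WalEdit]],
    capacity: int = BATCH_CAPACITY_BYTES,
) -> int:
    """Push every batch to every peer (assumed model: a peer is a list).

    Returns the number of batches shipped.
    """
    shipped = 0
    for batch in batch_edits(edits, capacity):
        for peer in peers:
            peer.extend(batch)
        shipped += 1
    return shipped
```

For example, calling replicate(log, [slave_a, slave_b]) with two empty lists leaves both lists holding the same edits in log order. Because every region server batches only its own log, no coordination between servers is needed in this model, which mirrors the point made above about tracking what is being replicated.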
The master-push pattern used for ...