Chapter 5. Clustering

In Chapter 4, we looked at load balancing. Load balancing is very useful, but on its own it may not give you the scale you need. Sometimes you need to partition your data across multiple shards. Each shard lives on a CouchDB node and contains a subset of your data, and each node can host one or more shards. CouchDB does not natively support this form of clustering. However, third-party tools such as BigCouch, Lounge, and Pillow let you create a cluster of CouchDB nodes.
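To make the idea of shards concrete, here is a minimal sketch of how a document ID could be mapped to a shard, and a shard to a node. It is an illustration only: the shard names, node URLs, and the hash-and-modulo scheme are all assumptions, and it is not meant to describe how BigCouch, Lounge, or Pillow actually place documents.

```python
import hashlib

# Hypothetical cluster layout: shard name -> URL of the CouchDB node hosting it.
# Two shards per node, to show that a node can hold more than one shard.
SHARDS = {
    "shard0": "http://couch-a.example.com:5984",
    "shard1": "http://couch-a.example.com:5984",
    "shard2": "http://couch-b.example.com:5984",
    "shard3": "http://couch-b.example.com:5984",
}


def shard_for(doc_id, shard_names=None):
    """Pick a shard for a document by hashing its ID (simple modulo scheme)."""
    names = sorted(shard_names if shard_names is not None else SHARDS)
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(names)
    return names[index]


def node_for(doc_id):
    """Return the URL of the node that hosts the document's shard."""
    return SHARDS[shard_for(doc_id)]
```

The same document ID always maps to the same shard, so reads and writes for a document go to the same node. A plain modulo scheme like this one reassigns most documents when the number of shards changes, which is one reason to rely on a dedicated clustering tool rather than hand-rolled routing.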

Note

An alternative to automatic partitioning is to partition your documents manually into separate databases by document type. The downside of this approach is that a view can only include documents from a single database. If your documents never need to be queried in the same view, putting them in separate databases lets you use CouchDB as-is, without a third-party tool.
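Manual partitioning by document type can be sketched as a small routing step in application code. The `type` field, the database names, and the helper functions below are hypothetical; they show one way an application might decide which database each document belongs in before writing it.

```python
# Hypothetical mapping from a document's "type" field to a database name.
DATABASE_BY_TYPE = {
    "customer": "customers",
    "order": "orders",
    "product": "products",
}


def database_for(doc):
    """Return the name of the database a document should be stored in."""
    doc_type = doc.get("type")
    if doc_type not in DATABASE_BY_TYPE:
        raise ValueError(f"no database configured for document type {doc_type!r}")
    return DATABASE_BY_TYPE[doc_type]


def partition(docs):
    """Group documents into per-database batches, preserving input order."""
    batches = {}
    for doc in docs:
        batches.setdefault(database_for(doc), []).append(doc)
    return batches
```

For example, `partition([{"type": "order", "total": 25}, {"type": "customer", "name": "Ann"}])` yields one batch for `orders` and one for `customers`. Because each batch lands in its own database, a view defined in `orders` cannot see documents stored in `customers`.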

BigCouch

BigCouch is a fork of CouchDB that introduces additional clustering capabilities. It is available under an open source license and is maintained by Cloudant. For the most part, you can interact with a BigCouch node exactly the same way you would interact with a CouchDB node. BigCouch introduces some new API endpoints that are needed to manage its clustering features.
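Since a BigCouch node accepts most of the same requests as a CouchDB node, a client can build an ordinary CouchDB document request and point it at a BigCouch address. The sketch below only constructs the request and does not send it. The host, port, database name, and document are assumptions made for illustration, and `build_put_document` is a hypothetical helper, not part of any library.

```python
import json
import urllib.parse
import urllib.request

# Assumed address of a node; the text does not specify a host or port.
BIGCOUCH_URL = "http://localhost:5984"


def build_put_document(base_url, db, doc_id, doc):
    """Build (but do not send) a standard CouchDB PUT /{db}/{docid} request."""
    # Percent-encode the ID so characters such as "/" stay inside one path segment.
    path = f"/{urllib.parse.quote(db, safe='')}/{urllib.parse.quote(doc_id, safe='')}"
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(doc).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_put_document(
    BIGCOUCH_URL, "orders", "order-1001", {"type": "order", "total": 25}
)
```

Passing `request` to `urllib.request.urlopen` would send it. The same request could be pointed at a plain CouchDB node by changing only the base URL.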

Warning

BigCouch is under active development. As of this writing, the current version of BigCouch is 0.3. Features and capabilities may change in future releases.
