Chapter 7. Database Sharding Pattern

This advanced pattern focuses on horizontally scaling data through sharding.

To shard a database is to start with a single database and then divvy up its data across two or more databases (shards). Each shard has the same database schema as the original database. Most data is distributed such that each row appears in exactly one shard. The combined data from all shards is the same as the data from the original database.

The collection of shards is a single logical database, even though there are now multiple physical databases involved.

Context

The Database Sharding Pattern is effective in dealing with the following challenges:

  • Application database query volume exceeds the query capability of a single database node resulting in unacceptable response times or timeouts

  • Application database update volume exceeds the transactional capability of a single database node resulting in unacceptable response times or timeouts

  • Application database network bandwidth needs exceed the bandwidth available to a single database node resulting in unacceptable response times or timeouts

  • Application database storage requirements exceed the capacity of a single database node

This chapter assumes sharding is done with a database service that offers integrated sharding support. Without integrated sharding support, sharding happens entirely in the application layer, which is substantially more complex.

Cloud Significance

Historically, sharding has not been a popular pattern because ...

Get Cloud Architecture Patterns now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.