Chapter 7. Database Sharding Pattern
This advanced pattern focuses on horizontally scaling data through sharding.
To shard a database is to start with a single database and then divvy up its data across two or more databases (shards). Each shard has the same database schema as the original database. Most data is distributed such that each row appears in exactly one shard. The combined data from all shards is the same as the data from the original database.
The collection of shards is a single logical database, even though there are now multiple physical databases involved.
The Database Sharding Pattern is effective in dealing with the following challenges:
Application database query volume exceeds the query capability of a single database node resulting in unacceptable response times or timeouts
Application database update volume exceeds the transactional capability of a single database node resulting in unacceptable response times or timeouts
Application database network bandwidth needs exceed the bandwidth available to a single database node resulting in unacceptable response times or timeouts
Application database storage requirements exceed the capacity of a single database node
This chapter assumes sharding is done with a database service that offers integrated sharding support. Without integrated sharding support, sharding happens entirely in the application layer, which is substantially more complex.
Historically, sharding has not been a popular pattern because ...