Cassandra is known as a distributed database. A Cassandra cluster is a collection of nodes (individual instances running Cassandra) all working together to serve the same dataset. Nodes can also be grouped together into logical data centers. This is useful for providing data locality for an application or service layer, as well as for working with Cassandra instances that have been deployed in different regions of a public cloud.
Cassandra clusters can scale to suit both expanding disk footprint and higher operational throughput. Essentially, this means that each cluster becomes responsible for a smaller percentage of the total data size. Assuming that the 500 GB disks of a six node cluster (RF of three) start to reach their maximum ...