Chapter 2. Data Distribution

Cassandra's peer-to-peer architecture and scalability characteristics are directly tied to its data placement scheme. Cassandra employs a distributed hash table, a data structure that allows data to be stored and retrieved by key quickly and efficiently. Consistent hashing is at the core of this strategy, as it enables every node to determine where data resides in the cluster without complicated coordination mechanisms.
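To make the idea of consistent hashing concrete, here is a minimal sketch of a token ring in Python. It is an illustration only, not Cassandra's implementation: the `TokenRing` class, the 32-bit ring size, the use of MD5 as the hash function, and the node names are all assumptions chosen for brevity. Each node gets a single token, and a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around at the end of the ring.

```python
import bisect
import hashlib


class TokenRing:
    """Toy consistent-hash ring with one token per node (hypothetical helper)."""

    RING_SIZE = 2 ** 32  # assumed ring size, chosen only to keep the numbers small

    def __init__(self, nodes):
        # Assumption: derive each node's token by hashing its name.
        # A real cluster assigns tokens by other means.
        self._ring = sorted((self._hash(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @classmethod
    def _hash(cls, value):
        # MD5 is a stand-in hash function for this sketch.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % cls.RING_SIZE

    def owner(self, key):
        """Return the node that owns the given key."""
        token = self._hash(key)
        # First node token that is >= the key's token.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0  # past the last token: wrap around to the first node
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = TokenRing(["node-a", "node-b", "node-c"])
    for key in ("user:1", "user:2", "user:3"):
        print(key, "->", ring.owner(key))
```

Because the owner is computed from the key and the sorted token list alone, any process holding the same token list arrives at the same answer without asking its peers. Adding a node to the ring moves only the keys that fall into the new node's range; every other key keeps its previous owner.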

In this chapter, we'll cover the following topics:

The fundamentals of distributed hash tables
Cassandra's consistent hashing mechanism
Token assignment, both manual and using vnodes
The implications of Cassandra's partitioner implementations
How hotspots form in the cluster

By the time you finish this ...


Cassandra 3.x High Availability - Second Edition by Robbie Strickland
