Using secondary indexes to avoid denormalization
So far, we've exclusively used primary key columns to look up rows—either the full primary key when we're looking for a specific row, or just the partition key when retrieving multiple rows in a single partition. We know that these kinds of lookups are very efficient, because Cassandra can satisfy the query by accessing the single region of storage that holds the partition's data in order.
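To make the difference between the two kinds of lookup concrete, here is a minimal Python sketch of a partitioned table. It is a toy in-memory model, not Cassandra's actual storage engine. The class name `PartitionedTable`, its methods, and the sample data are all hypothetical, chosen only for illustration. It assumes each partition keeps its rows sorted by clustering key, so that a partition-key lookup returns one contiguous run of rows and a full-primary-key lookup is a search inside that run.

```python
from bisect import bisect_left


class PartitionedTable:
    """Toy model (hypothetical): partition key -> rows sorted by clustering key."""

    def __init__(self):
        # Each partition is a list of (clustering_key, value) tuples kept in order.
        self._partitions = {}

    def insert(self, partition_key, clustering_key, value):
        rows = self._partitions.setdefault(partition_key, [])
        # (clustering_key,) sorts just before any (clustering_key, value) tuple,
        # so bisect_left lands on the matching row if one exists.
        i = bisect_left(rows, (clustering_key,))
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i] = (clustering_key, value)  # same primary key: overwrite
        else:
            rows.insert(i, (clustering_key, value))

    def get_row(self, partition_key, clustering_key):
        """Full primary key lookup: one partition, then one row within it."""
        rows = self._partitions.get(partition_key, [])
        i = bisect_left(rows, (clustering_key,))
        if i < len(rows) and rows[i][0] == clustering_key:
            return rows[i][1]
        return None

    def get_partition(self, partition_key):
        """Partition key lookup: every row of one partition, in clustering order."""
        return list(self._partitions.get(partition_key, []))


# Illustrative data only: the partition key is a username, the clustering key
# is the name of a user they follow.
follows = PartitionedTable()
follows.insert("alice", "carol", {"since": "2014-02-01"})
follows.insert("alice", "bob", {"since": "2014-01-15"})
follows.insert("dave", "alice", {"since": "2014-03-09"})
```

In this model, neither lookup ever touches a partition other than the one named in the query, which is the property that makes primary key lookups cheap.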
This is the motivation for the denormalized follow structure we've built in this chapter: whether we want to answer the question, "Who does alice follow?", or the question, "Who follows alice?", we can construct a query that only needs to access a single partition. However, we're accepting additional complexity ...