Learning Apache Cassandra - Second Edition by Sandeep Yarabarla

Partial denormalization

Our initial approach to home timelines, which uses the fully normalized data structure that we've already built, is technically viable but will perform very poorly at scale. If I follow F users and want a page of size P for my home timeline, Cassandra will need to do the following:

  • Query F partitions for P rows each
  • Perform an ordered merge of F × P rows in order to retrieve only the most recent P

The most distressing part of this is that the cost of both operations grows in proportion to the number of people I follow. Let's start by trying to fix this.
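To make that cost concrete, here is a minimal Python sketch of the two steps listed above. It is an illustration only: the in-memory dictionary stands in for Cassandra partitions, and the function name `home_timeline`, the row shape `(update_id, body)`, and the sample data are all hypothetical rather than taken from the schema built earlier.

```python
import heapq
from itertools import islice


def home_timeline(partitions, followed, page_size):
    """Return the newest `page_size` rows across the followed users.

    `partitions` maps a username to that user's status updates, stored
    newest first as (update_id, body) tuples. This is an assumed,
    simplified stand-in for one Cassandra partition per user.
    """
    # Step 1: one read per followed user (F reads), at most P rows each.
    pages = [partitions.get(user, [])[:page_size] for user in followed]
    rows_read = sum(len(page) for page in pages)

    # Step 2: ordered merge of up to F x P rows, keeping only the newest P.
    merged = heapq.merge(*pages, key=lambda row: row[0], reverse=True)
    return list(islice(merged, page_size)), rows_read


# Hypothetical sample data: three followed users, page size of two.
partitions = {
    "alice": [(9, "a3"), (5, "a2"), (1, "a1")],
    "bob": [(8, "b2"), (2, "b1")],
    "carol": [(7, "c3"), (6, "c2"), (3, "c1")],
}
timeline, rows_read = home_timeline(partitions, ["alice", "bob", "carol"], 2)
print(timeline)   # the two newest updates overall
print(rows_read)  # rows fetched to produce them
```

In this sketch, six rows are fetched from three partitions to display two updates. Following more users increases both the number of reads and the size of the merge, even though the page shown to the reader stays the same size.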

The basic goal of the home timeline is to show me the most recent status updates that matter to me. Instead of doing all the work to find out ...