Cassandra's write path is fairly straightforward: each write is appended to the commit log on disk and applied to a memtable in memory. Memtables are flushed to disk as SSTables once they are full. The read path is more complex: it uses several data structures, both in memory and on disk, to optimize reads and reduce disk seeks. Before returning a result, Cassandra has to combine the data in memtables with the data on disk, which may be spread across multiple SSTables.
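To make the two paths concrete, here is a minimal toy sketch in Python. It is not Cassandra's implementation: the class `ToyNode`, the `memtable_limit` threshold, and the use of plain dicts and lists in place of real commit log segments and SSTable files are all assumptions made for illustration. It shows a write going to the log and the memtable, a flush when the memtable is full, and a read that merges the memtable with every SSTable and keeps the newest value.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy model of a single node (hypothetical, for illustration only)."""
    memtable_limit: int = 2                         # assumed flush threshold, in entries
    commitlog: list = field(default_factory=list)   # stands in for the on-disk commit log
    memtable: dict = field(default_factory=dict)    # key -> (timestamp, value)
    sstables: list = field(default_factory=list)    # immutable snapshots, oldest first

    def write(self, key, value, timestamp):
        # Write path: append to the commit log, then update the memtable.
        self.commitlog.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        # Flush once the memtable is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # The full memtable becomes a new immutable SSTable.
        self.sstables.append(dict(self.memtable))
        self.memtable.clear()

    def read(self, key):
        # Read path: collect candidates from the memtable and every SSTable,
        # then return the value carrying the newest timestamp.
        sources = [self.memtable, *self.sstables]
        candidates = [src[key] for src in sources if key in src]
        if not candidates:
            return None
        return max(candidates, key=lambda tv: tv[0])[1]


node = ToyNode(memtable_limit=2)
node.write("k1", "a", timestamp=1)
node.write("k2", "b", timestamp=2)   # memtable is full, so it is flushed
node.write("k1", "c", timestamp=3)   # newer value lives only in the memtable
print(node.read("k1"))               # merged across memtable and SSTable
```

In this sketch every read scans every SSTable, which is exactly the cost that the components listed next are meant to reduce.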
These are the different components used by Cassandra to process a read query:
- Memtable
- Row cache
- Bloom filters
- Key cache
- Partition summary
- Partition index
- Compression offset map
- SSTables
- Memtable: This is the first stage in the read process. Cassandra initially checks whether the queried ...