October 2018
As previously mentioned, once SSTable files are written to disk, they are immutable (they cannot be written to again). Additional writes to that table therefore result in additional SSTable files. As this process continues, long-standing rows of data can end up spread across multiple files. Reading multiple files to satisfy a query eventually becomes slow, especially because obsolete data and tombstones must be reconciled along the way so that they do not end up in the result set.
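To make the cost of that reconciliation concrete, here is a minimal Python sketch of a read that has to consult several SSTable files. It is a deliberately simplified model, not Cassandra's actual read path: each SSTable is represented as a plain dictionary, the names `Cell` and `read_row` are hypothetical, and a value of `None` stands in for a tombstone. The only behavior it tries to show is that the newest write timestamp wins and that every file may need to be checked.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    timestamp: int          # write time; the newest write wins
    value: Optional[str]    # None stands in for a tombstone (deletion marker)


def read_row(sstables, key):
    """Return (value, files_checked) for key across all SSTables.

    Hypothetical model: each SSTable is an immutable dict of key -> Cell.
    """
    newest = None
    files_checked = 0
    for sstable in sstables:
        files_checked += 1
        cell = sstable.get(key)
        # Keep only the most recent version seen so far.
        if cell is not None and (newest is None or cell.timestamp > newest.timestamp):
            newest = cell
    # A tombstone (or a missing key) means the row is not in the result set.
    value = newest.value if newest is not None else None
    return value, files_checked


# Three SSTables, each produced by a separate flush (assumed sample data).
sstables = [
    {"user:1": Cell(100, "alice@old.example"), "user:2": Cell(100, "bob@example.com")},
    {"user:1": Cell(200, "alice@new.example")},   # later update to user:1
    {"user:2": Cell(300, None)},                  # later delete of user:2
]

print(read_row(sstables, "user:1"))  # ('alice@new.example', 3)
print(read_row(sstables, "user:2"))  # (None, 3)
```

In this model every read touches all three files, and the obsolete version of `user:1` and the deleted `user:2` still occupy space even though neither can appear in a result.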
Apache Cassandra's answer to this problem is to periodically execute a process called compaction. When compaction runs, it performs the following functions: