Data corruption is rare, but it does happen; the phenomenon is commonly known as bit-rot. Sometimes we write to a drive, and a surface or cell failure results in reads failing or returning something other than what we wrote. HBA misconfiguration, SAS expander flakes, firmware design flaws, drive electronics errors, and medium failures can also corrupt data. Surface errors affect between 1 in 10¹⁶ and as many as 1 in 10¹⁴ bits stored on HDDs. Drives can also become unseated due to human error or even a truck rumbling by. This author has also seen literal cosmic rays flip bits.
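
To put the surface error rates quoted above in perspective, the following is a minimal sketch of the arithmetic, not a model of any particular drive. It assumes errors are independent and uniformly distributed, and the function name `expected_bit_errors` and the 10 TB capacity are hypothetical choices made for illustration only.

```python
def expected_bit_errors(capacity_bytes: float, error_rate: float) -> float:
    """Expected number of bad bits when every bit on the drive is read once.

    Assumes independent, uniformly distributed errors (a simplification).
    """
    bits_read = capacity_bytes * 8
    return bits_read * error_rate


TB = 10**12  # decimal terabyte, as drive vendors count capacity

if __name__ == "__main__":
    capacity = 10 * TB  # hypothetical 10 TB HDD
    # The two rates quoted in the text: 1 in 10^16 and 1 in 10^14 bits.
    for label, rate in (("1 in 10^16", 1e-16), ("1 in 10^14", 1e-14)):
        errors = expected_bit_errors(capacity, rate)
        print(f"{label}: about {errors:.3f} expected bit errors per full read")
```

Under these assumptions, a single full read of a 10 TB drive touches 8 × 10¹³ bits, so at the pessimistic end of the range the expected number of bad bits approaches one per pass. Multiplied across hundreds of drives in a cluster, rare errors become routine events.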

Ceph lives for strong data integrity and has a mechanism to alert us to these situations: scrubs. Scrubs are somewhat analogous to fsck on a filesystem and ...
