Chapter 15. Operating a Swift Cluster
In this chapter we move from installation to the everyday operations of a Swift cluster. We’ll cover best practices for conducting day-to-day operational tasks, such as planning capacity additions and monitoring—whether you choose to do these in Swift or through SwiftStack. The recommendations and best practices in this chapter are based on our experiences building and operating both large and small clusters for a variety of workloads. By the end of this chapter, you’ll have a good understanding of how to operate and monitor a Swift cluster and how SwiftStack automates many of these processes through the SwiftStack Controller.
Because Swift is a distributed system that is controlled by software, does not rely on RAID, and writes multiple copies of each object (file), operating a Swift cluster is fundamentally different from operating traditional storage systems such as storage area networks (SANs) or network-attached storage (NAS) equipment.
With SAN and NAS systems, when a disk dies the operator must replace it right away so that the RAID array is rebuilt and returned to full parity as quickly as possible. With Swift, if a disk or even an entire node goes bad, it usually isn’t a huge problem, because the other copies of each object remain available elsewhere in the cluster. The only time losing a drive or node can present issues is when a Swift cluster is small and very full. For example, if you have a three-node cluster that is 90% full, ...