Learning Ceph - Second Edition

Book Description

Implement and manage your software-defined, massively scalable storage system

About This Book

  • Explore Ceph's architecture in order to achieve scalability and high availability
  • Learn to utilize Ceph efficiently with the help of practical examples
  • Successfully implement Ceph clusters to scale out storage solutions with outstanding data protection

Who This Book Is For

A basic knowledge of GNU/Linux, storage systems, and server components is assumed. If you have no experience with software-defined storage solutions and Ceph but are eager to learn about them, this is the book for you.

What You Will Learn

  • The limitations of existing systems and why you should use Ceph as a storage solution
  • Familiarity with Ceph's architecture, components, and services
  • Instant deployment and testing of Ceph within a Vagrant and VirtualBox environment
  • Ceph operations including maintenance, monitoring, and troubleshooting
  • Storage provisioning of Ceph's block, object, and filesystem services
  • Integration of Ceph with OpenStack
  • Advanced topics including erasure coding, CRUSH maps, and performance tuning
  • Best practices for your Ceph clusters

In Detail

Learning Ceph, Second Edition will give you all the skills you need to plan, deploy, and effectively manage your Ceph cluster. You will begin with the first module, where you will be introduced to Ceph use cases, its architecture, and core projects. In the next module, you will learn about hardware selection and how to set up a test cluster. After you have learned to use Ceph clusters, the next module will teach you how to monitor cluster health, improve performance, and troubleshoot any issues that arise. In the last module, you will learn to integrate Ceph with OpenStack and its services, such as Glance, Manila, Swift, and Cinder.

By the end of the book you will have learned to use Ceph effectively for your data storage requirements.

Style and approach

This step-by-step guide, including use cases and examples, not only helps you to easily use Ceph but also demonstrates how you can use it to solve any of your server or drive storage issues.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Downloading the color images of this book
      3. Errata
      4. Piracy
      5. Questions
  2. Introducing Ceph Storage
    1. The history and evolution of Ceph
      1. Ceph releases
    2. New since the first edition
    3. The future of storage
      1. Ceph as the cloud storage solution
      2. Ceph is software-defined
      3. Ceph is a unified storage solution
      4. The next-generation architecture
      5. RAID: the end of an era
      6. Ceph Block Storage
    4. Ceph compared to other storage solutions
      1. GPFS
      2. iRODS
      3. HDFS
      4. Lustre
      5. Gluster
      6. Ceph
    5. Summary
  3. Ceph Components and Services
    1. Introduction
    2. Core components
      1. Reliable Autonomic Distributed Object Store (RADOS)
      2. MONs
      3. Object Storage Daemons (OSDs)
      4. Ceph manager
      5. RADOS GateWay (RGW)
      6. Admin host
      7. CephFS MetaData server (MDS)
      8. The community
    3. Core services
      1. RADOS Block Device (RBD)
      2. RADOS Gateway (RGW)
      3. CephFS
      4. Librados
    4. Summary
  4. Hardware and Network Selection
    1. Introduction
    2. Hardware selection criteria
      1. Corporate procurement policies
      2. Power requirements: amps, volts, and outlets
      3. Compatibility with management infrastructure
      4. Compatibility with physical infrastructure
      5. Configuring options for one-stop shopping
    3. Memory
      1. RAM capacity and speed
    4. Storage drives
      1. Storage drive capacity
      2. Storage drive form factor
      3. Storage drive durability and speed
      4. Storage drive type
      5. Number of storage drive bays per chassis
    5. Controllers
      1. Storage HBA / controller type
      2. Networking options
      3. Network versus serial versus KVM management
      4. Adapter slots
    6. Processors
      1. CPU socket count
      2. CPU model
      3. Emerging technologies
    7. Summary
  5. Planning Your Deployment
    1. Layout decisions
      1. Convergence: Wisdom or Hype?
      2. Planning Ceph component servers
      3. Rack strategy
      4. Server naming
    2. Architectural decisions
      1. Pool decisions
        1. Replication
        2. Erasure Coding
        3. Placement Group calculations
      2. OSD decisions
        1. Back end: FileStore or BlueStore?
        2. OSD device strategy
        3. Journals
        4. Filesystem
        5. Encryption
    3. Operating system decisions
      1. Kernel and operating system
        1. Ceph packages
        2. Operating system deployment
        3. Time synchronization
        4. Packages
    4. Networking decisions
    5. Summary
  6. Deploying a Virtual Sandbox Cluster
    1. Installing prerequisites for our Sandbox environment
    2. Bootstrapping our Ceph cluster
    3. Deploying our Ceph cluster
    4. Scaling our Ceph cluster
    5. Summary
  7. Operations and Maintenance
    1. Topology
      1. The 40,000 foot view
      2. Drilling down
        1. OSD dump
        2. OSD list
        3. OSD find
        4. CRUSH dump
      3. Pools
      4. Monitors
      5. CephFS
    2. Configuration
      1. Cluster naming and configuration
      2. The Ceph configuration file
      3. Admin sockets
      4. Injection
      5. Configuration management
    3. Scrubs
    4. Logs
      1. MON logs
      2. OSD logs
      3. Debug levels
    5. Common tasks
      1. Installation
        1. Ceph-deploy
      2. Flags
      3. Service management
        1. Systemd: the wave (tsunami?) of the future
        2. Upstart
        3. sysvinit
      4. Component failures
      5. Expansion
      6. Balancing
      7. Upgrades
    6. Working with remote hands
    7. Summary
  8. Monitoring Ceph
    1. Monitoring Ceph clusters
      1. Ceph cluster health
      2. Watching cluster events
      3. Utilizing your cluster
      4. OSD variance and fillage
      5. Cluster status
      6. Cluster authentication
    2. Monitoring Ceph MONs
      1. MON status
      2. MON quorum status
    3. Monitoring Ceph OSDs
      1. OSD tree lookup
      2. OSD statistics
      3. OSD CRUSH map
    4. Monitoring Ceph placement groups
      1. PG states
    5. Monitoring Ceph MDS
    6. Open source dashboards and tools
      1. Kraken
      2. Ceph-dash
      3. Decapod
      4. Rook
      5. Calamari
      6. Ceph-mgr
      7. Prometheus and Grafana
    7. Summary
  9. Ceph Architecture: Under the Hood
    1. Objects
      1. Accessing objects
    2. Placement groups
    3. Setting PGs on pools
      1. PG peering
      2. PG Up and Acting sets
      3. PG states
    4. CRUSH
      1. The CRUSH Hierarchy
      2. CRUSH Lookup
      3. Backfill, Recovery, and Rebalancing
      4. Customizing CRUSH
    5. Ceph pools
      1. Pool operations
      2. Creating and listing pools
      3. Ceph data flow
      4. Erasure coding
    6. Summary
  10. Storage Provisioning with Ceph
    1. Client Services
    2. Ceph Block Device (RADOS Block Device)
      1. Creating and Provisioning RADOS Block Devices
      2. Resizing RADOS Block Devices
      3. RADOS Block Device Snapshots
      4. RADOS Block Device Clones
    3. The Ceph Filesystem (CephFS)
      1. CephFS with Kernel Driver
      2. CephFS with the FUSE Driver
    4. Ceph Object Storage (RADOS Gateway)
      1. Configuration for the RGW Service
      2. Performing S3 Object Operations Using s3cmd
      3. Enabling the Swift API
      4. Performing Object Operations using the Swift API
    5. Summary
  11. Integrating Ceph with OpenStack
    1. Introduction to OpenStack
      1. Nova
      2. Glance
      3. Cinder
      4. Swift
      5. Ganesha / Manila
      6. Horizon
      7. Keystone
    2. The Best Choice for OpenStack storage
      1. Integrating Ceph and OpenStack
      2. Guest Operating System Presentation
    3. Virtual OpenStack Deployment
    4. Summary
  12. Performance and Stability Tuning
    1. Ceph performance overview
    2. Kernel settings
      1. pid_max
      2. kernel.threads-max, vm.max_map_count
      3. XFS filesystem settings
      4. Virtual memory settings
    3. Network settings
      1. Jumbo frames
      2. TCP and network core
      3. iptables and nf_conntrack
    4. Ceph settings
      1. max_open_files
      2. Recovery
      3. OSD and FileStore settings
      4. MON settings
    5. Client settings
    6. Benchmarking
      1. RADOS bench
      2. CBT
      3. FIO
        1. Fill volume, then random 1M writes for 96 hours, no read verification:
        2. Fill volume, then small block writes for 96 hours, no read verification:
        3. Fill volume, then 4k random writes for 96 hours, occasional read verification:
    7. Summary