Database Internals

Book description

None

Table of contents

  1. Preface
    1. How to Contact Us
  2. I. Storage Engines
  3. 1. Introduction and Overview
    1. DBMS Architecture
    2. Memory- Versus Disk-Based DBMS
      1. Durability in Memory-Based Stores
    3. Column- Versus Row-Oriented DBMS
      1. Row-Oriented Data Layout
      2. Column-Oriented Data Layout
      3. Distinctions and Optimizations
      4. Wide Column Stores
    4. Data Files and Index Files
      1. Data Files
      2. Index Files
      3. Primary Index as an Indirection
    5. Buffering, Immutability, and Ordering
    6. Summary
  4. 2. B-Tree Basics
    1. Binary Search Trees
      1. Tree Balancing
      2. Trees for Disk-Based Storage
    2. Disk-Based Structures
      1. Hard Disk Drives
      2. Solid State Drives
      3. On-Disk Structures
    3. Ubiquitous B-Trees
      1. B-Tree Hierarchy
      2. Separator Keys
      3. B-Tree Lookup Complexity
      4. B-Tree Lookup Algorithm
      5. Counting Keys
      6. B-Tree Node Splits
      7. B-Tree Node Merges
    4. Summary
  5. 3. File Formats
    1. Motivation
    2. Binary Encoding
      1. Primitive Types
      2. Strings and Variable-Size Data
      3. Bit-Packed Data: Booleans, Enums, and Flags
    3. General Principles
    4. Page Structure
    5. Slotted Pages
    6. Cell Layout
    7. Combining Cells into Slotted Pages
    8. Managing Variable-Size Data
    9. Versioning
    10. Checksumming
    11. Summary
  6. 4. Implementing B-Trees
    1. Page Header
      1. Magic Numbers
      2. Sibling Links
      3. Rightmost Pointers
      4. Node High Keys
      5. Overflow Pages
    2. Binary Search
      1. Binary Search with Indirection Pointers
    3. Propagating Splits and Merges
      1. Breadcrumbs
    4. Rebalancing
    5. Right-Only Appends
      1. Bulk Loading
    6. Compression
    7. Vacuum and Maintenance
      1. Fragmentation Caused by Updates and Deletes
      2. Page Defragmentation
    8. Summary
  7. 5. Transaction Processing and Recovery
    1. Buffer Management
      1. Caching Semantics
      2. Cache Eviction
      3. Locking Pages in Cache
      4. Page Replacement
    2. Recovery
      1. Log Semantics
      2. Operation Versus Data Log
      3. Steal and Force Policies
      4. ARIES
    3. Concurrency Control
      1. Serializability
      2. Transaction Isolation
      3. Read and Write Anomalies
      4. Isolation Levels
      5. Optimistic Concurrency Control
      6. Multiversion Concurrency Control
      7. Pessimistic Concurrency Control
      8. Lock-Based Concurrency Control
    4. Summary
  8. 6. B-Tree Variants
    1. Copy-on-Write
      1. Implementing Copy-on-Write: LMDB
    2. Abstracting Node Updates
    3. Lazy B-Trees
      1. WiredTiger
      2. Lazy-Adaptive Tree
    4. FD-Trees
      1. Fractional Cascading
      2. Logarithmic Runs
    5. Bw-Trees
      1. Update Chains
      2. Taming Concurrency with Compare-and-Swap
      3. Structural Modification Operations
      4. Consolidation and Garbage Collection
    6. Cache-Oblivious B-Trees
      1. van Emde Boas Layout
    7. Summary
  9. 7. Log-Structured Storage
    1. LSM Trees
      1. LSM Tree Structure
      2. Updates and Deletes
      3. LSM Tree Lookups
      4. Merge-Iteration
      5. Reconciliation
      6. Maintenance in LSM Trees
    2. Read, Write, and Space Amplification
      1. RUM Conjecture
    3. Implementation Details
      1. Sorted String Tables
      2. Bloom Filters
      3. Skiplist
      4. Disk Access
      5. Compression
    4. Unordered LSM Storage
      1. Bitcask
      2. WiscKey
    5. Concurrency in LSM Trees
    6. Log Stacking
      1. Flash Translation Layer
      2. Filesystem Logging
    7. LLAMA and Mindful Stacking
      1. Open-Channel SSDs
    8. Summary
  10. Part I Conclusion
  11. II. Distributed Systems
  12. 8. Introduction and Overview
    1. Concurrent Execution
      1. Shared State in a Distributed System
    2. Fallacies of Distributed Computing
      1. Processing
      2. Clocks and Time
      3. State Consistency
      4. Local and Remote Execution
      5. Need to Handle Failures
      6. Network Partitions and Partial Failures
      7. Cascading Failures
    3. Distributed Systems Abstractions
      1. Links
    4. Two Generals’ Problem
    5. FLP Impossibility
    6. System Synchrony
    7. Failure Models
      1. Crash Faults
      2. Omission Faults
      3. Arbitrary Faults
      4. Handling Failures
      5. Summary
  13. 9. Failure Detection
    1. Heartbeats and Pings
      1. Timeout-Free Failure Detector
      2. Outsourced Heartbeats
    2. Phi-Accrual Failure Detector
    3. Gossip and Failure Detection
    4. Reversing Failure Detection Problem Statement
    5. Summary
  14. 10. Leader Election
    1. Bully Algorithm
    2. Next-In-Line Failover
    3. Candidate/Ordinary Optimization
    4. Invitation Algorithm
    5. Ring Algorithm
    6. Summary
  15. 11. Replication and Consistency
    1. Achieving Availability
    2. Infamous CAP
      1. Use CAP Carefully
      2. Harvest and Yield
    3. Shared Memory
    4. Ordering
    5. Consistency Models
      1. Strict Consistency
      2. Linearizability
      3. Sequential Consistency
      4. Causal Consistency
    6. Session Models
    7. Eventual Consistency
    8. Tunable Consistency
    9. Witness Replicas
    10. Strong Eventual Consistency and CRDTs
    11. Summary
  16. 12. Anti-Entropy and Dissemination
    1. Read Repair
    2. Digest Reads
    3. Hinted Handoff
    4. Merkle Trees
    5. Bitmap Version Vectors
    6. Gossip Dissemination
      1. Gossip Mechanics
      2. Overlay Networks
      3. Hybrid Gossip
      4. Partial Views
    7. Summary
  17. 13. Distributed Transactions
    1. Making Operations Appear Atomic
    2. Two-Phase Commit
      1. Cohort Failures in 2PC
      2. Coordinator Failures in 2PC
    3. Three-Phase Commit
      1. Coordinator Failures in 3PC
    4. Distributed Transactions with Calvin
    5. Distributed Transactions with Spanner
    6. Database Partitioning
      1. Consistent Hashing
    7. Distributed Transactions with Percolator
    8. Coordination Avoidance
    9. Summary
  18. 14. Consensus
    1. Broadcast
    2. Atomic Broadcast
      1. Virtual Synchrony
      2. Zookeeper Atomic Broadcast (ZAB)
    3. Paxos
      1. Paxos Algorithm
      2. Quorums in Paxos
      3. Failure Scenarios
      4. Multi-Paxos
      5. Fast Paxos
      6. Egalitarian Paxos
      7. Flexible Paxos
      8. Generalized Solution to Consensus
    4. Raft
      1. Leader Role in Raft
      2. Failure Scenarios
    5. Byzantine Consensus
      1. PBFT Algorithm
      2. Recovery and Checkpointing
    6. Summary
  19. Part II Conclusion
  20. A. Bibliography
  21. Index

Product information

  • Title: Database Internals
  • Author(s):
  • Release date:
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: None