Modern System Administration

Book description

Early system administration required in-depth knowledge of a variety of services on individual systems. Now, the job is increasingly complex and different from one company to the next with an ever-growing list of technologies and third-party services to integrate. How does any one individual stay relevant in systems and services? This practical guide helps anyone in operations—sysadmins, automation engineers, IT professionals, and site reliability engineers—understand the essential concepts of the role today.

Collaboration, automation, and the evolution of systems change the fundamentals of operations work. No matter where you are in your journey, this book gives you the information you need to chart a path toward stronger system administration skills. Author Jennifer Davis provides examples of modern practices and tools, along with recommended materials to advance your skills.

Topics include:

  • Development and testing: Version control, fundamentals of virtualization and containers, testing, and architecture review
  • Deploying and configuring services: Infrastructure management, networks, security, storage, serverless, and release management
  • Scaling administration: Monitoring and observability, capacity planning, log management and analysis, and security and compliance

Table of contents

  1. Preface
    1. Conventions Used in This Book
    2. Using Code Examples
    3. O’Reilly
    4. How to Contact Us
    5. Acknowledgments
  2. I. Foundations
  3. 1. Introduction
    1. Principles
    2. Modernization of Compute, Network and Storage
      1. Compute
      2. Network
      3. Storage
    3. Infrastructure Management
    4. Scaling Production Readiness
    5. A Role by any Other Name
      1. DevOps
      2. Site Reliability Engineering (SRE)
      3. How do DevOps and SRE Differ?
      4. System Administrator
    6. Finding Your Next Opportunity
  4. 2. Infrastructure Strategy
    1. Understanding Infrastructure Lifecycle
      1. Lifecycle of Physical Hardware
      2. Lifecycle of Cloud Services
      3. Challenges to Planning Infrastructure Strategy
    2. Infrastructure Stacks
    3. Infrastructure as Code
    4. Wrapping Up
  5. II. Principles
  6. 3. Version Control
    1. Fundamentals of Git
      1. Branching
    2. Working with Remote Git Repositories
    3. Resolving Conflicts
    4. Fixing Your Local Repository
    5. Advancing Collaboration with Version Control
    6. Wrapping Up
  7. 4. Local Development Environments
    1. Choosing an Editor
      1. Minimizing required mouse usage
      2. Integrated Static Code Analysis
      3. Easing editing through auto completion
      4. Indenting code to match team conventions
      5. Collaborating while editing
      6. Integrating workflow with git
      7. Extending the development environment
    2. Selecting Languages to Install
    3. Installing and Configuring Applications
    4. Wrapping Up
  8. 5. Testing
    1. Why should Sysadmins Write Tests?
    2. Differentiating the Types of Testing
      1. Linting
      2. Unit Tests
      3. Integration Tests
      4. End-to-End Tests
    3. Examining the Shape of Testing Strategy
    4. Existing Sysadmin Testing Activities
    5. When Tests Fail
      1. Environment Problem
      2. Flawed Test Logic
      3. Assumptions Changed
      4. Code Defects
      5. Failures in Test Strategy
    6. Flaky Tests
    7. Wrapping Up
  9. 6. Security
    1. Collaboration in Security
    2. Borrow the Attacker Lens
    3. Design for Security Operability
      1. Qualifying Issues
    4. Wrapping Up
  10. III. Principles in Practice
  11. 7. Infracode
    1. Building Machine Images
      1. Building with Packer
      2. Building With Docker
    2. Provisioning Infrastructure Resources
    3. Provisioning with Terraform
    4. Configuring Infrastructure Resources
      1. Configuring with Chef
    5. Getting Started with Infracode
    6. Wrapping Up
  12. 8. Testing in Practice
    1. Writing Unit Tests for Infracode
      1. Writing Unit Tests with Chefspec
      2. Writing Unit Tests for Datadog Install Recipe
    2. Writing Integration Tests for Infracode
      1. Writing Integration Tests for Datadog Install Recipe
      2. Linting Chef Code with Rubocop and Foodcritic
    3. Wrapping Up
  13. 9. Security and Infracode
    1. Managing Identity and Access
      1. How should you control access to your system?
      2. Who should have access to your system?
    2. Managing Secrets
      1. Password Managers and Secret Management Software
      2. Defending Secrets and Monitoring Usage
    3. Securing Compute Infrastructure
    4. Managing Networking
    5. Recommendations for your Security Infracode
  14. IV. Scaling Production Readiness
  15. 10. Monitoring Theory
    1. Why Monitor?
    2. How Monitoring and Observability Differ?
    3. Monitoring Building Blocks
      1. Events
      2. Monitors
      3. Data: Metrics, Logs, and Tracing
    4. What does Monitoring look like?
      1. Event Detection
      2. Data Collection
      3. Data Reduction
      4. Data Analysis
      5. Data Presentation
    5. Monitoring for Sustainable Work
  16. 11. Presenting Information
    1. Know your audience
    2. Choosing your channel
    3. Choose your story type
    4. Presenting Data in Action
      1. Charts are Worth A Thousand Words.
      2. Telling the Same Story With a Different Audience
      3. The Key Takeaway
    5. Know your visuals
      1. Visual Cues
      2. Chart types
    6. Recommended Visualization Practices
  17. 12. Developing On-Call Resilience
    1. What Is On-Call?
    2. Humane On-Call Processes
      1. Preparing for On-Call
      2. One Week Out
      3. The Night Before
      4. Your On-Call Rotation
      5. On-Call Handoff
      6. The Day After On-Call
    3. Monitor the On-Call Experience
    4. Wrapping Up
  18. 13. Managing Incidents
    1. What is an Incident?
    2. What is Incident Management?
      1. Roles and Responsibilities
    3. Pre-emptive Planning
    4. Handling the Incident
    5. Post-incident meeting
    6. Practice Failure

Product information

  • Title: Modern System Administration
  • Author(s): Jennifer Davis
  • Release date: May 2022
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492055211