Practical Site Reliability Engineering

Book description

Create, deploy, and manage applications at scale using SRE principles

Key Features

  • Build and run highly available, scalable, and secure software
  • Explore abstract SRE in a simplified and streamlined way
  • Improve the reliability of cloud environments by applying SRE practices

Book Description

Site reliability engineering (SRE) is touted as one of the most capable paradigms for building and sustaining next-generation, high-quality software solutions.

This book starts by introducing you to the SRE paradigm and covers the need for highly reliable IT platforms and infrastructures. As you make your way through the next set of chapters, you will learn to develop microservices using Spring Boot and make use of RESTful frameworks. You will also learn about GitHub for deployment, containerization, and Docker containers. Practical Site Reliability Engineering teaches you to set up and sustain containerized cloud environments, and also covers architectural and design patterns, reliability implementation techniques such as reactive programming, and languages such as Ballerina and Rust. In the concluding chapters, you will become well-versed in service mesh solutions such as Istio and Linkerd, and understand service resilience test practices, API gateways, and edge/fog computing.

By the end of this book, you will have gained experience working with SRE concepts and be able to deliver highly reliable apps and services.

What you will learn

  • Understand how to achieve your SRE goals
  • Grasp Docker-enabled containerization concepts
  • Leverage enterprise DevOps capabilities and microservices architecture (MSA)
  • Get to grips with the service mesh concept and frameworks such as Istio and Linkerd
  • Discover best practices for performance and resiliency
  • Follow software reliability prediction approaches and enable patterns
  • Understand Kubernetes for container and cloud orchestration
  • Explore the end-to-end software engineering process for the containerized world

Who this book is for

Practical Site Reliability Engineering helps software developers, IT professionals, DevOps engineers, performance specialists, and system engineers understand how the emerging domain of SRE comes in handy for automating and accelerating the process of designing, developing, debugging, and deploying highly reliable applications and services.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Practical Site Reliability Engineering
  3. Dedication
  4. About Packt
    1. Why subscribe?
  5. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  7. Demystifying the Site Reliability Engineering Paradigm
    1. Setting the context for practical SRE
      1. Characterizing the next-generation software systems 
      2. Characterizing the next-generation hardware systems 
      3. Moving toward hybrid IT and distributed computing
      4. Envisioning the digital era
      5. The cloud service paradigm
        1. The ubiquity of cloud platforms and infrastructures 
      6. The growing software penetration and participation
    2. Plunging into the SRE discipline
      1. The challenges ahead
    3. The need for highly reliable platforms and infrastructures 
      1. The need for reliable software
        1. The emergence of microservices architecture 
        2. Docker enabled containerization
        3. Containerized microservices
        4. Kubernetes for container orchestration
        5. Resilient microservices and reliable applications
    4. Reactive systems 
      1. Reactive systems are highly reliable
      2. The elasticity of reactive systems
    5. Highly reliable IT infrastructures
      1. The emergence of serverless computing
    6. The vitality of the SRE domain
      1. The importance of SREs 
      2. Toolsets that SREs typically use
    7. Summary
  8. Microservices Architecture and Containers
    1. What are microservices?
    2. Microservice design principles
    3. Deploying microservices
      1. Container platform-based deployment tools
      2. Code as function deployment
        1. Programming language selection criteria in AWS Lambda
      3. Virtualization-based platform deployment
    4. Practical examples of microservice deployment
      1. A container platform deployment example with Kubernetes
      2. Code as function deployment 
        1. Example 1 – the Apex deployment tool
        2. Example 2 – the Apex deployment tool
        3. Example 3 – the Serverless deployment tool 
      3. Virtual platform-based deployment using Jenkins or TeamCity
    5. Microservices using Spring Boot and the RESTful framework
    6. Jersey Framework
    7. Representational State Transfer (REST)
      1. Deploying the Spring Boot application
      2. Monitoring the microservices
        1. Application metrics
        2. Platform metrics
        3. System events
      3. Tools to monitor microservices
    8. Important facts about microservices
      1. Microservices in the current market
      2. When to stop designing microservices
      3. Can the microservice format be used to divide teams into small or micro teams? 
      4. Microservices versus SOA
    9. Summary
  9. Microservice Resiliency Patterns
    1. Briefing microservices and containers
      1. The containerization paradigm
    2. IT reliability challenges and solution approaches
    3. The promising and potential approaches for resiliency and reliability
      1. MSA is the prominent way forward
      2. Integrated platforms are the need of the hour for resiliency
    4. Summary
  10. DevOps as a Service
    1. What is DaaS?
      1. Selecting tools isn't easy
      2. Types of services under DaaS
        1. An example of one-click deployment and rollback 
      3. Configuring automated alerts
      4. Centralized log management
      5. Infrastructure security
      6. Continuous process and infrastructure development
      7. CI and CD
        1. CI life cycle
        2. CI tools
          1. Installing Jenkins
          2. Jenkins setup for GitHub
          3. Setting up the Jenkins job
          4. Installing Git
          5. Starting the Jenkins job
      8. CD
    2. Collaboration with development and QA teams
      1. The role of developers in DevOps
      2. The role of QA teams in DevOps
        1. QA practices
    3. Summary
  11. Container Cluster and Orchestration Platforms
    1. Resilient microservices 
    2. Application and volume containers 
    3. Clustering and managing containers
      1. What are clusters?
    4. Container orchestration and management
      1. What is container orchestration?
    5. Summary
  12. Architectural and Design Patterns
    1. Architecture pattern
    2. Design pattern
      1. Design pattern for security
      2. Design pattern for resiliency
      3. Design pattern for scalability
      4. Design pattern for performance
      5. Design principles for availability
      6. Design principles for reliability
      7. Design patterns – circuit breaker
        1. Advantages of circuit breakers
          1. Closed state 
          2. Open state 
          3. Half-open state 
    3. Summary
  13. Reliability Implementation Techniques
    1. Ballerina programming 
      1. A hello program example
      2. A simple example with Twitter integration 
      3. Kubernetes deployment code
      4. A circuit breaker code example
      5. Ballerina data types
      6. Control logic expression
      7. The building blocks of Ballerina
      8. Ballerina command cheat sheet
    2. Reliability
    3. Rust programming
      1. Installing Rust
      2. Concept of Rust programming
        1. The ownership of variables in Rust
        2. Borrowing values in Rust
        3. Memory management in Rust
        4. Mutability in Rust
        5. Concurrency in Rust
        6. Error-handling in Rust
      3. The future of Rust programming
    4. Summary
  14. Realizing Reliable Systems - the Best Practices
    1. Reliable IT systems – the emerging traits and tips
    2. MSA for reliable software
      1. The accelerated adoption of containers and orchestration platforms
        1. The emergence of containerized clouds
    3. Service mesh solutions
    4. Microservices design – best practices
      1. The relevance of event-driven microservices
      2. Why asynchronous communication? 
      3. Why event-driven microservices? 
    5. Asynchronous messaging patterns for event-driven microservices
    6. The role of EDA to produce reactive applications 
      1. Command query responsibility segregation pattern
    7. Reliable IT infrastructures
      1. High availability
      2. Auto-scaling
    8. Infrastructure as code 
    9. Summary
  15. Service Resiliency
    1. Delineating the containerization paradigm
      1. Why use containerization? 
    2. Demystifying microservices architecture 
    3. Decoding the growing role of Kubernetes for the container era
    4. Describing the service mesh concept
      1. Data plane versus control plane summary
    5. Why is service mesh paramount?
    6. Service mesh architectures
      1. Monitoring the service mesh
      2. Service mesh deployment models
    7. Summary
  16. Containers, Kubernetes, and Istio Monitoring
    1. Prometheus
      1. Prometheus architecture
      2. Setting up Prometheus
      3. Configuring alerts in Prometheus
    2. Grafana
      1. Setting up Grafana
      2. Configuring alerts in Grafana
    3. Summary
  17. Post-Production Activities for Ensuring and Enhancing IT Reliability
    1. Modern IT infrastructure
      1. Elaborating the modern data analytics methods
    2. Monitoring clouds, clusters, and containers
      1. The emergence of Kubernetes 
    3. Cloud infrastructure and application monitoring
    4. The monitoring tool capabilities
      1. The benefits
    5. Prognostic, predictive, and prescriptive analytics
      1. Machine-learning algorithms for infrastructure automation
    6. Log analytics
      1. Open source log analytics platforms
      2. Cloud-based log analytics platforms
      3. AI-enabled log analytics platforms
      4. Loom
      5. Enterprise-class log analytics platforms 
      6. The key capabilities of log analytics platforms
      7. Centralized log-management tools
    7. IT operational analytics 
    8. IT performance and scalability analytics
    9. IT security analytics
    10. The importance of root-cause analysis 
      1. OverOps enhances log-management
    11. Summary
    12. Further Readings
  18. Service Meshes and Container Orchestration Platforms
    1. About the digital transformation
    2. Cloud-native and enabled applications for the digital era
    3. Service mesh solutions
      1. Linkerd
      2. Istio
        1. Visualizing an Istio service mesh
    4. Microservice API Gateway
      1. The benefits of an API Gateway for microservices-centric applications
      2. Security features of API Gateways
      3. API Gateway and service mesh in action
      4. API management suite
    5. Ensuring the reliability of containerized cloud environments
    6. The journey toward containerized cloud environments
    7. The growing solidity of the Kubernetes platform for containerized clouds
      1. Kubernetes architecture – how it works
      2. Installing the Kubernetes platform
      3. Installing the Kubernetes client
      4. Installing Istio on Kubernetes
        1. Trying the application
        2. Deploying services to Kubernetes
    8. Summary
  19. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Practical Site Reliability Engineering
  • Author(s): Pethuru Raj Chelliah, Shreyash Naithani, Shailender Singh
  • Release date: November 2018
  • Publisher(s): Packt Publishing
  • ISBN: 9781788839563