Site Reliability Engineering on AWS

Video description

Implement a reliable application architecture using the patterns and best practices recommended by AWS.

About This Video

  • Understand the core principles behind building reliable applications and how AWS supports them
  • Take a Python application and architect it for reliability using AWS services
  • Deploy a globally accessible, fault-tolerant web application on the AWS cloud by employing a combination of infrastructure and application resilience patterns

In Detail

Reliability on AWS is the ability of a system to recover from infrastructure or service disruptions, acquire computing resources to meet demand, and mitigate disruptions such as configuration issues or transient network problems.

In this course, you will first explore the key concepts and core services of AWS and Site Reliability Engineering (SRE). You will then see, step by step, how to implement a real-world application built on the reliability principles defined in the AWS Well-Architected Framework, using the SRE approach. This will let you increase the reliability of application architectures on AWS by implementing both infrastructure resilience and application resilience.

You will cover common architectural patterns that AWS solution architects use every day to build reliable systems and to add fault tolerance to application architectures running on AWS. You will also learn how to increase reliability further by implementing multi-region solutions for disaster recovery on a global scale.

By the end of this course, you will have gained a variety of AWS architecture skills that you can then apply to the real world.

Table of contents

  1. Chapter 1 : The Basics of Site Reliability Engineering
    1. Course Overview 00:03:43
    2. Reliability in Modern Applications 00:06:47
    3. The Impact of Failure and Determining Your Reliability Objectives 00:06:41
    4. Accepting Failure and Making It Part of the Design Process 00:05:12
    5. SRE is a Mindset 00:05:04
  2. Chapter 2 : Gaining Resilience and Reliability On AWS
    1. AWS Global, Regional, and Zonal Architecture Design 00:12:19
    2. Amazon's Global Storage Services - S3 00:04:03
    3. Running Resilient Databases On AWS - RDS and DynamoDB 00:07:38
    4. Fault Tolerant Computation On AWS - Lambda and EC2 00:07:18
    5. Core Resilience Principles for AWS - Load Balancing and Auto Scaling 00:06:37
    6. Using Kubernetes and ECS On AWS 00:14:02
  3. Chapter 3 : Accepting Failure In Multi-Tier Applications
    1. Typical Three-Tier Application Resilience and Why It Fails in Cloud 00:06:49
    2. Designing In Resilience With Microservices 00:09:01
    3. Managing State 00:09:32
    4. Typical Application Reliability Patterns 00:09:12
    5. The Architecture of Our Example Microservices 00:04:38
  4. Chapter 4 : Deploying Py-Simple On AWS
    1. Optimizing and Migrating Our Code 00:14:24
    2. Creating Our Container with CodeBuild 00:09:24
    3. Deploying ECS and RDS 00:05:12
    4. Deploying and Testing Our Py-Simple Application 00:13:14
    5. The Problem with What We've Just Built 00:04:09
  5. Chapter 5 : Designing Py-Global
    1. The Architecture of Py-Global and Failure Mode Analysis 00:07:27
    2. Multi-Regional Support 00:10:20
    3. Microservices Design 00:05:36
    4. Authentication and Authorization 00:11:09
    5. Code Deployment with CodePipeline 00:08:01
    6. Application Telemetry and Tracing 00:07:26
    7. Application Analytics 00:05:28
    8. Aurora and its Advantages Over MySQL 00:05:32
  6. Chapter 6 : Deploying a Resilient, Fault Tolerant Py-Global Application
    1. Running/Scaling Our Application On EKS 00:12:15
    2. Creating a Resilient and Reliable Data Store for Python with Amazon Aurora 00:04:31
    3. Deploying App-Mesh 00:12:21
  7. Chapter 7 : Surviving Failure on a Global Scale
    1. Review: AWS Global Architecture and What We Have Just Built 00:05:09
    2. Global Tools: Route 53, CloudFront 00:06:11
    3. Going Global: What Does This Mean For Your Users/Developers 00:03:50
    4. Operational Changes Required For a Global Application 00:06:16
    5. Course Summary 00:04:37

Product information

  • Title: Site Reliability Engineering on AWS
  • Author(s): Malcolm Orr
  • Release date: June 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781800205970