Handling failure in AKS

Kubernetes is a distributed system with many moving parts, most of them hidden from view. AKS abstracts them away for us, but it is still our responsibility to know where to look and how to respond when something goes wrong. Kubernetes handles much of the failure recovery automatically; still, you will run into situations where manual intervention is required. In this section, we will look in depth at the most common failure modes that require such intervention:

  • Node failures
  • Out-of-resource failures
  • Storage mount issues
  • Network issues
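
As a first triage step for several of these failure modes, it can help to look at the conditions each node reports. The following is a minimal sketch, not a definitive tool: it assumes kubectl is installed and already pointed at your AKS cluster, and the function names find_unhealthy_nodes and fetch_nodes are hypothetical helpers introduced only for this illustration. It reads the standard node conditions (Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable) from the JSON that kubectl get nodes -o json returns. Storage mount issues do not appear as node conditions, so they are out of scope here.

```python
import json
import subprocess

# Node conditions that indicate a problem when their status is "True".
PRESSURE_CONDITIONS = {
    "MemoryPressure",
    "DiskPressure",
    "PIDPressure",
    "NetworkUnavailable",
}


def find_unhealthy_nodes(node_list):
    """Return {node_name: [issue, ...]} for every node that reports a problem.

    node_list is the parsed JSON of `kubectl get nodes -o json`.
    (Hypothetical helper, written for this illustration only.)
    """
    problems = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        issues = []
        ready_seen = False
        for cond in node.get("status", {}).get("conditions", []):
            ctype, status = cond.get("type"), cond.get("status")
            if ctype == "Ready":
                ready_seen = True
                # Ready can be "True", "False", or "Unknown";
                # anything other than "True" needs attention.
                if status != "True":
                    issues.append(f"Ready={status}")
            elif ctype in PRESSURE_CONDITIONS and status == "True":
                issues.append(ctype)
        if not ready_seen:
            issues.append("Ready condition missing")
        if issues:
            problems[name] = issues
    return problems


def fetch_nodes():
    """Ask kubectl for all nodes as JSON (assumes a configured cluster context)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for name, issues in find_unhealthy_nodes(fetch_nodes()).items():
        print(f"{name}: {', '.join(issues)}")
```

Keeping the parsing in its own function means it can be exercised against saved JSON without a live cluster; only fetch_nodes touches kubectl.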
See Kubernetes the Hard Way (https://github.com/kelseyhightower/kubernetes-the-hard-way), an excellent tutorial, to get an idea about the blocks on which ...
