Resiliency is the ability of a system to handle failures gracefully and recover from them. It is one of the most important factors in service design: a service should be able to recover from high load, or from the failure of internal or external components, under any conditions. The following practices help:
- Building services: Build the application as microservices and apply the microservice design principles discussed in this book.
- Retry logic: Helps an application handle anticipated, temporary failures. If an endpoint or transaction fails, the application retries the same transaction to recover from the issue.
- Supervisor agent: Install a supervisor-like service that continuously monitors your applications, daemons, and services and restarts them if they fail. SupervisorD ...
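
The retry logic item in the list above can be illustrated with a minimal sketch. It assumes that temporary failures are signalled by a dedicated exception type. The names `TransientError` and `retry`, and the parameters `attempts`, `base_delay`, and `sleep`, are hypothetical and chosen only for this example. The exponential backoff with jitter is one common choice, not something the text prescribes.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for temporary failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are used up.

    Only TransientError is retried; any other exception propagates at once.
    The sleep parameter is injectable so the helper can be tested without
    actually waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Add up to 10% random jitter so that many clients do not
            # retry in lockstep against a recovering service.
            sleep(delay + random.uniform(0, delay / 10))
```

A caller would wrap the failing transaction in a function and pass it in, for example `retry(lambda: call_endpoint(), attempts=5)`, where `call_endpoint` stands for whatever request the application makes. Retrying only a specific exception type keeps permanent errors, such as invalid input, from being repeated pointlessly.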