Transient fault handling

Transient faults are temporary conditions that cause a failure, such as a momentary loss of network connectivity or a service timeout due to overload. Distributed systems composed of multiple services interacting over a network are much more prone to transient faults than monolithic applications.

Transient faults are usually self-correcting, so retrying the failed operation is likely to succeed. The main challenge with retries is that there is no easy way to distinguish a transient fault from a non-transient one. Retries must therefore be bounded; otherwise a non-transient fault would be retried indefinitely.
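As a minimal sketch of the bounded-retry idea, the Python snippet below retries an operation a limited number of times and then gives up. Everything named here is an assumption made for illustration: `TransientError` is a hypothetical marker for faults the caller has chosen to treat as retryable, `retry` and its parameters are invented names, and doubling the delay between attempts is just one common choice rather than something the text prescribes.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for faults the caller treats as retryable."""


def retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Only TransientError is retried; any other exception propagates at once.
    The delay doubles after each failed attempt (an illustrative choice).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            # Out of attempts: stop retrying and surface the fault.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Because the attempt count is capped, a fault that turns out to be non-transient surfaces to the caller after `max_attempts` tries instead of looping forever. The `sleep` parameter exists only so that the waiting can be replaced in tests.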

The following practices help address transient fault handling when implemented in concert: ...
