The solution options are as follows:
- Establish mechanisms that detect exceptions and automatically initiate a system failover or redirect the request to a standby system. Implement code that falls back to alternate nodes when a request from an existing application fails.
- Design for instrumentation, such as events or performance counters, that detects performance problems or external system failures and exposes information through standard interfaces such as WMI, trace files, and event logs. Log performance, errors, exceptions, and auditing information about calls made to other systems and services.
- Establish alternatives to manage unreliable applications, failed APIs, and failed transactions. Identify queuing pending ...
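
The failover and logging options above can be illustrated with a minimal Python sketch. It is only one possible shape, not a prescribed implementation: the function `call_with_failover`, the exception `AllNodesFailed`, the node names, and the `send` callback are all hypothetical names introduced for this example. The sketch tries each node in order, logs every failed call, and returns the first successful response.

```python
import logging
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical logger name; a real system might route this to event logs or trace files.
logger = logging.getLogger("failover")


class AllNodesFailed(Exception):
    """Raised when the request failed on every configured node."""


def call_with_failover(nodes: Sequence[str], send: Callable[[str], T]) -> T:
    """Send a request to the first node that accepts it.

    `nodes` is ordered by preference (primary first, standbys after).
    `send` is an assumed callback that performs the request against one node
    and raises an exception on failure.
    """
    errors: List[Tuple[str, Exception]] = []
    for node in nodes:
        try:
            result = send(node)
        except Exception as exc:
            # Record the failure so operators can see which node misbehaved.
            logger.warning("request to %s failed: %r", node, exc)
            errors.append((node, exc))
            continue
        logger.info("request served by %s", node)
        return result
    # No node succeeded: surface every recorded error to the caller.
    raise AllNodesFailed(errors)


def fake_send(node: str) -> str:
    """Stand-in for a real network call; the primary is simulated as down."""
    if node == "primary":
        raise ConnectionError("primary is down")
    return f"ok from {node}"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(call_with_failover(["primary", "standby"], fake_send))
```

In this sketch the caller decides the node order, so the same function covers both a simple primary/standby pair and a longer list of alternate nodes. Catching every `Exception` is a deliberate simplification; a production version would likely retry only on errors known to be transient.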