No matter how hard I try, I have yet to write any significant block of code that does not contain any errors. Nor have I been very good at predicting the wide range of crazy things users do with my applications. Why would anybody click on that link 73 times in a row? I'll never know.
Dealing with failures in a messaging scenario is very easy. The core of the failure strategy is to embrace errors. We have exceptions for a reason, and spending all of our time trying to predict and catch them is counterproductive. You'll invariably spend time building in catches for errors that never happen and miss errors that happen frequently.
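To make "embrace errors" concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory `Dispatcher` and a hypothetical `handle_order` handler; neither name comes from a real messaging library, and a real bus would do considerably more. The point is only that the handler contains no defensive catches, while a single catch at the dispatch boundary records whatever goes wrong.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Dispatcher:
    """Hypothetical in-memory dispatcher, for illustration only."""

    handler: Callable[[dict], None]
    # Each entry is (message, description of the exception it caused).
    failed: list = field(default_factory=list)

    def dispatch(self, message: dict) -> bool:
        """Run the handler; return True on success, False on failure."""
        try:
            self.handler(message)
            return True
        except Exception as exc:
            # The one deliberate catch-all, at the boundary: keep the
            # message and the error so they can be inspected later.
            self.failed.append((message, repr(exc)))
            return False


def handle_order(message: dict) -> None:
    # No try/except here: a bad message simply raises.
    quantity = int(message["quantity"])
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    message["total"] = quantity * message["unit_price"]


dispatcher = Dispatcher(handle_order)
results = [
    dispatcher.dispatch(m)
    for m in (
        {"quantity": "2", "unit_price": 5},  # valid
        {"unit_price": 5},                   # missing key raises KeyError
        {"quantity": "0", "unit_price": 5},  # raises ValueError
    )
]
```

The handler never tries to guess which errors might occur. Anything it did not anticipate still ends up in `dispatcher.failed`, alongside the message that triggered it.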
In an asynchronous system, errors need not be handled as soon as they occur. Instead, the message ...