Common problems – the availability of hardware resources

The hardware resources that our application needs may or may not be available at any given point in time. Moreover, even if a resource is available right now, nothing guarantees that it will stay available for much longer. A typical example is network glitches, which are quite common in many environments (especially for mobile apps) and which, for most practical purposes, are indistinguishable from machine or application crashes.
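To make this concrete, here is a minimal sketch of one common way to cope with a resource that is temporarily unavailable: retry the call a few times, waiting a little longer between attempts. It is an illustration, not a method prescribed by the text. The names `ResourceUnavailable`, `call_with_retries`, and `make_flaky` are hypothetical, and the flaky operation is a stand-in for a real network call. Note that the caller only ever sees "no answer", so it cannot tell a glitch from a crash.

```python
import time


class ResourceUnavailable(Exception):
    """Hypothetical error: the resource did not answer in time."""


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except ResourceUnavailable:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller decide what to do
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


def make_flaky(failures):
    """Build a stand-in operation that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            # From the caller's side, a glitch and a crash look the same.
            raise ResourceUnavailable("no answer (glitch or crash?)")
        return "ok"

    return operation


if __name__ == "__main__":
    # Two simulated glitches, then the resource becomes available again.
    print(call_with_retries(make_flaky(2)))
```

Retrying like this is only safe when the operation can be repeated without side effects, because a request that appears to have failed may in fact have been carried out before the connection dropped.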

Applications using a distributed computing framework or job scheduler can often rely on the framework itself to handle at least some common failure scenarios. Some job schedulers will even resubmit our jobs in case ...
