Chapter 4. Integrity and security
The two-phase commit protocol, as shown in Figure 4-3, is a set of actions used to ensure that an application program either makes all the changes to the resources represented by a single unit of recovery (UR), or makes no changes at all. The protocol guarantees this all-or-nothing outcome even if one of the elements (such as the application, the system, or the resource manager) fails, and it allows restart and recovery processing to take place after a system or subsystem failure.
The two-phase commit protocol is initiated when the application is ready to commit or back out its changes. The application program that initiates the commit or backout does not need to know who the sync point manager is or how two-phase commit works. This is hidden from the application; a program simply calls a commit or backout service and receives a return code indicating whether the request completed successfully.
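The application's view can be sketched as follows. This is an illustrative Python sketch, not an actual z/OS interface; the `UnitOfRecovery` class and its method names are hypothetical stand-ins for the sync point manager's commit service.

```python
class UnitOfRecovery:
    """Hypothetical stand-in for the sync point manager's interface."""
    def __init__(self):
        self.changes = []

    def update(self, resource, delta):
        # Record a pending change against a resource in this UR.
        self.changes.append((resource, delta))

    def commit(self):
        # The two-phase commit protocol runs here, hidden from the caller.
        return 0  # 0 = all changes committed


def transfer(ur):
    # The application makes its changes, then simply asks for a commit
    # and checks the return code.
    ur.update("ACCT1", -100)
    ur.update("ACCT2", +100)
    return ur.commit()
```

The point is that `transfer` never sees the voting or the participating resource managers; it only observes a single success-or-failure result.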
When the application issues the commit request, the coordinating recovery manager, also called the sync point manager, gives each resource manager participating in the unit of recovery an opportunity to vote on whether its part of the UR is in a consistent state and can be committed.
If all participants vote yes, the sync point manager instructs all the resource managers to commit the changes. If any participant votes no, the sync point manager instructs them all to back out the changes.
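The voting logic above can be sketched in miniature. This is a simplified model under assumed names (`ResourceManager`, `sync_point`, and the return codes are illustrative, not part of any real sync point manager API):

```python
class ResourceManager:
    """One participant in the unit of recovery."""
    def __init__(self, name, consistent=True):
        self.name = name
        self.consistent = consistent
        self.state = "in-flight"

    def prepare(self):
        # Phase 1: vote yes only if this RM's part of the UR is consistent.
        return self.consistent

    def commit(self):
        self.state = "committed"

    def backout(self):
        self.state = "backed out"


def sync_point(resource_managers):
    """Coordinator: commit only if every participant votes yes."""
    votes = [rm.prepare() for rm in resource_managers]  # phase 1: collect votes
    if all(votes):
        for rm in resource_managers:                    # phase 2: commit
            rm.commit()
        return 0   # illustrative "success" return code
    for rm in resource_managers:                        # phase 2: back out
        rm.backout()
    return 8       # illustrative "backed out" return code
```

A single no vote is enough to drive every participant to back out, which is what preserves the all-or-nothing property of the UR.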
4.2.5 Data backup and recovery
The z/OS operating system and its associated disk and tape subsystems provide a robust suite of backup utilities that meet a wide range of data backup requirements, from disaster recovery to local recovery.
Installations need to choose the backup and recovery solution that best fits their
needs. The choice that an installation makes will be based on factors such as:
1. Is the backup being performed in order to recover from local outages, or for
disaster recovery?
2. Does the backup and recovery need to be device-independent?
3. What level of granularity is required: single data set, volume, application or
site?
4. What is my recovery time objective? That is, how much time do I have to recover the data after an outage?
5. What is my recovery point objective? That is, at what points in time do I make the backup, so that when recovery is performed, processing of the data can resume where it left off at that chosen point in time?
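The last two factors can be made concrete with a small sketch. Assuming periodic backups, the backup interval bounds the worst-case data loss (the recovery point objective), while the restore time must fit within the recovery time objective; the function names and numbers here are hypothetical:

```python
def worst_case_data_loss_hours(backup_interval_hours):
    # If backups run every N hours, an outage just before the next backup
    # can lose up to N hours of updates.
    return backup_interval_hours


def meets_objectives(backup_interval_hours, restore_hours,
                     rpo_hours, rto_hours):
    """Does this backup scheme satisfy the stated RPO and RTO?"""
    return (worst_case_data_loss_hours(backup_interval_hours) <= rpo_hours
            and restore_hours <= rto_hours)
```

For example, backing up every 4 hours with a 2-hour restore satisfies a 6-hour RPO and 3-hour RTO, but an 8-hour backup interval would not.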