34 WebSphere Replication Server for z/OS Using Q Replication: High Availability Scenarios for the z/OS Platform
Figure 3-4 Overview of setup steps for the bidirectional replication environment
3.5 Failover and switchback considerations
This section provides a high-level overview of some of the considerations involved with
failover and switchback processing associated with bidirectional replication. Depending upon
the particular environment, the process involved in ensuring satisfactory failover and
switchback processing can get quite complex. The topics covered here are:
• Failover processing considerations
• Switchback processing considerations
• Considerations in choosing a particular switchback scenario
3.5.1 Failover processing considerations
When the failover occurs to the secondary server, it is possible for some of the changes that
occurred at the primary server not to be replicated over to the secondary server. These
changes may include changes in the DB2 log that had not as yet been sent to the WebSphere
MQ queue (item 1 in Figure 3-5), or messages in the WebSphere MQ queue that had not
been transmitted to the secondary server (item 2 in Figure 3-5). These un-replicated changes
should be considered to be “data loss” at the secondary server, at least until the primary
server is restored.
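The two sources of unreplicated changes described above can be pictured with a small toy model. This is only an illustrative sketch: the class, field, and function names are hypothetical and are not part of Q replication or WebSphere MQ, and the sample changes are made up.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryState:
    """Toy snapshot of the primary server at the moment of failure (hypothetical names)."""
    # Item 1 in Figure 3-5: changes in the DB2 log not yet sent to the WebSphere MQ queue.
    log_not_captured: list = field(default_factory=list)
    # Item 2 in Figure 3-5: messages in the queue not yet transmitted to the secondary.
    transmit_queue: list = field(default_factory=list)


def unreplicated_changes(primary: PrimaryState) -> list:
    """Return the changes the secondary never received, oldest first.

    Queued messages were captured earlier than changes still sitting in the log,
    so they are listed first.
    """
    return primary.transmit_queue + primary.log_not_captured


primary = PrimaryState(
    log_not_captured=["UPDATE acct 17", "INSERT acct 42"],
    transmit_queue=["UPDATE acct 5"],
)
lost = unreplicated_changes(primary)
print(lost)
```

In this model, everything in `lost` counts as data loss at the secondary server until the primary server is restored and those changes can be propagated.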
Setup steps shown in Figure 3-4, in order:
STEP SETBIDI1: Catalog databases on the Replication Center workstation
STEP SETBIDI2: Create the test tables
STEP SETBIDI3: Create Q replication control tables
STEP SETBIDI4: Create Q replication maps
STEP SETBIDI5: Create Q subscriptions for the test tables
STEP SETBIDI6: Set up WebSphere MQ objects on primary & secondary servers
STEP SETBIDI7: Start Q Capture & Q Apply on primary & secondary servers
STEP SETBIDI8: Activate subscriptions on primary
STEP SETBIDI9: Create monitor control tables on secondary
STEP SETBIDI10: Update monitor control tables on secondary
STEP SETBIDI11: Bind monitor program on secondary
STEP SETBIDI12: Start monitor on secondary
Chapter 3. Failover and switchback scenarios 35
If messages in the receive queue on the secondary server (item 4 in Figure 3-5) have not
been drained when the secondary server is enabled for updates, conflicts may occur on the
secondary server between the updates in its receive queue and the updates made locally
on the secondary server.
Figure 3-5 Data loss and potential sources of conflicts
When switchback occurs, the changes from the secondary server that are replicated to the
primary server pass through the receive queue at the primary server (item 3 in
Figure 3-5). These changes may conflict with changes that occurred at the primary server
just prior to the failure but were not replicated to the secondary server
(items 1 and 2 in Figure 3-5). Additionally, during switchback, any unpropagated changes in
the DB2 log and transmit queue on the primary server (items 1 and 2 in Figure 3-5) may
conflict with changes that occurred at the secondary server after failover
but prior to switchback.
Therefore, it is possible for conflicts to occur on both the primary server during switchback
and the secondary server after failover.
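As a rough illustration of where these conflicts come from, the sketch below flags row keys that are touched both by pending replicated changes and by local updates. It is a simplified model only: the function name, the `(key, value)` tuples, and the sample data are hypothetical, and actual conflict detection in Q replication depends on the conflict rule configured for the Q subscription.

```python
def potential_conflicts(pending_changes, local_changes):
    """Return the sorted row keys changed in both lists of (key, value) pairs."""
    pending_keys = {key for key, _ in pending_changes}
    local_keys = {key for key, _ in local_changes}
    return sorted(pending_keys & local_keys)


# Secondary server after failover: undrained receive queue (item 4 in Figure 3-5)
# versus updates made locally once the secondary is enabled for updates.
receive_queue_secondary = [(101, "balance=50"), (102, "balance=75")]
updates_on_secondary = [(102, "balance=80"), (103, "balance=10")]
secondary_conflicts = potential_conflicts(receive_queue_secondary, updates_on_secondary)

# Primary server during switchback: unpropagated changes (items 1 and 2)
# versus the secondary's changes arriving through the primary's receive queue (item 3).
unpropagated_on_primary = [(104, "balance=5"), (103, "balance=20")]
primary_conflicts = potential_conflicts(unpropagated_on_primary, updates_on_secondary)

print(secondary_conflicts, primary_conflicts)
```

Here key 102 is a potential conflict on the secondary server and key 103 is a potential conflict on the primary server, which mirrors the two cases described above.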
To support the ABC Financials business requirement of failover and switchback, the conflict
rule must be set so that the primary server is designated as the loser. Any conflicts will then
resolve in favor of the updates originating at the secondary server, which represent the most
recent changes to the data.
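The effect of designating the primary server as the loser can be sketched as follows. This is a toy model of the outcome only, not of how Q Apply implements it; the function name and the dictionary layout are assumptions made for illustration.

```python
def resolve_conflict(versions: dict, loser: str = "primary") -> str:
    """Return the surviving value for a conflicting row.

    `versions` maps a server role ("primary" or "secondary") to that server's
    value for the row. The value from the server designated as the loser is
    discarded and the other server's value is kept.
    """
    winners = [server for server in versions if server != loser]
    if len(winners) != 1:
        raise ValueError("expected exactly one winning server")
    return versions[winners[0]]


# Row changed on the primary just before the failure and on the secondary after failover.
surviving = resolve_conflict({"primary": "balance=20", "secondary": "balance=10"})
print(surviving)
```

With the primary as the loser, the secondary server's value survives, which matches the requirement that the most recent changes take precedence.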
Note: Data loss refers to transactions that are not propagated and applied to the “target”
table. Here “target” refers not to the location of the table but to the table that is the object
of the replication. The lost changes may therefore have originated on the primary server prior
to failover, or on the secondary server after failover and prior to switchback.
Important: Identifying the data loss suffered can be critical for certain types of
applications.
Attention: The triggering event for failover is assumed to originate external to the Q
replication environment. If the switching of the update workload to the secondary server is
also automated by external processes, then the likelihood that messages in the receive
queue could be drained prior to enabling updating applications is small, so conflicts are likely
to occur on the secondary server.
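One way to picture the drain check that such an external process would need is a simple polling gate in front of the step that enables updates. The sketch below is an assumption-laden illustration: `get_depth` is a hypothetical callable that returns the current receive queue depth, and how that depth is actually obtained is outside the scope of the sketch.

```python
import time


def wait_until_drained(get_depth, timeout_s: float, poll_s: float = 0.5) -> bool:
    """Poll the receive queue depth until it reaches zero or the timeout expires.

    Returns True if the queue drained in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_depth() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Simulated queue depths standing in for real measurements.
depths = iter([3, 1, 0])
drained = wait_until_drained(lambda: next(depths), timeout_s=1.0, poll_s=0.0)
print(drained)
```

If the gate returns False, the automation has to choose between delaying the switch of the update workload and accepting that conflicts may occur on the secondary server.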
[Figure 3-5 diagram labels: ServerA, ServerB, Database (on each server), Log (1), TransmitQ (2), ReceiveQ (3), ReceiveQ (4)]
