Performing client-free backup and recovery requires the coordination of several steps across at least two hosts. At one point in time, none of the popular commercial backup and recovery applications had software that could automate all the steps without requiring custom scripting by the administrator. All early implementations of client-free backups involved a significant amount of scripting on the part of the administrator, and almost all early implementations of client-free backups were on Unix servers. This was for several reasons, the first of which was demand. Many people had really large, multi-terabyte Unix servers that qualified as candidates for client-free backups. This led to a lot of cooperation between the storage array vendors and the Unix vendors, which led to commands that could run on a Unix system and accomplish the tasks required to make client-free backups possible. Since many of these tasks required steps to be coordinated on multiple computers, the rsh and ssh capabilities of Unix came in handy.
Since NT systems lacked integrated, advanced scripting support, and communications between NT machines weren't easy for administrators to script, it wasn't simple to design a scripted solution for NT client-free backups. (As you will see later in this section, another key component that was missing was the ability to mount brand new drive letters from the command line.) This, combined with the fact that NT machines tended to use less storage than their monolithic Unix counterparts, meant that there were not a lot of early implementations of client-free backup on NT. However, things have changed in recent years. It isn't uncommon to find very large Windows machines. (I personally have seen one approaching a terabyte.) Scripting and intermachine communication have improved in recent years, but the limitation of not being able to mount drives via the command line has existed until just recently. Therefore, it's good that a few commercial backup software companies are beginning to release client-free backup software that includes the Windows platform. Those of us with very large Windows machines can finally take advantage of this technology.
Windows isn't the only platform for which commercial client-free applications are being written. As of this writing, I am aware of several products (that are either in beta or have just been released) that will provide integrated client-free backup and recovery functionality for at least four versions of Unix (AIX, HP-UX, Solaris, and Tru64).
The next section attempts to explain all the steps a client-free backup system must complete. These steps can be scripted, or they can be managed by an integrated commercial client-free backup software package. Hopefully, by reading the steps in detail, you will have a greater appreciation of the functionality client-free backups provide, as well as the complexity of the applications that provide them.
As you can see in Figure 4-5, there is SAN-connected storage that is available to at least two systems: the data server and a backup server. The storage consists of a primary disk set and a backup mirror, which is simply an additional copy of the primary disk set. (In the figure, the primary disk set is mirrored, as represented by the M1/M2.) Note that the SAN-connected storage is depicted as a single, large multihosted storage array with an internal, prewired SAN. The reason for this is that all early implementations of client-free backups have been with such storage arrays. As discussed earlier, there are now enterprise volume managers that will make it possible to do this with JBOD on a SAN, but I use the concept of a large storage array in this example because it's the solution that's available from the most vendors, even if it's the most expensive.
The normal status quo of the backup mirror is that it's left split from the primary disk set. Why it's left split will become clear later in this section.
At the appropriate time, the backup application performs a series of tasks:
Backup Server A, the main backup server, tells Backup Server B to begin the backup.
Unmounts the backup mirror (if it's a mounted filesystem) from Backup Server B.
Exports the volumes from the OS' volume manager on Backup Server B.
Establishes the backup mirror (i.e., "reconnects" it to the primary disk set) by running commands on Backup Server B that communicate with the storage array.
Backup Server B monitors the backup mirror, waiting for the establish to complete.
Backup Server B waits for the appropriate time to split off the backup mirror.
Backup Server B tells Data Server to put the application into backup mode.
Backup Server B splits the backup mirror.
Backup Server B tells Data Server to take the application out of backup mode.
Backup Server B imports the volumes found on the backup mirror.
Backup Server B mounts any filesystems found on the backup mirror.
Backup Server B now performs the backup of the backup mirror via its I/O backplane, instead of the data server's backplane.
After the backup, the filesystems are left mounted and imported on Backup Server B.
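The task list above can be sketched as an ordered plan that records which host initiates each action. This is only an illustration: the host names (`backup-b`, `data-server`), the `Step` type, and the `run_plan` helper are hypothetical and don't come from any backup product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    host: str    # host on which the action is initiated (illustrative name)
    action: str  # human-readable description of the action


# The order mirrors the task list above.
PLAN: List[Step] = [
    Step("backup-b", "unmount backup mirror filesystems"),
    Step("backup-b", "export volumes from the volume manager"),
    Step("backup-b", "establish backup mirror"),
    Step("backup-b", "wait for establish to complete"),
    Step("data-server", "put application into backup mode"),
    Step("backup-b", "split backup mirror"),
    Step("data-server", "take application out of backup mode"),
    Step("backup-b", "import volumes from backup mirror"),
    Step("backup-b", "mount filesystems from backup mirror"),
    Step("backup-b", "back up the backup mirror to tape"),
]


def run_plan(plan: List[Step], execute: Callable[[Step], bool]) -> List[Step]:
    """Run steps in order; stop at the first failure and return the completed steps."""
    done = []
    for step in plan:
        if not execute(step):
            break
        done.append(step)
    return done


def dry_run(step: Step) -> bool:
    """Print the step instead of executing it."""
    print(f"[{step.host}] {step.action}")
    return True
```

Calling `run_plan(PLAN, dry_run)` prints the sequence without touching any system, which is a cheap way to review the ordering before wiring in real commands.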
These tasks mean that the backup data is sent via the SAN to the backup server and doesn't travel through the client at all, which is why it's referred to as client-free backups. Again, some vendors refer to this as server-free backups, but I reserve that term for a specific type of backup that will be covered in Section 4.4.
You may find yourself asking a few questions:
How do you put the application into backup mode?
How do you import another machine's volumes and filesystems?
Why is the backup mirror left split and mounted most of the time?
This sounds really complicated. Is it?
Before explaining this process in detail, let's examine a few things that make client-free backups possible:
You are going to split off a mirror of the application's disk, and it has no idea you're going to do that. The best way to do this is to stop all I/O operations to the disk during the split. This is usually done by shutting down the application that is using the disk. However, some database products, such as Oracle and Informix, allow you to put their databases into a backup-friendly state without shutting them down. This works fine as well. Informix actually freezes all commits, and Oracle does some extra logging. If you're backing up a file server, you need to stop writes to the files during the time of the split. Otherwise, the filesystem can be corrupted during the process. If you perform an online, split-mirror backup of a SQL Server database, SQL Server's recovery mechanism is supposed to recover the database back to the last checkpoint. However, any transactions that aren't in the online transaction log are lost. (In other words, you can't issue a load transaction command after a split mirror recovery.) Microsoft Exchange must normally be shut down during the split.
This isn't to say that a backup and recovery software company can't write an API that communicates with the SQL Server and Exchange APIs. This software product could tell SQL Server and Exchange that it performed a traditional backup, when in reality it's performing a split-mirror backup. In fact, this is already being done for Exchange in the NAS world with snapshots, which act like split mirrors as far as the application is concerned.
If the application is a file server, this is usually not a problem. However, most modern databases are backed up by passing a stream of data to the backup application via a special API. Perhaps two examples will illustrate what I mean.
The standard way to back up Oracle is to use the RMAN (Recovery Manager) utility. Once configured, your backup software automatically talks to Oracle's RMAN API, which then passes streams of data back to the backup application. However, Oracle also allows you to start sqlplus (a command-line utility you can script), issue the command alter tablespace tablespace_name begin backup, and then back up that tablespace's datafiles in any way you want. When it's time to recover, you shut down the database, restore those datafiles from backup, then use Oracle's archived redo logs to redo any transactions that have occurred since the backup.
Informix now offers similar functionality, but it didn't always do so. Prior to the creation of the onbar utility, you were forced to use the ontape utility to back up Informix. This backed up a database's datafiles or logical logs (transaction logs) to tape. If you didn't back up the datafiles with ontape, you couldn't restore the database to a point in time. For example, suppose you shut down the database and used a third-party backup tool to back up the database's datafiles. You then started up the database and ran it a while, making and recording transactions. If you then shut down the database, using your third-party tool to restore the datafiles, there would be no way within Informix to redo the transactions that occurred since the backup was taken. Therefore, third-party backups with these versions of Informix were useless, preventing you from doing client-free backups of these databases. Now, with the onbar utility, Informix supports what it calls an external backup. As you will see later, this is the tool that you now use to perform client-free backups of Informix.
Although Microsoft Exchange does have transaction logs, the only way to back them up is via the Microsoft-provided tools and APIs. Unless a backup product writes an application that speaks to Microsoft's API and tricks it into thinking that the split mirror backup is a "normal" backup, there is no way to perform an online third-party backup of it. However, you can perform an offline third-party backup of Exchange by shutting down the Exchange services prior to splitting the mirror. You can then restore the Exchange server back to the point the mirror was split, but you lose any messages that have been sent since then. (This is why there are few people performing client-free backups of Exchange.)
Microsoft's SQL Server has a sophisticated recovery mechanism that allows you to perform both online and offline backups, with advantages and disadvantages of both. If you split the mirror while SQL Server is online, you shouldn't suffer a service interruption because of backup, but the recovery process that SQL Server must perform when a restored database is restarted will take much longer. If you shut down SQL Server before splitting the mirror, SQL Server doesn't need to recover the database back to the last checkpoint before it can start replaying transaction logs. However, you will obviously suffer a service interruption whenever you perform a backup. Whether you shut down the database or not, you can't use the dump transaction and load transaction commands in conjunction with a split mirror backup of SQL Server.
This functionality is provided by software on the backup server or data server that communicates with the volume manager of the storage array. With large, multihosted arrays (e.g., Compaq's Enterprise Storage Array, EMC Symmetrix, and Hitachi 9000 series), the client-side software is communicating with the built-in, hardware RAID controller. (An example of this is provided later in this chapter.) Other solutions use software RAID; in this situation, the client-side software is talking to the host that is managing the volumes.
Once you split the backup mirror from the primary disk set, you have to be able to see its disk on the backup server. If you can't, the entire exercise is pointless. This is where Windows NT has typically had a problem. Once the mirror has been split, the associated drives just "appear" on the SCSI bus—roughly the equivalent to plugging in new disk drives without rebooting or running the Disk Manager or Computer Management GUIs. However, Windows 2000 now uses a light version of the Veritas Volume Manager. You can purchase a full-priced version for both NT and Windows 2000, which comes with command line utilities. By the time you read this, the full version of Volume Manager should support the necessary functionality. In fact, Veritas will reportedly have a script included with the product that is specifically designed to help automate client-free backups.
The reason for this is that you are going to be reading one host's disks on another host. If Backup Server B in Figure 4-5 is to back up the data server's disks, it needs to understand the disk labels, any volume manager disk groups and volumes, and any filesystems that may reside on the disks. With few exceptions, this normally means that both servers must be running the same operating system. At least one client-free backup software package has gotten around this limitation by writing custom software that can understand volume managers and filesystems from other operating systems. But, for the most part, the data server and backup server need to be the same operating system. This can be true even if you aren't using a volume manager or filesystem and are backing up raw disks. For example, a Solaris system can't perform I/O operations on a disk that has an NT label on it.
Before continuing this explanation of client-free backups, I must define some terms that will be used in the following example. Please understand that there are multiple vendors that offer this functionality, and they all use different terminology and all work slightly differently.
The lists that follow in the rest of this chapter give examples for Compaq, EMC, and Hitachi using Exchange, Informix, Oracle, and SQL Server databases, and the Veritas Volume Manager and File System. These examples are for clarification only and aren't meant to imply that this functionality is available only on these platforms or to indicate an endorsement of these products. These vendors are listed in alphabetical order throughout, and so no inferences should be made as to the author's preference for one vendor over another. There are several multihosted storage array vendors and several database vendors. Consult your vendor for specifics on how this works on your platform.
The term primary disk set refers to the set of disks that hold the primary copy of the data server's data. What I refer to as a disk set are probably several independent disks or can be one large RAID volume. Whether the primary disk set uses RAID 0, 1, 0+1, 1+0, or 5 is irrelevant. I'm simply referring to the entire set of disks that contain the primary copy of the data server's data. Compaq calls this the primary mirror, EMC calls it the standard, and Hitachi, the primary volume or P-VOL.
The backup mirror device is another set of disks specifically allocated for backup mirror use. When people say "the backup mirror," they are typically referring to a set of backup mirror disks that are associated with a primary disk set. It's often referred to as a "third mirror," because the primary disk set is often mirrored. When synchronized with the primary disk set, this set of disks then represents a third copy of the data—thus the term third mirror. However, not all vendors use mirroring for primary disk sets, so I've chosen to use the term backup mirror instead. Compaq and EMC both call this a BCV, or business continuity volume; Hitachi calls it the secondary volume or S-VOL.
This is a piece of software that synchronizes and splits backup mirror devices based on commands that are issued by the client-side version of the software running on the data or backup server. That is, you run the backup mirror application on your Unix or NT system, and it talks to the disk array and tells it what to do. Compaq has a few ways to do this. The more established way is to use the Enterprise Volume Manager GUI and the Compaq Batch Scheduler. The Batch Scheduler is a web-based GUI that is accessible via any web browser and can automate the creation of BCVs. For scripting, however, you should use the SANWorks Command Scripter, which allows direct command-line access to the StorageWorks controllers. EMC's application for this is Timefinder; Hitachi's is Shadowimage.
To establish a backup mirror is to tell the primary disk set to copy its data over to the backup mirror, thus synchronizing it with the primary disk set. Many backup mirror applications offer an incremental establish, which copies only the sectors that have changed since the last establish. Some refer to this as silvering the mirror, a reference to the silver that is put on the back of a "real" mirror.
When you split a backup mirror, you tell the disk array to break the link between the primary disk set and the backup mirror. This is typically done to back up the backup mirror. Once it's finished, you have a complete copy of the primary disk set on another set of devices, and those devices become visible on the backup server. This is also called breaking the mirror.
To restore the backup mirror is to copy the backup mirror to the primary disk set. Once the command to do this is issued, the restore usually appears to have been completed instantaneously. As will be explained in more detail later, requests for data that has not yet been restored are redirected to the mirror.
As discussed earlier in this chapter, a main server is the server in a backup environment that schedules all backups and stores in the database information about what backups went to what tape. It may or may not have tape drives connected to it.
A device server, as discussed earlier, is a server that has tape drives connected to it. The backups that go to these tape drives are scheduled by the main server. The main server also keeps track of what files went to what tapes on this device server.
Figure 4-6 again illustrates a typical backup mirror backup configuration. There is a main backup server (Backup Server A) that is connected to a tape library (via SCSI or Fibre Channel) and connected to the data server via a LAN. The data server is connected via Fibre Channel to the storage array, and its datafiles and transaction logs reside on its primary disk sets. The datafiles have a backup mirror associated with them, and it's connected via Fibre Channel to a device server (Backup Server B), and that device server is connected via SCSI to another tape library.
Multihosted storage arrays have a SAN built into them. You can put extra switches or hubs between the storage array and the servers connecting to it, but it isn't necessary unless the number of servers that you wish to connect to the storage array exceeds the number of Fibre Channel ports the array has.
To back up a database, you must back up its datafiles (and other files) and its transaction logs. The datafiles contain the actual database, and the transaction logs contain the changes to the database since the last time you backed up the datafiles. Once you restore the datafiles, the transaction logs can then replay the transactions that occurred between the time of the datafile backup and the time of the failure. Therefore, you must back up the transaction logs continually, because they are essential to a good recovery.
As shown in Figure 4-7, transaction logs are often backed up to disk (1), and then to tape (2). This is my personal preference, but it isn't required. The reason I prefer to back up to disk and then to tape is that replaying the transaction logs is the slowest part of the restore, and having the transaction logs available on disk significantly speeds up the process. The log disk may or may not be on the storage array and doesn't need a backup mirror.
You can also see in Figure 4-5 and Figure 4-6 that the transaction logs don't need to be backed up via the split mirror. The reason for this is that there are no supported methods for backing up transaction logs this way. Therefore, there is no reason to put them on the backup mirror.
As discussed previously, the backup mirror is left split from the primary disk set (3). The following list details how transaction logs are backed up with Exchange, Informix, Oracle, and SQL Server:
Informix's transaction logs are called logical logs and can be backed up directly to tape or disk. Again, I recommend backing up to disk first, followed by a backup to tape, since recovering logical logs from disk is much faster than recovering them from tape. If you plan to perform client-free backups, you need to use Informix's onbar command, which provides the log_full.sh script to kick off the backups of logical logs whenever one becomes full. Although the setup in the next section may look involved, once it's complete, logical log backup and recovery is actually easy with Informix.
Oracle's transaction logs are called redo logs, and the backup to disk is accomplished with Oracle's standard archiving procedure. This is done by placing the database in archive log mode and specifying automatic archiving. As soon as one online redo log is filled, Oracle switches to the next one and begins copying the full log to the archived redo log location. To back up these archived redo logs to tape, you can use an incremental backup that runs once an hour or more often if you prefer.
As discussed previously, you will not be able to use SQL Server's dump transaction command in conjunction with a split backup. If you need point-in-time recovery, you need to choose another backup method.
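The hourly incremental pass over archived transaction logs mentioned above amounts to selecting the files that have appeared since the previous run. The following is a minimal sketch under stated assumptions: the function name `logs_to_copy` is hypothetical, and it assumes that file modification time is a good enough marker for "new since last backup," which your backup product may track differently.

```python
from pathlib import Path
from typing import List


def logs_to_copy(archive_dir: str, last_run_epoch: float) -> List[Path]:
    """Return archived log files modified after the last run, oldest first."""
    candidates = [
        p for p in Path(archive_dir).iterdir()
        if p.is_file() and p.stat().st_mtime > last_run_epoch
    ]
    # Oldest first, so the logs reach tape in the order they would be replayed.
    return sorted(candidates, key=lambda p: p.stat().st_mtime)
```

A real job would also need to skip a log that the database is still in the middle of archiving; that check is database-specific and is left out here.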
It's now time to back up the datafiles. In order to do that, you must:
Establish the backup mirror to the primary disk set.
Put the database in backup mode, or sync the filesystem.
Split the backup mirror.
Take the database out of backup mode.
Import and mount the backup mirror volumes to the backup server.
Back up the backup mirror volumes to tape.
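One detail worth getting right when scripting steps like these is that the database must be taken out of backup mode even if the split fails; otherwise Informix commits stay blocked or Oracle keeps doing its extra logging. A minimal sketch of that guarantee follows. The three callables are hypothetical placeholders for whatever commands your environment uses.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def backup_mode(enter: Callable[[], None], leave: Callable[[], None]) -> Iterator[None]:
    """Hold the application in backup mode for the duration of the with-block."""
    enter()
    try:
        yield
    finally:
        # Runs whether the body succeeded or raised.
        leave()


def split_with_quiesce(enter: Callable[[], None],
                       split: Callable[[], None],
                       leave: Callable[[], None]) -> None:
    """Enter backup mode, split the mirror, and always leave backup mode."""
    with backup_mode(enter, leave):
        split()
```

Keeping only the split inside the `with` block also keeps the time spent in backup mode as short as possible; the import, mount, and tape backup happen afterward.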
The details of how these steps are accomplished are discussed next.
As shown in Figure 4-8, you must establish (reattach) the backup mirror to the primary disk set (1), causing the mirror to copy to the backup mirror any regions that have been modified since the backup mirror was last established (2).
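To make the idea of copying only modified regions concrete, here is a toy model of a primary disk set and its backup mirror. It isn't how any array implements this internally; the `MirrorPair` class and its block-level "dirty set" are assumptions made purely for illustration. Regions written on either side while the mirror is split are marked, and only those are refreshed from the primary on the next establish.

```python
class MirrorPair:
    """Toy model of a primary disk set and a backup mirror that starts out split."""

    def __init__(self, blocks):
        self.primary = list(blocks)
        self.mirror = list(blocks)
        self.split = True
        self.dirty = set()  # indexes that differ (or may differ) between the copies

    def write_primary(self, index, value):
        self.primary[index] = value
        if self.split:
            self.dirty.add(index)       # remember it for the next establish
        else:
            self.mirror[index] = value  # established: the write is mirrored

    def write_mirror(self, index, value):
        if not self.split:
            raise RuntimeError("mirror is established; it can't be written directly")
        self.mirror[index] = value
        self.dirty.add(index)           # mirror-side changes are refreshed too

    def incremental_establish(self):
        """Copy only the marked regions from primary to mirror; return how many."""
        copied = len(self.dirty)
        for index in self.dirty:
            self.mirror[index] = self.primary[index]
        self.dirty.clear()
        self.split = False
        return copied

    def split_mirror(self):
        self.split = True
```

With four blocks, one primary write, and one mirror-side write while split, `incremental_establish()` copies two blocks rather than four, which is the whole appeal of the incremental form on a multi-terabyte disk set.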
The following list shows how this works for the various products:
This functionality is available on the RAID Array 8000 (RA8000) and the Enterprise Storage Array 12000 (ESA12000) using HSG80 controllers in switch or hub configurations. The SANworks Enterprise Volume Manager and Command Scripter products support Windows NT/2000, Solaris, and Tru64.
Although Compaq uses the term BCV to refer to a set of disks that comprise a backup mirror, their arrays don't have any commands that interact with the entire BCV as one entity. All establishing and splitting of mirrors takes place on the individual disk level. Therefore, if you have a striped set of mirrored disks to which you want to assign a third mirror, you need to assign a third disk to each set of mirrored disks that make up that stripe. In order to do this, you issue the following commands:
First, the nopolicy flag tells the mirror not to add members to the disk group until you do so manually. Then you add a member to the mirrorset by specifying that it has one more member than it already has. (The number 3 in this example assumes that there was already a mirrored pair to which you are adding a third member.) Then, you specify the name of the disk that is to be that third mirror.
Once this relationship is established for each disk in the stripe, it will take some time to copy the data from the existing mirror to the backup mirror. To check the status of this copy, issue the command
These commands can be issued via a terminal connected directly to the array, or via the Compaq Command Line Scripter tool discussed earlier.
On EMC, establishing the BCV (i.e., backup mirror) to the standard (i.e., primary disk set) is done with the symbcv establish command that is part of the EMC Timefinder package. (Timefinder is available on both Unix and Windows.) When issuing this command, you need to tell it which BCV to establish. Since a BCV is actually a set of devices that are collectively referred to as "the BCV," EMC uses the concept of device groups to tell Timefinder which BCV devices to synchronize with which standard devices. Therefore, prior to issuing the symbcv establish command, you need to create a device group that contains all the standards and the BCV devices to which they are associated. In order to establish the BCV to the standard, you issue the following command:
# symbcv establish -g
The -g option specifies the name of the device group that you created above. If the BCV has been previously established and split, you can also specify the -i flag that tells Timefinder to perform an incremental establish. This tells Timefinder to look at both the BCV and the standard devices and copy over only those regions that have changed since the BCV was last established. It even allows you to modify the BCV while it's split. If you modify any regions on the BCV devices (such as when you overwrite the private regions of each device with Veritas Volume Manager so that you can import them to another system), those regions will also be refreshed from the standard, even if they have not been changed on the primary disk set.
Once the BCV is established, you can check the progress of the synchronization with the symbcv verify -g device_group command. This shows the number of "BCV invalids" and "standard invalids" that still have to be copied. It also sometimes lists a percentage complete column, but I have not found that column to be particularly reliable.
On HDS, establishing the S-VOL (i.e., backup mirror) to the primary mirror is done with the paircreate command that is part of the HDS Shadowimage package. (Shadowimage is available on both Unix and NT.) When issuing this command, you need to tell it which secondary volume (S-VOL) to establish. Since an S-VOL is actually a pool of devices that are collectively referred to as the "reserve pool," HDS uses the concept of groups to tell Shadowimage which S-VOL devices to synchronize with which primary volumes (P-VOL). Therefore, prior to issuing the paircreate command, you need to create a device group that contains all the primary volumes and the S-VOL devices to which they are associated.
In order to establish (i.e., synchronize) the S-VOL to the P-VOL, issue the following command:
# paircreate -g
If the S-VOL has been previously established and split, you can instead use the pairresync command, which tells Shadowimage to perform a resynchronization of the pairs. Shadowimage then applies to the S-VOL the writes that were made to the P-VOL (and logged in cache) while the S-VOL was split. It even allows you to modify the S-VOL while it's split. If you modify any regions on the S-VOL devices (such as when you overwrite the private regions of each device with Veritas Volume Manager so that you can import them to another system), those regions are also refreshed from the primary, even if they have not been changed on the primary mirror.
Once the S-VOL is established, you can check the progress of the synchronization with the pairdisplay or -m all command. This shows you the number of "transition volumes" that still have to be copied and the percentage of copying already done.
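Whichever array you use, waiting for the establish to complete comes down to polling the vendor's status command until nothing remains to be copied. The sketch below deliberately doesn't parse any vendor output; the `remaining` callable is a hypothetical hook that you would implement around the status command for your array, and the default interval and timeout are arbitrary.

```python
import time
from typing import Callable


def wait_for_sync(remaining: Callable[[], int],
                  poll_seconds: float = 30.0,
                  timeout_seconds: float = 4 * 3600.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> int:
    """Poll until remaining() reports 0 regions left; return the number of polls.

    Raises TimeoutError if the mirror is still not synchronized at the deadline.
    """
    deadline = clock() + timeout_seconds
    polls = 0
    while True:
        polls += 1
        if remaining() == 0:
            return polls
        if clock() >= deadline:
            raise TimeoutError("backup mirror did not synchronize in time")
        sleep(poll_seconds)
```

The timeout matters in practice: a script that waits forever on a stalled establish will silently skip that night's backup, whereas one that raises can alert someone.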
As shown in Figure 4-9, once the backup mirror is fully synchronized with the primary disk set, you have to tell the data server to put the database in backup mode (1). In most client-server backup environments, this is normally accomplished by telling the backup software to run a script prior to a backup. The problem here is the backup client, where the script is normally run, isn't the host where the script needs to run. The client in Figure 4-9 is actually the Backup Server B, not Data Server. This means that you need to use something like rsh or ssh to pass the command from Backup Server B to Data Server. There are now ssh servers available for both Unix and Windows. The commands you need to run on Data Server will obviously depend on the application.
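Passing a command from Backup Server B to Data Server over ssh can be sketched as follows. The helper names `remote_argv` and `run_remote` are hypothetical, key-based authentication is assumed to be set up already, and the quoting done by `shlex.join` targets a POSIX shell on the remote side, so a Windows data server would need different quoting.

```python
import shlex
import subprocess
from typing import List, Optional


def remote_argv(host: str, command: List[str], user: Optional[str] = None) -> List[str]:
    """Build the local argv that runs `command` on `host` via ssh."""
    target = f"{user}@{host}" if user else host
    # BatchMode=yes makes ssh fail rather than prompt for a password,
    # which is what an unattended backup job needs.
    return ["ssh", "-o", "BatchMode=yes", target, shlex.join(command)]


def run_remote(host: str, command: List[str], user: Optional[str] = None,
               timeout: float = 300.0) -> subprocess.CompletedProcess:
    """Run the command remotely; raise CalledProcessError on a nonzero exit status."""
    return subprocess.run(remote_argv(host, command, user), check=True,
                          timeout=timeout, capture_output=True, text=True)
```

For example, `run_remote("dataserver", ["onmode", "-c", "block"], user="informix")` would issue the Informix block command described below, where the host and user names are placeholders.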
Here are the steps for Exchange, Informix, Oracle, and SQL Server:
net stop "Microsoft Exchange Directory" /y
net stop "Microsoft Exchange Event Service" /y
net stop "Microsoft Exchange Information Store" /y
net stop "Microsoft Exchange Internet Mail Service" /y
net stop "Microsoft Exchange Message Transfer Agent" /y
net stop "Microsoft Exchange System Attendant" /y
Informix is relatively easy. All you have to do is specify the appropriate environment variables and issue the command onmode -c block. Unlike Oracle, however, once this command is issued, all commits will hang until the onmode -c unblock command is issued.
Putting Oracle databases in backup mode is no easy task. You need to know the names of every tablespace, and place each tablespace into backup mode using the command alter tablespace tablespace_name begin backup. Many people create a script to automatically discover all the tablespaces and place each in backup mode. Putting the tablespaces into backup mode causes a minor performance hit, but the database will continue to function normally.
net stop MSSQLSERVER
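For Oracle, the tablespace-discovery script mentioned above can be sketched as a small generator of SQL statements to feed to sqlplus. The function name is hypothetical. The discovery query uses Oracle's `dba_tablespaces` view, and excluding temporary tablespaces is my assumption about what you would want; check both against your Oracle version before relying on them.

```python
import re
from typing import Iterable, List

# Query to run in sqlplus to discover the tablespace names (assumption: temporary
# tablespaces are excluded because they aren't placed in backup mode).
DISCOVERY_QUERY = (
    "SELECT tablespace_name FROM dba_tablespaces WHERE contents <> 'TEMPORARY';"
)

# Accept only simple unquoted identifiers, since names are pasted into SQL text.
_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def backup_mode_statements(tablespaces: Iterable[str], mode: str) -> List[str]:
    """Return one ALTER TABLESPACE statement per tablespace; mode is 'begin' or 'end'."""
    if mode not in ("begin", "end"):
        raise ValueError("mode must be 'begin' or 'end'")
    statements = []
    for name in tablespaces:
        if not _NAME.match(name):
            raise ValueError(f"unexpected tablespace name: {name!r}")
        statements.append(f"ALTER TABLESPACE {name} {mode.upper()} BACKUP;")
    return statements
```

The same list of names should be used for both the `begin` pass before the split and the `end` pass after it, so that no tablespace is left in backup mode.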
As shown in Figure 4-9, once the database is put into backup mode, the backup mirror is split from the primary disk set (2).
This split requires a number of commands on the Compaq array. First, you set the nopolicy flag with the command nopolicy. Then you split the mirror by issuing the command each disk in the BCV. Then you create a unit name for each disk with the command add Finally, you make that unit visible to the backup server by issuing the commands set unit disable_access_path=all and set unit
To do this on EMC, run the command symbcv split -g on the backup server. To do this on Hitachi, run the command pairsplit -g (On both EMC and Hitachi, the backup mirror devices are made visible to the backup server as part of the storage array's
Now that the backup mirror is split, you can take the database out of backup mode. Here are the details of taking the databases out of backup mode:
To start Exchange automatically after splitting the mirror, place the following series of commands in a batch file and run it:
net start "Microsoft Exchange Directory" /y
net start "Microsoft Exchange Event Service" /y
net start "Microsoft Exchange Information Store" /y
net start "Microsoft Exchange Internet Mail Service" /y
net start "Microsoft Exchange Message Transfer Agent" /y
net start "Microsoft Exchange System Attendant" /y
Again, Informix is relatively easy. All you have to do is specify the appropriate environment variables and issue the command onmode -c unblock. Any commits that were issued while the database was blocked will now complete. If you perform the block, split, and unblock fast enough, the users of the application will never realize that commits were frozen for a few seconds.
Again, you must determine the name of each tablespace in a particular ORACLE_SID, and use the command alter tablespace tablespace_name end backup to take each tablespace out of backup mode.
To start SQL Server automatically after splitting the mirror, run the following command:
net start MSSQLSERVER
This step is the most OS-specific and complicated step, since it involves using OS-level volume manager and filesystem commands. However, the prominence of the Veritas Volume Manager makes this a little simpler. Veritas Volume Manager is now available for HP-UX, Solaris, NT, and Windows 2000. Also, the native volume manager for Tru64 is an OEM version of Volume Manager.
On the data server, the logical volumes are mounted as filesystems (or drives) or used as raw devices on Unix for a database. As shown in Figure 4-9, you need to figure out what backup mirror devices belong to which disk group, import those disk groups to the backup server (3), activate the disk groups (which turns on the logical volumes), and mount the filesystems if there are any. This is probably the most difficult part of the procedure, but it's possible to automate it.
It isn't possible to assign drive letters to devices via the command line in NT and Windows 2000 without the full-priced version of the Veritas Volume Manager. Therefore, the steps in the list that follows assume you are using this product. Except where noted, the commands discussed next should work roughly the same on both Unix and NT systems. If you don't want to pay for the full-priced volume manager product, there is an alternate, although much less automated, method discussed at the end of the Veritas-based steps.
You can write a script that discovers the names of the disk groups on the backup mirror and uses the command vxdg -n newname -t import volume-group to import the disk/volume groups on the backup server. (The -t option specifies that the new name is only temporary.) The following list is a brief summary of how this works. It's not meant to be an exact step-by-step guide; it gives you an overall picture of what's involved in discovering and mounting the backup mirror.
First, you need a list of disks that are on the primary disk set. Compaq's show disks command, EMC's inq command, and Hitachi's raidscan command provide this information.
To find out which disk group each disk belongs to, run the command vxdisk -s list devicename on Unix, or the equivalent Volume Manager command with disk_name on Windows.
You now have a list of disk groups that can be imported from the backup mirror.
Import each disk group with the vxdg -n newname -t import disk-group command.
A vxrecover is necessary on Unix to activate the volumes on the disk.
On Unix, you may also need to mount the filesystems found on the disk. On NT/2000, the Volume Manager takes care of assigning drive letters to (i.e., mounting) the drives.
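The Unix side of the steps above can be sketched as a small planning function in Python. This is only an illustration under stated assumptions: it builds the command lines but does not run them, the function name import_plan and the "_bk" naming suffix are hypothetical, the vxdg invocation assumes the form vxdg -n newname -t import diskgroup, and the vxrecover -g diskgroup -s form is an assumption that should be checked against your Volume Manager release.

```python
from typing import Dict, List


def import_plan(disk_to_group: Dict[str, str], suffix: str = "_bk") -> List[List[str]]:
    """Build the command sequence that imports each disk group exactly once.

    disk_to_group maps a backup mirror device name to the disk group that
    `vxdisk -s list <device>` reported for it. Several devices usually share
    one group, so the groups are de-duplicated before planning.
    """
    plan: List[List[str]] = []
    for group in sorted(set(disk_to_group.values())):
        temp_name = group + suffix  # hypothetical temporary name
        # Import the group under a temporary name on the backup server.
        plan.append(["vxdg", "-n", temp_name, "-t", "import", group])
        # Assumed flags: start the volumes in the newly imported group.
        plan.append(["vxrecover", "-g", temp_name, "-s"])
    return plan
```

Returning argument lists rather than shell strings keeps the plan easy to inspect and lets a caller hand each entry to subprocess.run without extra quoting. Mounting the filesystems is left out because mount points are site-specific.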
As mentioned before, if you are running Windows and don't wish to pay for the full-priced version of Volume Manager, you can't assign the drive letters via the command line. Since it isn't reasonable to expect you to go to the GUI every night during backups, an alternative method is to perform the following steps manually each time there is a configuration change to the backup mirror:
Split the backup mirror as described in the previous section. This makes the drives accessible to the backup server.
Start the Computer Management GUI in Windows 2000 and later (or the Disk Administrator GUI in NT), and tell it to find and assign drive letters to new disk drives.
A reboot may be necessary to effect the changes.
After the drive letters have been assigned, you may establish and split the backup mirror at will. However, bad things are liable to happen if you reboot the server (or try to access the backup mirror's disks) while the backup mirror is established.
This is the easy part. Define a backup configuration to back up the filesystems, drives, or raw devices that were found on the backup mirror. Since the volumes are actually imported and/or mounted on the backup server, the backup server should be listed as the client for this backup configuration. As shown in Figure 4-9, the data is sent from the backup mirror to the backup server (4a), and then to tape (4b).
The transaction logs should be backed up to disk and to tape (as shown in Figure 4-7) before, during, and after this operation.
Nobody cares if you can back up, only if you can recover. How do you recover from such a convoluted setup? This is really where the beauty of having the backup mirror stay split and mounted comes in. Unless the entire storage array is destroyed, the backup mirror is available for an immediate restore. However, let's start with the worst-case scenario: the entire storage array was destroyed and has been replaced.
As shown in Figure 4-10, the worst case is a catastrophic failure of the storage array that destroys the primary disk set, the backup mirror, and the transaction log backup disk. Because of the extreme amount of proactive monitoring most enterprise storage arrays have, this is rather unlikely to happen. In fact, with the backup mirror left split most of the time, about the only probable way for both the backup mirror and the primary disk set to be damaged is complete destruction of the physical unit, such as by fire. The first step in this recovery is that the storage array must, of course, be repaired or reinstalled.
Once the storage array is fully functional, you need to restore the backup mirror from tape as shown in Figure 4-11 (1a-b). While this is going on, the transaction log backups can also be restored from tape to disk (2a, b, and c). This allows them to be immediately available once the backup mirror is restored.
Once the recovery of the backup mirror has been completed from tape, we can move on to the next step. (Of course, if you don't have to recover the backup mirror from tape, you can start with the next step.)
This recovery is much more likely to occur and is needed under the following three circumstances:
If you lost the entire storage array, repaired it, and recovered the backup mirror from tape (as discussed in the previous section)
If you lost both sides of the primary disk set but did not lose the backup mirror
If the database or filesystem residing on the primary disk set was logically corrupted but the backup mirror was split at the time
The process now is to recover the primary disk set from the backup mirror, replay the transaction logs, and bring the database online.
By running the Compaq EVM GUI you can easily connect the third mirror back to the primary mirror and restore the data to the first mirror set. If you want to do this via the command line, use the unmirror command to turn off the primary mirrors. Then use the mirror command to create a one-way mirror of each disk in the BCV, followed by set nopolicy and set members=[n+2]. Now issue the set command for each disk from the primary mirror. This places the primary disks as additional disks in the temporary mirrorsets created for recovery, causing the data on the backup mirror disks to be copied to the primary disks. Then run the commands add unit disk-name and set unit to grant the primary server access to the new mirror. Once the copy has completed, you can remove the additional mirror with the reduce mirror command.
To recover the primary disk set (standard) from the backup mirror (BCV), you must tell the backup mirror application to do so. With EMC, you issue the command symbcv restore -g device_group, which begins copying the BCV to the standard as shown in Figure 4-12. With EMC, the moment this command is executed, the primary disk set appears to have been restored and is immediately accessible. How does this work? Some people seem to think that the backup mirror is simply mounted as the primary disk set during a restore. That isn't what happens at all. The backup mirror isn't visible to the data server at any time, so this isn't even possible.
Take a look at Figure 4-12 and assume that some time has passed. The data in block "A" has been copied to block "B" (2), but the rest of the data on the BCV has not yet been copied. If the application asks for data that has already been restored from the BCV, it receives it from the standard (3). If the application requests data that has not yet been copied (4), TimeFinder forwards the request to the BCV (5). The data is then presented to the application as if it were already on the standard.
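The read path during such a restore can be illustrated with a toy model in Python. This is a sketch of the general idea only, not of TimeFinder's actual implementation: the class RestoringVolume and its methods are hypothetical, blocks are represented as strings, and writes issued during the restore are not modeled. It assumes that blocks already copied are served by the standard and that all other reads are forwarded to the BCV.

```python
from typing import List, Optional, Tuple


class RestoringVolume:
    """Toy model of a standard volume being restored from a BCV."""

    def __init__(self, bcv_blocks: List[str]) -> None:
        self.bcv = list(bcv_blocks)
        # None marks a block that has not yet been copied to the standard.
        self.standard: List[Optional[str]] = [None] * len(bcv_blocks)

    def copy_next(self) -> bool:
        """Background copy: restore the first uncopied block, if any remains."""
        for index, block in enumerate(self.standard):
            if block is None:
                self.standard[index] = self.bcv[index]
                return True
        return False

    def read(self, index: int) -> Tuple[str, str]:
        """Return (data, source) for an application read."""
        block = self.standard[index]
        if block is not None:
            return block, "standard"
        # Not restored yet: forward the request to the BCV.
        return self.bcv[index], "bcv"
```

The application calls read() the same way before, during, and after the copy, which is why the primary disk set appears restored the moment the restore command is issued.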
To recover the primary mirror from the S-VOL, tell the ShadowImage application to do so. With HDS, issue the command pairresync -restore -g device_group, which begins copying the S-VOL to the P-VOL. In this way, HDS and EMC are similar; the moment this command is executed, the primary mirror appears to have been restored and is immediately accessible.
Since the disk-based transaction log dumps were recovered in a previous step, you can now start the transaction log restore from the disk-based transaction log backups (see Figure 4-13).
Here's an overview of how to do this for Exchange, Informix, Oracle, and SQL Server:
Since you can't restore any transaction logs, you will simply need to restart the Exchange services after the restore. You can do this via the Exchange GUI.
First, you need to tell Informix you've performed what it calls an "external recovery" of the chunks. To do this, issue the command onbar -r -e -p, which tells Informix to examine the chunks to make sure they've been recovered and to know which logical log it needs to start with. Once this completes successfully, you can tell Informix to perform the logical log recovery with the command onbar -r -l. (You can perform both steps with one command (onbar -r -e), but I prefer to perform each step separately. The -p option of the onbar -r -e -p command tells it to perform only the physical recovery.)
Oracle recoveries can be quite complex and are covered in detail elsewhere in this book. Assuming that you have done a restore of one or more datafiles from the backup mirror to the primary disk set, the following commands replay the redo logs:
$ sqlplus /nolog
> connect /as sysdba
> startup mount ;
> recover database ;
With SQL Server, you can't recover transactions that have occurred since the split-mirror backup was taken. If you need point-in-time recovery, use another backup method.
Once the transaction log recovery is complete, the database is fully online, whether or not the backup mirror has been fully restored to the primary disk set (see Figure 4-14).
Isn't that recovery scenario a beautiful thing? Imagine recovering a 10-TB database in under a minute. If you need the ultimate in instant recovery, it's difficult to beat the concept of client-free backup and recovery.
Some people love to have an entire copy of their database or file server sitting there at all times just waiting for an instant restore. Others feel that this is a waste of money. All that extra disk can cost quite a bit. They like the idea of client-free backups—backing up the data through a system other than the system that is using the data. However, they don't want to pay for another set of disks to store the backup mirror. What do they do?
The first thing some do is to share the backup mirror. Whether the backup mirror is in an enterprise storage array or a group of JBOD that is made available on the SAN for that purpose, it's possible to establish this backup mirror to more than one primary disk set. Obviously you can't back up more than one primary disk set at one time using this method, but suppose you have primary disk sets A and B. Once you finish establishing, splitting, and backing up primary disk set A, you can establish the backup mirror to primary disk set B, split it, then back it up again. Many people rotate their backup mirrors this way.
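The rotation described above amounts to a strictly sequential loop, because the shared backup mirror can be attached to only one primary disk set at a time. A minimal Python sketch follows; the function rotate_backup_mirror and the run callable are hypothetical stand-ins for whatever establish, split, and backup commands your array and backup product provide.

```python
from typing import Callable, Iterable


def rotate_backup_mirror(
    primary_sets: Iterable[str],
    run: Callable[[str, str], None],
) -> None:
    """Back up several primary disk sets through one shared backup mirror.

    `run(step, primary)` is a hypothetical callable that performs one step
    ("establish", "split", or "backup") against the named primary disk set.
    Each set is finished completely before the mirror moves to the next one.
    """
    for primary in primary_sets:
        for step in ("establish", "split", "backup"):
            run(step, primary)
```

One consequence worth noting: once the mirror is established to primary disk set B, it no longer holds a copy of A, so only the most recently backed-up set has a disk copy available for immediate restore.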
However, some storage vendors have a solution that's less expensive than having a separate backup mirror for each primary disk set and is more elegant than rotating a single backup mirror between multiple primary disk sets: snapshots.
Again, let's start with an enterprise storage array or a group of JBOD being managed by an enterprise volume manager. Recall from the previous section, Section 4.2, that some volume managers and storage arrays can create a virtual copy, or a snapshot. To another host that can see it, it looks the same as a backup mirror. It's a static "picture" of what the volumes looked like at a particular time.
Creating the snapshot is equivalent to the step where you split off the backup mirror. There is no step where you must establish the snapshot. You simply place the application into a backup status, create the snapshot, and then take the application out of backup status. After that, the storage array or enterprise volume manager software makes the device visible to another host. You then perform the same steps as outlined in the previous section, Section 4.3.2, for importing and mounting the volumes on the backup server. Once that is accomplished, you can back it up just like any other device.
Similar to the backup mirror backup, the snapshot can typically be left around in case of recovery. However, most snapshot recoveries aren't going to be as fast as the backup mirror recovery discussed earlier. Although the data can still be copied from disk to disk, the recovery may not be as instantaneous. However, since this section of the industry is changing so fast, this may no longer be the case by the time you read this.
It should also be noted that, unlike the backup mirror, the snapshot still depends on the primary disk set's disks. Since it's merely a virtual copy of the disks, if the original disks are physically damaged, the snapshot becomes worthless. Therefore, snapshots provide quick recovery only in the case of logical corruption of the data, since the snapshot still contains a noncorrupted copy of the data.
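To see why a snapshot survives logical corruption but not physical loss, consider the following toy model in Python. It assumes a copy-on-write design, which is one common way to build snapshots but is not attributed here to any particular vendor. The class names PrimaryDiskSet and Snapshot, and the failed flag used to simulate physical damage, are hypothetical.

```python
from typing import Dict, List


class PrimaryDiskSet:
    """Primary disks that preserve old blocks for snapshots before overwriting."""

    def __init__(self, blocks: List[str]) -> None:
        self.blocks = list(blocks)
        self.snapshots: List["Snapshot"] = []
        self.failed = False  # set to True to simulate physical damage

    def create_snapshot(self) -> "Snapshot":
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index: int, data: str) -> None:
        # Copy-on-write: save the old block for each snapshot first.
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data


class Snapshot:
    """Virtual copy: stores only the blocks that changed after it was taken."""

    def __init__(self, primary: PrimaryDiskSet) -> None:
        self.primary = primary
        self.saved: Dict[int, str] = {}

    def preserve(self, index: int) -> None:
        # Keep only the first (point-in-time) version of each block.
        self.saved.setdefault(index, self.primary.blocks[index])

    def read(self, index: int) -> str:
        if index in self.saved:
            return self.saved[index]
        if self.primary.failed:
            raise IOError("primary disk set lost; block was never copied")
        # Unchanged blocks are still read from the primary disks.
        return self.primary.blocks[index]
```

In this model, overwriting a block with bad data leaves the snapshot's view intact, while marking the primary as failed makes every unchanged block unreadable through the snapshot.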