50 Configuration and Tuning GPFS for Digital Media Environments
Figure 2-11 Relationship and workflow between components, functions, and data access
2.2.7 Replication (disk failure groups)
As mentioned earlier, GPFS is a journaling file system (it logs all
metadata activity). To keep the file system online (mounted) and consistent,
several protection mechanisms are used:
- Data and metadata replication
- Log file replication
- Disk failure groups
Disk failure groups are used to keep the file system (not the cluster!)
online by keeping a quorum of file system descriptors (disks) available. This is
similar to the AIX Logical Volume Manager, where the information about a volume
group is stored in the Volume Group Descriptor Area (VGDA), and a quorum of
VGDAs must be available for the VG to be brought online or to remain
varied on.
GPFS reserves a storage area for the file system descriptor (FSDesc) on every
disk that is part of a file system, but by default writes the FSDesc to only three
of those disks (if available).
[Figure 2-11 diagram: nodes A, B, C, and D each run mmfsd; the labels FS Mgr. and Token Mgr. appear with node A, MetaNode with node B, and CfgMgr with node D; nodes A, B, and C hold a lease; steps 1 to 4 mark a request for accessing the file a.mpg.]
Chapter 2. Introduction to GPFS 51
This behavior poses a problem: if multiple disks fail, the file
system may be unmounted due to the lack of FSDesc quorum (even though the
total number of disks still available for that file system is larger than 50%+1).
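The quorum problem can be illustrated with a minimal Python sketch. It assumes that "quorum" means a simple majority of the disks holding an FSDesc copy; the disk names, the 10-disk file system, and the function `fsdesc_quorum_held` are hypothetical and are not part of GPFS.

```python
def fsdesc_quorum_held(fsdesc_disks, failed_disks):
    """Return True if a majority of the descriptor-holding disks survive.

    Assumption: quorum is a simple majority of the FSDesc copies.
    """
    surviving = [d for d in fsdesc_disks if d not in failed_disks]
    return len(surviving) > len(fsdesc_disks) // 2


# Hypothetical file system with 10 disks; FSDesc is written on three of them.
all_disks = [f"disk{i}" for i in range(1, 11)]
fsdesc_disks = ["disk1", "disk2", "disk3"]

# Two failures that both happen to hit descriptor-holding disks.
failed = {"disk1", "disk2"}

available = len(all_disks) - len(failed)
print(f"{available} of {len(all_disks)} disks still available")
print("FSDesc quorum held:", fsdesc_quorum_held(fsdesc_disks, failed))
```

In this sketch, 8 of 10 disks remain available, yet only one of the three descriptor copies survives, so the descriptor quorum is lost.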
To solve this problem, GPFS uses disk failure groups. The characteristics of the
failure groups are:
- A failure group is identified by a number assigned to each disk that belongs
to it
- GPFS uses failure groups during metadata and data placement on the disks of a
file system
- They ensure that no two replicas of the same block become unavailable due to
a single failure
- They are set by GPFS by default
- They can also be defined either at NSD creation time (mmcrnsd/mmcrvsd
commands) or later (mmchdisk command)
- The syntax for the disk descriptor file is the following:
DiskName:PrimaryNSDServer:BackupNSDServer:DiskUsage:FailureGroup:NSDName
- It is important to set failure groups correctly to have proper and effective file
system replication (metadata and/or data)
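To make the descriptor layout concrete, the following Python sketch splits one descriptor line into the six colon-separated fields named above. The class `DiskDescriptor`, the function `parse_descriptor`, and the sample values (disk, node, and NSD names) are illustrative assumptions; only the field order comes from the syntax line.

```python
from typing import NamedTuple


class DiskDescriptor(NamedTuple):
    """One line of a disk descriptor file, in the field order given above."""
    disk_name: str
    primary_nsd_server: str
    backup_nsd_server: str
    disk_usage: str
    failure_group: str  # kept as text; an empty field is passed through as-is
    nsd_name: str


def parse_descriptor(line: str) -> DiskDescriptor:
    """Split a descriptor line on ':' and check that it has six fields."""
    fields = line.strip().split(":")
    if len(fields) != 6:
        raise ValueError(f"expected 6 colon-separated fields, got {len(fields)}")
    return DiskDescriptor(*fields)


# Hypothetical descriptor: all names here are made up for illustration.
sample = "hdisk2:nodeA:nodeB:dataAndMetadata:1:nsd_hdisk2"
desc = parse_descriptor(sample)
print(desc.disk_name, "is in failure group", desc.failure_group)
```

A check like this can catch a missing field before the descriptor file is passed to the NSD creation commands.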
The following paragraphs explain in more detail considerations for choosing the
hardware and the failure groups to provide maximum availability for your file
system.
Data replication
The GPFS replication feature allows you to specify how many copies of a file to
maintain. File system replication assures that the latest updates to critical data
are preserved in the event of hardware failure. During configuration, you assign a
replication factor to indicate the total number of copies you want to store,
currently only two copies are supported.
Replication allows you to set different levels of protection for each file or one level
for an entire file system. Because replication uses additional disk capacity and
puts additional load on the available write throughput, you might want to replicate
only file systems that are frequently read from but seldom written to, or replicate
only metadata. In addition, only the primary copy is used for reading, unlike,
for instance, RAID 1, so replication brings no read performance increase.
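The capacity and write cost of replication can be estimated with simple arithmetic. The sketch below assumes a simplified model in which every replicated block is written once per copy and metadata overhead is ignored; the function `replication_cost` and the 20 TB figure are hypothetical.

```python
def replication_cost(raw_capacity_tb, copies):
    """Estimate usable capacity and disk writes per application write.

    Simplified model: each block is stored `copies` times and each
    application write turns into `copies` disk writes.
    """
    usable_tb = raw_capacity_tb / copies
    disk_writes_per_app_write = copies
    return usable_tb, disk_writes_per_app_write


# Hypothetical 20 TB of raw disk with two copies of all data.
usable, writes = replication_cost(20, 2)
print(f"usable capacity: {usable} TB, disk writes per application write: {writes}")
```

Under this model, two copies halve the usable capacity and double the disk write traffic, while read throughput stays the same because only the primary copy is read.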
Failure groups
GPFS failover support allows you to organize your hardware into a number of
failure groups to minimize single points of failure. A failure group is a set of disks
that share a common point of failure that could cause them all to become
unavailable simultaneously, such as a disk enclosure, a RAID controller, or a
storage unit (for example, the DS4500).
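As a rough illustration of this definition, the Python sketch below gives every disk behind the same controller the same failure group number, then checks whether a pair of replicas survives the loss of one group. The helper names, disk names, and controller names are hypothetical; GPFS performs its own replica placement, so this only models the idea.

```python
from itertools import count


def assign_failure_groups(disk_to_controller):
    """Give all disks that share a controller the same failure group number."""
    numbers = count(1)
    group_of_controller = {}
    groups = {}
    for disk, controller in disk_to_controller.items():
        if controller not in group_of_controller:
            group_of_controller[controller] = next(numbers)
        groups[disk] = group_of_controller[controller]
    return groups


def replicas_survive(groups, replica_disks, failed_group):
    """Return True if at least one replica lies outside the failed group."""
    return any(groups[d] != failed_group for d in replica_disks)


# Hypothetical layout: two controllers with two disks each.
layout = {"nsd1": "ctrlA", "nsd2": "ctrlA", "nsd3": "ctrlB", "nsd4": "ctrlB"}
groups = assign_failure_groups(layout)
print(groups)

# Replicas in different failure groups survive the loss of group 1.
print(replicas_survive(groups, ["nsd1", "nsd3"], failed_group=1))
# Replicas in the same failure group do not.
print(replicas_survive(groups, ["nsd1", "nsd2"], failed_group=1))
```

The second check fails because both replicas sit behind the same controller, which is the situation that correctly assigned failure groups are meant to prevent.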
