How Linux Organizes Data

To make the most effective use of your Linux system, you must understand how Linux organizes data. If you’re familiar with Windows or another operating system, you’ll find this easy to learn, because most operating systems organize their data in similar ways. This section explains how Linux organizes data and introduces you to several important Linux commands that work with directories and files.

Devices

Linux receives data from, sends data to, and stores data on devices. A device generally corresponds to a hardware unit, such as a keyboard or serial port. However, a device may have no hardware counterpart: the kernel creates several pseudodevices that you can access as devices but that have no physical existence. Moreover, a single hardware unit may correspond to several devices. For example, Linux defines each partition of a disk drive as a distinct device. Table 4-1 describes some typical Linux devices; not every system provides all these devices, and some systems provide devices not shown in the table.

Table 4-1. Typical Linux Devices

| Device | Description |
| --- | --- |
| atibm | Bus mouse |
| audio | Sound card |
| cdrom | CD-ROM drive |
| console | Current virtual console |
| fdn | Floppy drive (n designates the drive; for example, fd0 is the first floppy drive) |
| ftape | Streaming tape drive supporting rewind |
| hdxn | Non-SCSI hard drive (x designates the drive and n designates the partition; for example, hda1 is the first partition of the first non-SCSI hard drive) |
| inportbm | Bus mouse |
| lpn | Parallel port (n designates the device number; for example, lp0 is the first parallel port) |
| modem | Modem |
| mouse | Mouse |
| nftape | Streaming tape drive not supporting rewind |
| nrftn | Streaming tape drive not supporting rewind (n designates the device number; for example, nrft0 is the first streaming tape drive) |
| nstn | Streaming SCSI tape drive not supporting rewind (n designates the device number; for example, nst0 is the first streaming SCSI tape drive) |
| null | Pseudodevice that accepts unlimited output |
| printer | Printer |
| psaux | Auxiliary pointing device, such as a trackball or the knob on IBM’s ThinkPad |
| rftn | Streaming tape drive supporting rewind (n designates the device number; for example, rft0 is the first streaming tape drive) |
| scdn | SCSI CD-ROM drive (n designates the drive; for example, scd0 is the first SCSI CD-ROM drive) |
| sdxn | SCSI hard drive (x designates the drive and n designates the partition; for example, sda1 is the first partition of the first SCSI hard drive) |
| srn | SCSI CD-ROM (n designates the drive; for example, sr0 is the first SCSI CD-ROM) |
| stn | Streaming SCSI tape drive supporting rewind (n designates the device number; for example, st0 is the first streaming SCSI tape drive) |
| ttyn | Virtual console (n designates the particular virtual console; for example, tty1 is the first virtual console) |
| ttySn | Modem (n designates the port; for example, ttyS0 is an incoming modem connection on the first serial port), serial device (such as a Palm Pilot), or some PCMCIA devices |
| zero | Pseudodevice that supplies an inexhaustible stream of zero-bytes |
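The hard drive naming pattern in Table 4-1 (a prefix, a drive letter, and a partition number) can be decoded mechanically. The following Python sketch is only an illustration: the helper name describe_disk_device is hypothetical, and it assumes the hdxn and sdxn conventions exactly as the table describes them, which may not match the device names on every system.

```python
import re

# Hypothetical helper for illustration; it understands only the hdxn and sdxn
# naming patterns described in Table 4-1.
_DISK_NAME = re.compile(r"^(hd|sd)([a-z])([1-9][0-9]*)$")


def describe_disk_device(name):
    """Return (bus, drive_number, partition_number), or None if the name does not fit."""
    match = _DISK_NAME.match(name)
    if match is None:
        return None
    prefix, drive_letter, partition = match.groups()
    bus = "SCSI" if prefix == "sd" else "non-SCSI"
    drive_number = ord(drive_letter) - ord("a") + 1  # a -> 1, b -> 2, ...
    return bus, drive_number, int(partition)


print(describe_disk_device("hda1"))  # ('non-SCSI', 1, 1)
print(describe_disk_device("sdb3"))  # ('SCSI', 2, 3)
print(describe_disk_device("null"))  # None: not a hard drive partition name
```

Here hda1 decodes to the first partition of the first non-SCSI hard drive, matching the example given in the table.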

Filesystems

Whether you’re using Windows or Linux, you must format a partition before you can store data on it. The installation procedure automatically formats the partitions you create during system installation. When Linux formats a partition, it writes special data, called a filesystem, on the partition. The filesystem organizes the available space and provides a directory that lets you assign a name to each file, which is a set of stored data. A filesystem also enables you to group files into directories, which function much like the folders you create using the Windows Explorer: directories store information about the files they contain.

Every CD-ROM and floppy diskette must also have a filesystem. The filesystem of a CD-ROM is written when the disk is created; the filesystem of a floppy diskette is rewritten each time you format it.

Windows 98 lets you format a partition as FAT or FAT32; Windows NT/2000 also support the NTFS filesystem type. Linux supports a wider variety of filesystem types; Table 4-2 summarizes the most common ones. The most important filesystem types are ext3 and ext2, which are used for Linux native partitions; msdos, which is used for FAT partitions (and floppy diskettes) of the sort created by MS-DOS and Microsoft Windows; and iso9660, which is used for CD-ROMs. Linux also provides the vfat filesystem, which is used for FAT32 partitions of the sort created by Windows 9x. In addition, Linux can read Windows NT/2000 NTFS filesystems; however, support for writing such partitions is not enabled in the standard Red Hat Linux kernel.

Table 4-2. Common Filesystem Types

| Filesystem | Description |
| --- | --- |
| coherent | A filesystem compatible with that used by Coherent Unix |
| ext | The predecessor of the ext2 filesystem; supported for compatibility |
| ext2 | The standard Linux filesystem |
| ext3 | The new standard journaling filesystem for Red Hat Linux |
| hpfs | A filesystem compatible with that used by IBM’s OS/2 |
| iso9660 | The standard filesystem used on CD-ROMs |
| minix | An old Linux filesystem, still occasionally used on floppy diskettes |
| msdos | A filesystem compatible with Microsoft’s FAT filesystem, used by MS-DOS and Windows |
| nfs | A filesystem compatible with Sun’s Network File System |
| ntfs | A filesystem compatible with that used by Microsoft Windows NT’s NTFS filesystem |
| reiserfs | A Linux filesystem designed for high-reliability, large-capacity storage systems |
| sysv | A filesystem compatible with that used by AT&T’s System V Unix |
| ufs | A filesystem used on BSD and Sun Solaris systems |
| vfat | A filesystem compatible with Microsoft’s FAT32 filesystem, used by Windows 9x |
| xenix | A filesystem compatible with that used by Xenix |
| xfs | A filesystem used on SGI systems |

The ext3 filesystem type is a new feature of Red Hat Linux 7.2; previous versions of Red Hat Linux were based on the ext2 filesystem type. An ext3 filesystem stores data in the same basic way as an ext2 filesystem; however, an ext3 filesystem includes a special journal that records changes to the filesystem. If the filesystem becomes corrupted—perhaps because the system was powered off rather than properly shut down—the journal can be used to recover data that might otherwise be lost. Moreover, an ext3 filesystem can be recovered more quickly than an ext2 filesystem. The combination of greater reliability and faster recovery is critically important when Linux is used to host a server with one or more large hard disks, but the combination is a convenience even for desktop users.
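Which of the filesystem types in Table 4-2 a particular kernel can use varies from system to system. On Linux, the kernel reports the types it currently supports in the file /proc/filesystems, one per line, with the word nodev in front of types that are not tied to a disk device. The Python sketch below parses text in that layout. The function name and the sample text are made up for illustration, and a real system will list different entries.

```python
def parse_filesystems(text):
    """Return the filesystem type names found in /proc/filesystems-style text."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            # The type name is always the last field; "nodev" may precede it.
            names.append(fields[-1])
    return names


# Made-up sample in the same layout (the real file separates fields with tabs,
# which split() treats the same as spaces).
SAMPLE = (
    "nodev   proc\n"
    "        ext3\n"
    "        ext2\n"
    "        iso9660\n"
    "        vfat\n"
)

print(parse_filesystems(SAMPLE))  # ['proc', 'ext3', 'ext2', 'iso9660', 'vfat']

# On a running Linux system, the real list could be read like this:
# with open("/proc/filesystems") as f:
#     print(parse_filesystems(f.read()))
```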

Directories and Paths

If you’ve used MS-DOS, you’re familiar with the concepts of files and directories and with various MS-DOS commands that work with them. Under Linux, files and directories work much as they do under MS-DOS.

Home and working directories

When you log in to Linux, you’re placed in a special directory known as your home directory. Generally, each user has a distinct home directory, where the user creates personal files. This makes it simple for the user to find files previously created, because they’re kept separate from the files of other users.

The working directory, or current working directory as it’s sometimes called, is the directory you’re currently working in. When you log in to Linux, your working directory is initialized as your home directory.
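A program can look up both directories. The short Python sketch below prints the home directory and the working directory of the user running it, then makes the home directory the working directory, much as logging in does. It assumes the home directory exists on the machine where it runs, and it skips the change if it does not.

```python
import os
from pathlib import Path

home = Path.home()   # the current user's home directory
start = Path.cwd()   # the working directory when the script started

print("home directory:   ", home)
print("working directory:", start)

if home.is_dir():
    os.chdir(home)   # make the home directory the working directory
    print("after chdir:      ", Path.cwd())
    os.chdir(start)  # restore the original working directory
```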

The directory tree

The directories of a Linux system are organized as a hierarchy. Unlike MS-DOS, which provides a separate hierarchy for each partition, Linux provides a single hierarchy that includes every partition. The topmost directory of the directory tree is the root directory, which is written using a forward slash (/), not the backward slash (\) used by MS-DOS to designate a root directory.

Figure 4-1 shows a hypothetical Linux directory tree. The root directory contains six subdirectories: /bin, /dev, /etc, /home, /tmp, and /usr. The /home directory has two subdirectories; each is the home directory of a user and has the same name as the user who owns it. The user named bill has created two subdirectories in his home directory: books and school. The user named patrick has created the single school subdirectory in his home directory.

Figure 4-1. A hypothetical Linux directory tree

Each directory (other than the root directory) is contained in a directory known as its parent directory. For example, the parent directory of the bill directory is home.
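The parent relationship can be explored with Python’s pathlib module, which manipulates pathnames without touching the disk. This is a small sketch using the directory names from Figure 4-1; note that pathlib reports the root directory as its own parent, since nothing lies above it.

```python
from pathlib import PurePosixPath

bill = PurePosixPath("/home/bill")

print(bill.parent)                # /home
print(bill.parent.parent)         # /
print(PurePosixPath("/").parent)  # / (nothing lies above the root directory)
```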

Tip

The root user has a special home directory, /root. This directory is commonly called “slash root” to distinguish it from the root directory, /.

Absolute and relative pathnames

If you look closely at Figure 4-1, you’ll see that two directories named school exist: one is a subdirectory of bill, and the other is a subdirectory of patrick. To avoid the confusion that could result when several directories have the same name, directories are specified using pathnames.

There are two kinds of pathnames: absolute and relative. The absolute pathname of a directory traces the location of the directory beginning at the root directory; you form the pathname as a list of directories, separated by forward slashes (/). For example, the absolute pathname of the unique directory named bill is /home/bill. The absolute pathname of the school subdirectory of the bill directory is /home/bill/school. The absolute pathname of the identically named school subdirectory of the patrick directory is /home/patrick/school.

When a subdirectory is many levels below the root directory, its absolute pathname may be long and cumbersome. In that case, it may be more convenient to use a relative pathname, which uses the current working directory, rather than the root directory, as its starting point. For example, suppose that the bill directory is the current working directory; you can refer to its books subdirectory by the relative pathname books. Notice that a relative pathname can never begin with a forward slash, whereas an absolute pathname must begin with a forward slash. As a second example, suppose that the /home directory is the current working directory. The relative pathname of the school subdirectory of the bill directory would be bill/school; the relative pathname of the identically named subdirectory of the patrick directory would be patrick/school.

Linux provides two special directory names. Using a single dot (.) as a directory name is equivalent to specifying the working directory. Using two dots (..) within a pathname takes you up one level in the current path, to the parent directory. For example, if the working directory is /home/bill, .. refers to the /home directory. Similarly, if the current working directory is /home/bill and the directory tree is that shown in Figure 4-1, the path ../patrick/school refers to the directory /home/patrick/school.
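The rules for absolute pathnames, relative pathnames, and the special names . and .. can be tried out with Python’s posixpath module. The sketch below is illustrative only: the resolve function is a hypothetical helper, and it works purely on the text of the pathname, so it never checks whether the directories exist.

```python
import posixpath


def resolve(working_directory, pathname):
    """Turn an absolute or relative pathname into a normalized absolute pathname."""
    # posixpath.join discards working_directory when pathname is already absolute;
    # posixpath.normpath then collapses any "." and ".." components.
    return posixpath.normpath(posixpath.join(working_directory, pathname))


print(resolve("/home/bill", "books"))                 # /home/bill/books
print(resolve("/home", "bill/school"))                # /home/bill/school
print(resolve("/home/bill", "."))                     # /home/bill
print(resolve("/home/bill", ".."))                    # /home
print(resolve("/home/bill", "../patrick/school"))     # /home/patrick/school
print(resolve("/home/bill", "/home/patrick/school"))  # /home/patrick/school
```

The last two calls name the same directory, once relative to /home/bill and once as an absolute pathname.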

File Permissions

Unlike Windows 98, but like other varieties of Unix and Windows NT/2000, Linux is a multiuser operating system. Therefore, it includes mechanisms to protect data from unauthorized access. The primary protection mechanism restricts access to directories and files based on the identity of the user who requests access and on access modes assigned to each directory and file.

Each directory and file has an associated user, called the owner. The user who creates a file initially becomes the owner of the file. Each user belongs to one or more sets of users known as groups. Each directory and file has an associated group, which is assigned when the directory or file is created.

Access permissions determine what operations a user can perform on a directory or file. Table 4-3 lists the possible permissions and explains the meaning of each. Notice that permissions work differently for directories than for files. For example, permission r denotes the ability to list the contents of a directory or read the contents of a file. A directory or file can have more than one permission. Only the listed permissions are granted; any other operations are prohibited. For example, a user who had file permission rw could read or write the file but could not execute it, as indicated by the absence of the execute permission, x.

Table 4-3. Access Permissions

| Permission | Meaning for a directory | Meaning for a file |
| --- | --- | --- |
| r | List the directory | Read contents |
| w | Create or remove files | Write contents |
| x | Access files and subdirectories | Execute |

The access modes of a directory or file consist of three sets of permissions:

User/Owner: Applies to the owner of the file

Group: Applies to users who are members of the group assigned to the file

Other: Applies to other users
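Only one of the three sets applies to a given request: the owner’s set if the user owns the file, otherwise the group’s set if the user belongs to the file’s group, otherwise the set for other users. The Python sketch below models that choice. It is a simplification with hypothetical names (it ignores the special privileges of the root user, for example), not a description of how the kernel is implemented.

```python
def applicable_permissions(modes, is_owner, in_group):
    """Pick the one permission set that applies to the requesting user."""
    if is_owner:
        return modes["owner"]
    if in_group:
        return modes["group"]
    return modes["other"]


def is_allowed(modes, permission, is_owner=False, in_group=False):
    """Return True if permission ('r', 'w', or 'x') is granted to the user."""
    if permission not in ("r", "w", "x"):
        raise ValueError("permission must be 'r', 'w', or 'x'")
    return permission in applicable_permissions(modes, is_owner, in_group)


# Hypothetical file: the owner may read and write, the group may read,
# and other users have no access.
modes = {"owner": "rw", "group": "r", "other": ""}

print(is_allowed(modes, "w", is_owner=True))  # True
print(is_allowed(modes, "x", is_owner=True))  # False: no execute permission
print(is_allowed(modes, "r", in_group=True))  # True
print(is_allowed(modes, "r"))                 # False: other users get nothing
```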

The ls command, which you’ll meet in Chapter 7, lists the file access modes in the second column of its long output format, as shown in Figure 4-2. The GNOME and KDE file managers use this same format. The column contains nine characters: the first three specify the access allowed the owner of the directory or file, the second three specify the access allowed users in the same group as the directory or file, and the final three specify the access allowed to other users (see Figure 4-3).

Figure 4-2. Access modes as shown by the ls command

Figure 4-3. Access modes specify three permissions
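Python’s standard stat module can produce the same kind of string from the numeric mode bits that the system stores for each file. The sketch below is only an illustration, and split_access_modes is a hypothetical helper. Note that stat.filemode returns ten characters: a leading file-type character (- for an ordinary file, d for a directory) followed by the nine access-mode characters described above.

```python
import stat


def split_access_modes(mode_bits):
    """Split the nine access-mode characters into owner, group, and other sets."""
    text = stat.filemode(mode_bits)  # e.g. '-rwxr-xr--'; text[0] is the file type
    return {"owner": text[1:4], "group": text[4:7], "other": text[7:10]}


# 0o100754 is an ordinary file whose owner has rwx, group r-x, and others r--.
print(stat.filemode(0o100754))       # -rwxr-xr--
print(split_access_modes(0o100754))  # {'owner': 'rwx', 'group': 'r-x', 'other': 'r--'}

# 0o040755 is a directory whose owner has rwx, and group and others r-x.
print(stat.filemode(0o040755))       # drwxr-xr-x
```

For a real file, the mode bits could be obtained with os.stat(path).st_mode.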

Mounting and Unmounting Filesystems

You cannot access a hard drive partition, CD-ROM, or floppy disk until the related device or partition is mounted. Mounting a device checks the status of the device and readies it for access. Linux can be configured to automatically mount a device or partition when it boots or when you launch a desktop environment. By default, GNOME and KDE automatically mount removable media devices such as CD-ROMs and floppy disks.

Before you can remove media from a device, you must unmount it. You can unmount a device by using a desktop environment or issuing a command. For your convenience, the system automatically unmounts devices when it shuts down. A device can be unmounted only if it’s not in use. For example, if a user’s current working directory is a directory of the device, the device cannot be unmounted. See Chapter 7 for more information on mounting and unmounting devices with the mount and umount commands.
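To see what is currently mounted, a program can read the file /proc/mounts, in which the Linux kernel lists one mounted filesystem per line: the device, the directory where it is mounted, the filesystem type, and the mount options. The Python sketch below parses text in that layout. The function names, the sample lines, and the /mnt/cdrom mount point are assumptions made for illustration, and a real system will show different entries.

```python
def parse_mounts(text):
    """Return (device, mount_point, fs_type) tuples from /proc/mounts-style text."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[0], fields[1], fields[2]))
    return mounts


def is_mounted(mounts, device):
    """Return True if the given device appears in the list of mounts."""
    return any(entry[0] == device for entry in mounts)


# Made-up sample in the same layout as /proc/mounts.
SAMPLE = (
    "/dev/hda1 / ext3 rw 0 0\n"
    "/dev/cdrom /mnt/cdrom iso9660 ro 0 0\n"
)

mounts = parse_mounts(SAMPLE)
print(mounts)
print(is_mounted(mounts, "/dev/cdrom"))  # True
print(is_mounted(mounts, "/dev/fd0"))    # False: the floppy is not mounted

# On a running Linux system, the real list could be read like this:
# with open("/proc/mounts") as f:
#     print(parse_mounts(f.read()))
```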
