Managing Files

The tree structure of the Unix filesystem makes it easy to organize your files. After you make and edit some files, you may want to copy or move files from one directory to another, or rename files to distinguish different versions of a file. You may want to create new directories each time you start a different project. If you copy a file, it’s worth learning about the subtle sophistication of the cp and CpMac commands: if you copy a file to a directory, it automatically reuses the filename in the new location. This can save lots of typing!

A directory tree can get cluttered with old files you don’t need. If you don’t need a file or a directory, delete it to free storage space on the disk. The following sections explain how to make and remove directories and files.

Creating Directories with mkdir

It’s handy to group related files in the same directory. If you were writing a spy novel, you probably wouldn’t want your intriguing files mixed with restaurant listings. You could create two directories: one for all the chapters in your novel (spy, for example), and another for restaurants (boston.dine).

To create a new directory, use the mkdir program. The syntax is:

mkdir dirname(s)

dirname is the name of the new directory. To make several directories, put a space between each directory name. To continue our example, you would enter:

$ mkdir spy boston.dine
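As a minimal sketch of the command above, the following runs mkdir inside a throwaway directory so that nothing in your home directory changes. The scratch variable and the use of mktemp are assumptions made for illustration; they are not part of the original example.

```shell
# Work in a temporary directory (an assumption, to keep the demo harmless).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create both directories with a single command.
mkdir spy boston.dine

# -d lists the directories themselves rather than their (empty) contents.
ls -d spy boston.dine
```

Running ls afterward is a quick way to confirm that both names were created.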

Copying Files

If you’re about to edit a file, you may want to save a copy first. That makes it easy to get back the original version. You should use the cp program when copying plain files and directories containing only plain files. Other files having resource forks, such as applications, should be copied with CpMac (available only if you have installed Apple’s Xcode Tools).

cp

The cp program can put a copy of a file into the same directory or into another directory. cp doesn’t affect the original file, so it’s a good way to keep an identical backup of a file.

To copy a file, use the command:

cp old new

where old is a pathname to the original file and new is the pathname you want for the copy. For example, to copy the /etc/passwd file into a file called password in your working directory, you would enter:

$ cp /etc/passwd password
$

You can also use the form:

cp old olddir

This puts a copy of the original file old into an existing directory olddir. The copy will have the same filename as the original.
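The two forms of cp can be tried side by side in a scratch directory. This is only a sketch: the filenames report.txt, report.bak, and backups are hypothetical, and mktemp is assumed to be available.

```shell
# Set up a scratch area with one file and one existing directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'draft one\n' > report.txt
mkdir backups

# Form 1: give the copy a new name in the same directory.
cp report.txt report.bak

# Form 2: name only the directory; the copy keeps the name report.txt.
cp report.txt backups

ls backups
```

In both cases report.txt itself is left untouched.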

If there’s already a file with the same name as the copy, cp replaces the old file with your new copy. This is handy when you want to replace an old copy with a newer version, but it can cause trouble if you accidentally overwrite a copy you wanted to keep. To be safe, use ls to list the directory before you make a copy there.

Also, cp has an -i (interactive) option that asks you before overwriting an existing file. It works like this:

$ cp -i master existing-file.txt
overwrite existing-file.txt? no
$

You can copy more than one file at a time to a single directory by listing the pathname of each file you want copied, with the destination directory at the end of the command line. You can use relative or absolute pathnames (see Section 3.1 in Chapter 3) as well as simple filenames. For example, let’s say your working directory is /Users/carol (from the filesystem diagram in Figure 3-1). To copy three files called ch1.doc, ch2.doc, and ch3.doc from /Users/john to a subdirectory called Documents (that’s /Users/carol/Documents), enter:

$ cp ../john/ch1.doc ../john/ch2.doc ../john/ch3.doc  Documents

Or you could use wildcards and let the shell find all the appropriate files. This time, let’s add the -i option for safety:

$ cp -i ../john/ch[1-3].doc Documents
cp: overwrite Documents/ch2.doc ? n

There is already a file named ch2.doc in the Documents directory. When cp asks, answer n to prevent copying ch2.doc. Answering y would overwrite the old ch2.doc.

As you saw in Section 3.1.5.2 in Chapter 3, the shorthand form . refers to the working directory, and .. refers to its parent directory. For example, the following puts the copies into the working directory:

$ cp ../john/ch[1-3].doc .

One more possibility: when you’re working with home directories, you can use a convenient shorthand ~account to represent John and Carol’s home directory (and ~ by itself to represent your own). So here’s yet another way to copy those three files:

$ cp ~john/ch[1-3].doc Documents
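To see what a bracket pattern matches before trusting it with cp, you can hand the same pattern to echo first. The sketch below simulates the john and carol directories inside a scratch directory; that layout, and the extra ch4.doc file, are assumptions for illustration rather than real home directories.

```shell
# Build a stand-in for /Users/john and /Users/carol under a scratch directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/john" "$scratch/carol/Documents"
cd "$scratch/carol" || exit 1
for n in 1 2 3 4; do
    printf 'chapter %s\n' "$n" > "../john/ch$n.doc"
done

# Preview what the pattern matches before copying anything.
echo ../john/ch[1-3].doc

# Copy the matching files; ch4.doc is outside the range and stays behind.
cp ../john/ch[1-3].doc Documents
ls Documents
```

The shell, not cp, expands the pattern, so echo and cp receive exactly the same list of pathnames.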

cp can also copy entire directory trees. Use the option -R, for “recursive.” There are two arguments after the option: the pathname of the top-level directory from which you want to copy and the pathname of the place where you want the top level of the copy to be. As an example, let’s say that a new employee, Asha, has joined John and Carol. She needs a copy of John’s Documents/work directory in her own home directory. See the filesystem diagram in Figure 3-1. Her home directory is /Users/asha. If Asha’s own work directory doesn’t exist yet (important!), she could type the following commands:

$ cd /Users
$ cp -R john/Documents/work asha/work

Or, from her home directory, she could have typed cp -R ../john/Documents/work work. Either way, she’d now have a new subdirectory /Users/asha/work with a copy of all files and subdirectories from /Users/john/Documents/work.
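The reason the destination must not exist yet can be seen in a small experiment. This sketch rebuilds a miniature version of the tree in a scratch directory; the file names plan.txt and memo.txt are hypothetical.

```shell
# Build a small tree to copy.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p john/Documents/work/notes asha
printf 'plan\n' > john/Documents/work/plan.txt
printf 'memo\n' > john/Documents/work/notes/memo.txt

# asha/work does not exist yet, so it becomes the top of the copy.
cp -R john/Documents/work asha/work
find asha/work -print

# Run it again: asha/work now exists, so the copy lands one level
# deeper, in asha/work/work.
cp -R john/Documents/work asha/work
ls -d asha/work/work
```

If the second result is not what you wanted, that is the situation the warning in the text is meant to help you avoid.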

Note

If you give cp -R the wrong pathnames, it can copy a directory tree into itself—running forever until your filesystem fills up!

Problem checklist

The system says something like “cp: cannot copy file to itself”.

If the copy is in the same directory as the original, the filenames must be different.

The system says something like “cp: filename: no such file or directory”.

The system can’t find the file you want to copy. Check for a typing mistake. If a file isn’t in the working directory, be sure to use its pathname.

The system says something like “cp: permission denied”.

You may not have permission to copy a file created by someone else or to copy it into a directory that does not belong to you. Use ls -l to find the owner and the permissions for the file, or use ls -ld to check the directory. If you feel that you should be able to copy a file, ask the file’s owner or use sudo (see Section 3.3 in Chapter 3) to change its access modes.

Copying Mac files with resources

The cp program works on plain files and directories, but the Macintosh system stores applications with resource information. These attributes are known as resource forks, and are used extensively in Classic Mac OS applications and documents. (You will also find them in various places on the Mac OS X filesystem). If you’re a Mac OS 9 veteran, you’ll remember that the resources in the resource fork were only editable with ResEdit, and otherwise were hidden in the system. A file’s resource fork, if it exists, can be seen by looking at a special file called filename/rsrc. For example, Microsoft Word has a resource fork:

$ cd /Applications
$ ls -l Microsoft\ Word
-rwxrwxr-x  1 taylor  taylor  10568066  2 Jul 00:00 Microsoft Word
$ ls -l Microsoft\ Word/rsrc
-rwxrwxr-x  1 taylor  taylor  2781444  2 Jul 00:00 Microsoft Word/rsrc
$ cd Microsoft\ Word

The preceding listing should appear rather puzzling, actually. The file Microsoft Word isn’t a directory, yet there’s a file within as if it were a directory (rsrc). But you can’t cd into Microsoft Word to see the directory. Weird. Further, if you copy Microsoft Word with cp, it won’t copy the contents of the resource fork (in this example, /tmp is a directory used to hold temporary files):

$ cp Microsoft\ Word /tmp
$ ls -l /tmp/Microsoft\ Word
-rwxr-xr-x  1 bjepson  wheel  10568066 Nov 10 14:35 /tmp/Microsoft Word
$ ls -l /tmp/Microsoft\ Word/rsrc
-rwxr-xr-x  1 bjepson  wheel         0 Nov 10 14:35 /tmp/Microsoft Word/rsrc

A special version of cp is used to copy files with resource forks. The program, CpMac, is included with Apple’s Xcode Tools.

Tip

If you find yourself using CpMac or MvMac a lot, add /Developer/Tools to your PATH so you can simply type CpMac rather than the full path to the program. PATH is one of a set of environment variables that help the shell keep track of your particular session. Information on customizing your path is found in Section 1.3 in Chapter 1.

CpMac is found in /Developer/Tools. To copy Microsoft Word and its resources, invoke the following:

$ /Developer/Tools/CpMac Microsoft\ Word /tmp
$ ls -l /tmp/Microsoft\ Word
-rwxrwxrwx  1 bjepson  wheel  10568066 Nov 10 14:37 /tmp/Microsoft Word
$ ls -l /tmp/Microsoft\ Word/rsrc
-rwxrwxrwx  1 bjepson  wheel   2781434 Nov 10 14:37 /tmp/Microsoft Word/rsrc

Tip

In addition to resource forks, some files may include HFS metadata. A legacy of the earlier Mac OS, HFS metadata holds useful information about a file within the first several bytes of the file itself. The Mac OS X Finder will still make use of some of this data, including creator and type codes that, if a document doesn’t have a dot extension such as .mp3, dictate the file’s icon as well as which application should launch when you double-click it. A document file that loses this metadata might display only a generic icon, and the Finder wouldn’t know which application to launch it with.

Renaming and Moving Files with mv

To rename a file, use mv (move). The mv program can also move a file from one directory to another.

The mv command has the same syntax as the cp command:

mv old new

old is the old name of the file and new is the new name. mv will write over existing files, which is handy for updating old versions of a file. If you don’t want to overwrite an old file, be sure that the new name is unique. The Mac OS X version of mv has an -i option for safety:

$ mv chap1.doc intro.doc
$ mv -i chap2.doc intro.doc
mv: overwrite `intro.doc'? n
$

The previous example changed the file named chap1.doc to intro.doc, and then tried to do the same with chap2.doc (answering n cancelled the last operation). If you list your files with ls, you will see that the filename chap1.doc has disappeared, but chap2.doc and intro.doc are preserved.

The mv command can also move a file from one directory to another. As with the cp command, if you want to keep the same filename, you need only to give mv the name of the destination directory.
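Both uses of mv, renaming and moving, can be sketched in a scratch directory. The archive directory here is a hypothetical destination, and mktemp is assumed to be available.

```shell
# Set up one file and one destination directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'chapter one\n' > chap1.doc
mkdir archive

# Rename: chap1.doc disappears and intro.doc takes its place.
mv chap1.doc intro.doc

# Move: name only the directory and the file keeps its name.
mv intro.doc archive

ls archive
```

Unlike cp, mv leaves no copy behind under the old name.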

There’s also a MvMac command, analogous to the CpMac command explained earlier. Again, check by looking for a /rsrc resource file before moving and use MvMac if needed.

Finding Files

If your account has lots of files, organizing them into subdirectories can help you find the files later. Sometimes you may not remember which subdirectory has a file. The find program can search for files in many ways; we’ll look at two.

Change to your home directory so find will start its search there. Then carefully enter one of the following two find commands. (The syntax is strange and ugly—but find does the job!)

$ cd
$ find . -type f -name "chap*" -print
./chap2
./old/chap10b
$ find . -type f -mtime -2 -print
./work/to_do

The first command looks in your working directory (.) and all its subdirectories for files (-type f) whose names start with chap. (find understands wildcards in filenames. Be sure to put quotes around any filename pattern with a wildcard in it, as we did in the example.) The second command looks for all files that have been created or modified in the last two days (-mtime -2). The relative pathnames that find finds start with a dot (./), the name of the working directory, which you can ignore. Worth noting is that -print displays the results on the screen, not on your printer.
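If you would like to try the two find commands without searching your whole home directory, a small scratch tree is enough. The file names below mirror the listing above, plus a hypothetical file called notes that the first search should skip.

```shell
# Create a few empty files in a scratch tree.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir old work
touch chap2 old/chap10b work/to_do notes

# Quoting the pattern stops the shell from expanding it before find runs.
find . -type f -name "chap*" -print

# Every file here was just created, so all of them are newer than two days.
find . -type f -mtime -2 -print
```

The first command skips notes and to_do because their names do not start with chap; the second lists every file because all of them are new.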

Mac OS X also has the locate program to find files quickly. You can use locate to search part or all of a filesystem for a file with a certain name.

First, you need to build the database of filenames. Use the command:

$ sudo /usr/libexec/locate.updatedb

It takes a while for this to complete, as it searches through all your directories looking for files and recording their names. This database is automatically rebuilt weekly, but if you ever add a lot of files and want to add them to the database, rerun this command to rebuild the database with the new files.

Once you have the database, search it with the locate command. For instance, if you’re looking for a file named alpha-test, alphatest, or something like that, try this:

$ locate alpha
/Users/alan/Desktop/alpha3
/usr/local/projects/mega/alphatest

You’ll get the absolute pathnames of files and directories with alpha in their names. (If you get a lot of output, add a pipe to less. See Section 6.2.3 in Chapter 6.) locate may or may not list protected, private files; its listings usually also aren’t completely up to date. The fundamental difference between the two is that find walks the filesystem each time you run it and lets you search by file type, modification time, and much more, while locate simply searches a prebuilt list of all filenames on the system. To learn much more about find and locate, read their manpages or read the chapter about them in Mac OS X in a Nutshell (O’Reilly).

Removing Files and Directories

You may have finished work on a file or directory and see no need to keep it, or the contents may be obsolete. Periodically removing unwanted files and directories frees storage space.

rm

The rm program removes files. Unlike moving an item to the Trash, which lets you recover the item until you choose “Empty the Trash,” rm gives you no opportunity to recover it.

The syntax is simple:

rm filename(s)

rm removes the named files, as the following example shows:

$ ls
chap10       chap2       chap5    cold
chap1a.old   chap3.old   chap6    haha
chap1b       chap4       chap7    oldjunk
$ rm *.old chap10
$ ls
chap1b    chap4    chap6    cold    oldjunk
chap2     chap5    chap7    haha
$ rm c*
$ ls
haha    oldjunk
$

When you use wildcards with rm, be sure you’re deleting the right files! If you accidentally remove a file you need, you can’t recover it unless you have a copy in another directory or in your backups.

Note

Do not enter rm * carelessly. It deletes all the files in your working directory.

Here’s another easy mistake to make: you want to enter a command such as rm c* (remove all filenames starting with “c”), but instead enter rm c * (remove the file named c and all files!).

It’s good practice to list the files with ls before you remove them. Or, if you use rm’s -i (interactive) option, rm asks you whether you want to remove each file.
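One way to preview a wildcard, sketched here under the assumption that you are in a scratch directory with a few sample files, is to put echo in front of the rm command. The shell expands the pattern either way, but echo only prints the result.

```shell
# Create some sample files to practice on.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch chap1a.old chap3.old chap2 cold haha

# echo shows the expanded command line without deleting anything.
echo rm *.old

# Once the list looks right, run the real command.
rm *.old
ls
```

If the echoed list contains a file you want to keep, adjust the pattern before running rm for real.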

rmdir

Just as you can create new directories with mkdir, you can remove them with the rmdir program. As a precaution, rmdir won’t let you delete directories that contain any files or subdirectories; the directory must first be empty. (The rm -r command removes a directory and everything in it. It can be dangerous for beginners, though.)

The syntax is:

rmdir dirname(s)

If a directory you try to remove does contain files, you get a message like “rmdir: dirname not empty”.

To delete a directory that contains some files:

  1. Enter cd dirname to get into the directory you want to delete.

  2. Enter rm * to remove all files in that directory.

  3. Enter cd .. to go to the parent directory.

  4. Enter rmdir dirname to remove the unwanted directory.
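The four steps above can be sketched as a short session. The directory name oldjunk and its two files are hypothetical, and everything happens inside a scratch directory so that rm * cannot touch real files.

```shell
# Create a directory with two files in it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir oldjunk
touch oldjunk/a oldjunk/b

# rmdir refuses while files remain; hide its error and print our own note.
rmdir oldjunk 2>/dev/null || echo "oldjunk is not empty yet"

cd oldjunk || exit 1   # step 1: enter the directory
rm *                   # step 2: remove the files in it
cd ..                  # step 3: return to the parent
rmdir oldjunk          # step 4: remove the now-empty directory
```

Checking that cd succeeded before running rm * is a cautious habit, since rm * acts on whatever directory you happen to be in.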

Problem checklist

I still get the message “dirname not empty” even after I’ve deleted all the files.

Use ls -a to check that there are no hidden files (names that start with a period) other than . and .. (the working directory and its parent). The following command is good for cleaning up hidden files (which aren’t matched by a simple wildcard such as *). It matches all hidden files except for . (the current directory) and .. (the parent directory):

$ rm -i .[^.]*

Working with Links

If you’ve used the Mac for a while, you’re familiar with aliases, empty files that point to other files on the system. A common use of aliases is to have a copy of an application on the desktop, or to have a shortcut in your home directory. Within the graphical environment, you make aliases by using Control-Click and then choosing Make Alias from the context menu. The result of an alias, in Unix, looks like this:

$ ls -l *3*
-rw-r--r--  1 taylor  taylor  1546099 23 Sep 20:58 fig0403.pdf
-rw-r--r--  1 taylor  taylor        0 24 Sep 08:34 fig0403.pdf alias

In this case, the file fig0403.pdf alias is an Aqua alias pointing to the actual file fig0403.pdf in the same directory. But you wouldn’t know it because it appears to be an empty file: the size is shown as zero bytes.

Unix works with aliases differently; on the Unix side, we talk about links, not aliases. There are two types of links possible in Unix, hard links or symbolic links, and both are created with the ln command.

The syntax is:

ln [-s] source target

The -s flag indicates that you’re creating a symbolic link, so to create a second file that links to the file fig0403.pdf, the command would be:

$ ln -s fig0403.pdf neato-pic.pdf

and the results would be:

$ ls -l *pdf
-rw-r--r--  1 taylor  taylor  1532749 23 Sep 20:47 fig0401.pdf
-rw-r--r--  1 taylor  taylor  1539493 23 Sep 20:52 fig0402.pdf
-rw-r--r--  1 taylor  taylor  1546099 23 Sep 20:58 fig0403.pdf
lrwxr-xr-x  1 taylor  taylor       18 24 Sep 08:40 neato-pic.pdf@ -> fig0403.pdf

One way to think about symbolic links is that they’re akin to a Stickies note saying “the info you want isn’t here, it’s in file X.” This also implies a peculiar behavior of symbolic links (and Aqua aliases): move, rename, or remove the item being pointed to and you have an orphan link. The system doesn’t remove or update symbolic links automatically.
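The orphan-link behavior is easy to reproduce. In this sketch the file contents and the new name fig0404.pdf are made up for illustration; the test operators -L (is a symbolic link) and -e (exists, following links) do the checking.

```shell
# Create a file and a symbolic link that points to it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'figure data\n' > fig0403.pdf
ln -s fig0403.pdf neato-pic.pdf

# The link resolves while its target exists.
cat neato-pic.pdf

# Rename the target; the link still names fig0403.pdf and is now orphaned.
mv fig0403.pdf fig0404.pdf
if [ -L neato-pic.pdf ] && [ ! -e neato-pic.pdf ]; then
    echo "neato-pic.pdf is an orphan link"
fi
```

The link itself is still there after the rename; only the name it points to has gone away.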

The other type of link is a hard link, which essentially creates a second name entry for the exact same contents. That is, if we create a hard link to fig0403.pdf, we can then delete the original file, and the contents remain accessible through the second filename — they’re different doors into the same room (as opposed to a Sticky left on a door telling you to go to the second door instead, as would be the case with a symbolic link). Hard links are created by omitting the -s flag:

$ ln mypic.pdf copy2.pdf
$ ls -l mypic.pdf copy2.pdf 
-rw-r--r--  2 taylor  taylor  1546099 24 Sep 08:45 copy2.pdf
-rw-r--r--  2 taylor  taylor  1546099 24 Sep 08:45 mypic.pdf
$ rm mypic.pdf
$ ls -l copy2.pdf 
-rw-r--r--  1 taylor  taylor  1546099 24 Sep 08:45 copy2.pdf

Notice that both files are exactly the same size when the hard link is created. This makes sense because they’re both names to the same underlying set of data, so they should be completely identical. Then, when the original is deleted, the data survives with the second name now as its only name.
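Another way to confirm that two names share the same data is to compare their inode numbers with ls -i. The following is a sketch in a scratch directory with made-up file contents.

```shell
# Create a file and a second hard-linked name for it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'picture data\n' > mypic.pdf
ln mypic.pdf copy2.pdf

# -i prints the inode number; both names report the same one.
ls -i mypic.pdf copy2.pdf

# Removing one name leaves the data reachable through the other.
rm mypic.pdf
cat copy2.pdf
```

Matching inode numbers are what distinguish a hard link from an ordinary copy, which would get an inode of its own.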

Compressing and Archiving Files

Aqua users may commonly use StuffIt’s .sit and .hqx formats for file archives, but Unix users have many other options worth exploring. There are three compression programs included with Mac OS X, though the most popular is gzip (the others are compress and bzip2; read their manpages to learn more about how they differ). There’s also a very common Unix archive format called tar that we’ll cover briefly.

gzip

Though it may initially confuse you into thinking that it’s part of the Zip archive toolset, gzip is actually a compression program that does a very good job of shrinking down individual files for storage and transmission. If you’re sending a file to someone with a dial-up connection, for example, running the file through gzip can significantly reduce its size and make it much more portable. Just as importantly, it can help save space on your disk by letting you compress files you want to keep, but aren’t using currently. gzip works particularly well with tar too, as you’ll see.

The syntax is:

gzip [-v] file(s)

The -v flag offers verbose output, letting the program indicate how much space it saved by compressing the file. Very useful information, as you may expect!

$ ls -l ch06.doc 
-rwxr-xr-x  1 taylor  taylor  138240 24 Sep 08:52 ch06.doc
$ gzip -v ch06.doc
ch06.doc:                75.2% -- replaced with ch06.doc.gz
$ ls -l ch06.doc.gz 
-rwxr-xr-x  1 taylor  taylor  34206 24 Sep 08:52 ch06.doc.gz

You can see that gzip did a great job compressing the file, saving over 75%. Notice that it’s automatically appended a .gz filename suffix to indicate that the file is now compressed. To uncompress the file, just use gunzip:

$ gunzip ch06.doc.gz 
$ ls -l ch06.doc 
-rwxr-xr-x  1 taylor  taylor  138240 24 Sep 08:52 ch06.doc
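To convince yourself that compression loses nothing, you can compare a file before and after a round trip through gzip and gunzip. This sketch generates its own repetitive sample file, ch06.txt, which is an assumption for illustration; real documents will compress by different amounts.

```shell
# Build a repetitive sample file, which compresses well.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
i=0
while [ "$i" -lt 500 ]; do
    echo "All work and no play makes a dull chapter."
    i=$((i + 1))
done > ch06.txt
cp ch06.txt ch06.orig

gzip ch06.txt        # replaces ch06.txt with ch06.txt.gz
ls -l ch06.txt.gz
gunzip ch06.txt.gz   # restores ch06.txt and removes the .gz file

# cmp -s is silent and succeeds only if the two files are identical.
cmp -s ch06.txt ch06.orig && echo "round trip preserved the file"
```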

tar

In the old days, Unix system backups were done to streaming tape devices (today you can only see them in cheesy 60s scifi films, the huge round tape units that randomly spin as data is accessed). The tool of choice for creating backups from Unix systems onto these streaming tape devices was tar, the tape archiver. Fast forward to Mac OS X, and tar continues to be a useful utility, but now it’s used to create files that contain directories and other files within, as an archive. It’s similar to the Zip format, but differs from gzip because its job is to create a file that contains multiple files. gzip, by contrast, makes an existing file shrink as much as possible through compression.

The tar program is particularly helpful when combined with gzip, actually, because it makes creating archive copies of directories simple and effective. Even better, if you use the -z flag to tar, it automatically invokes gzip to compress its output without any further work.

The syntax is:

tar [c|t|x] [flags] files and directories to archive

The tar program is too complex to fully explain here, but in a nutshell, tar -c creates archives, tar -t shows what’s in an existing archive, and tar -x extracts files and directories from an archive. The -f file flag is used to specify the archive name, and the -v flag offers verbose output to let you see what’s going on. As always, man tar will produce lots more information.

$ du -s Masters\ Thesis/
6704    Masters Thesis/
$ tar -czvf masters.thesis.tgz Masters\ Thesis
Masters Thesis/
Masters Thesis/.DS_Store
Masters Thesis/analysis.doc
...
Masters Thesis/Web Survey Results.doc
Masters Thesis/web usage by section.doc
$ ls -l masters.thesis.tgz 
-rw-r--r--  1 taylor  staff  853574 24 Sep 09:20 masters.thesis.tgz

In this example, the directory Masters Thesis is 6.7 MB in size, and hasn’t been accessed in quite a while. This makes it a perfect candidate for a compressed tar archive. We create one by combining the -c (create), -z (compress with gzip), -v (verbose), and -f file (output file) flags; notice that we gave the archive a .tgz suffix to avoid later confusion about the file type. In under 10 seconds, a new archive file is created, which is less than 1 MB in size, yet contains all the files and directories in the original directory. To unpack the archive, we’d use tar -xzvf masters.thesis.tgz, keeping f as the last flag so that the archive name follows it directly.

Tip

Notice that we gave tar the directory name, rather than a list of files. This ensures that when the directory is unpacked, the files will be put in a new directory (Masters Thesis), rather than filling the current directory. This is a good habit for people who make lots of archives.
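The create, list, and extract modes can be sketched together in one short session. The sample file and the unpacked directory are hypothetical, and the -C option (change to a directory before extracting) is assumed to be supported by your tar, as it is in the common GNU and BSD versions.

```shell
# Create a small directory to archive and a place to unpack it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir "Masters Thesis" unpacked
printf 'analysis\n' > "Masters Thesis/analysis.doc"

# c = create, z = gzip, f = archive name (f comes last so the name follows it).
tar -czf masters.thesis.tgz "Masters Thesis"

# t = list the contents without extracting anything.
tar -tzf masters.thesis.tgz

# x = extract; -C unpacks into another directory instead of the current one.
tar -xzf masters.thesis.tgz -C unpacked
ls unpacked
```

Because the archive was made from the directory name, unpacking recreates Masters Thesis as a single directory inside unpacked.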

Files on Other Operating Systems

Chapter 8 includes Section 8.2, which explains ways to transfer files across a network—possibly to non-Unix operating systems. Mac OS X has the capability of connecting to a variety of different filesystems remotely, including Microsoft Windows, other Unix systems, and even web-based filesystems.

If the Windows-format filesystem is mounted with your other filesystems, you’ll be able to use its files by typing a Unix-like pathname. If you’ve mounted a remote Windows system’s C: drive over a share named winc, you can access the Windows file C:\WORD\REPORT.DOC through the pathname /Volumes/winc/word/report.doc. Indeed, most external volumes are automatically mounted within the /Volumes directory.
