Chapter 1. Background

This chapter provides a brief history of the development of the Unix system. Understanding where and how Unix developed and the intent behind its design will help you use the tools better. The chapter also introduces the guiding principles of the Software Tools philosophy, which are then demonstrated throughout the rest of the book.

Unix History

It is likely that you know something about the development of Unix, and many resources are available that provide the full story. Our intent here is to show how the environment that gave birth to Unix influenced the design of the various tools.

Unix was originally developed in the Computing Sciences Research Center at Bell Telephone Laboratories.[1] The first version was developed in 1970, shortly after Bell Labs withdrew from the Multics project. Many of the ideas that Unix popularized were initially pioneered within the Multics operating system; most notably the concepts of devices as files, and of having a command interpreter (or shell ) that was intentionally not integrated into the operating system. A well-written history may be found at http://www.bell-labs.com/history/unix.

Because Unix was developed within a research-oriented environment, there was no commercial pressure to produce or ship a finished product. This had several advantages:

  • The system was developed by its users. They used it to solve real day-to-day computing problems.

  • The researchers were free to experiment and to change programs as needed. Because the user base was small, if a program needed to be rewritten from scratch, that generally wasn't a problem. And because the users were the developers, they were free to fix problems as they were discovered and add enhancements as the need for them arose.

  • Unix itself went through multiple research versions, informally referred to with the letter "V" and a number: V6, V7, and so on. (The formal name followed the edition number of the published manual: First Edition, Second Edition, and so on. The correspondence between the names is direct: V6 = Sixth Edition, and V7 = Seventh Edition. Like most experienced Unix programmers, we use both nomenclatures.) The most influential Unix system was the Seventh Edition, released in 1979, although earlier ones had been available to educational institutions for several years. In particular, the Seventh Edition system introduced both awk and the Bourne shell, on which the POSIX shell is based. It was also at this time that the first published books about Unix started to appear.

  • The researchers at Bell Labs were all highly educated computer scientists. They designed the system for their personal use and the use of their colleagues, who also were computer scientists. This led to a "no nonsense" design approach; programs did what you told them to do, without being chatty and asking lots of "are you sure?" questions.

  • Besides just extending the state of the art, there existed a quest for elegance in design and problem solving. A lovely definition for elegance is "power cloaked in simplicity."[2] The freedom of the Bell Labs environment led to an elegant system, not just a functional one.

Of course, the same freedom had a few disadvantages that became clear as Unix spread beyond its development environment:

  • There were many inconsistencies among the utilities. For example, programs would use the same option letter to mean different things, or use different letters for the same task. Also, the regular-expression syntaxes used by different programs were similar, but not identical, leading to confusion that might otherwise have been avoided. (Had their ultimate importance been recognized, regular expression-matching facilities could have been encoded in a standard library.)

  • Many utilities had limitations, such as on the length of input lines, or on the number of open files, etc. (Modern systems generally have corrected these deficiencies.)

  • Sometimes programs weren't as thoroughly tested as they should have been, making it possible to accidentally kill them. This led to surprising and confusing "core dumps." Thankfully, modern Unix systems rarely suffer from this.

  • The system's documentation, while generally complete, was often terse and minimalistic. This made the system more difficult to learn than was really desirable.[3]

Most of what we present in this book centers around processing and manipulation of textual, not binary, data. This stems from the strong interest in text processing that existed during Unix's early growth, but is valuable for other reasons as well (which we discuss shortly). In fact, the first production use of a Unix system was doing text processing and formatting in the Bell Labs Patent Department.

The original Unix machines (Digital Equipment Corporation PDP-11s) weren't capable of running large programs. To accomplish a complex task, you had to break it down into smaller tasks and have a separate program for each smaller task. Certain common tasks (extracting fields from lines, making substitutions in text, etc.) were common to many larger projects, so they became standard tools. This was eventually recognized as being a good thing in its own right: the lack of a large address space led to smaller, simpler, more focused programs.

Many people were working semi-independently on Unix, reimplementing each other's programs. Between version differences and no need to standardize, a lot of the common tools diverged. For example, grep on one system used -i to mean "ignore case when searching," and it used -y on another variant to mean the same thing! This sort of thing happened with multiple utilities, not just a few. The common small utilities were named the same, but shell programs written for the utilities in one version of Unix probably wouldn't run unchanged on another.

Eventually the need for a common set of standardized tools and options became clear. The POSIX standards were the result. The current standard, IEEE Std. 1003.1-2004, encompasses both the C library level, and the shell language and system utilities and their options.

The good news is that the standardization effort paid off. Modern commercial Unix systems, as well as freely available workalikes such as GNU/Linux and BSD-derived systems, are all POSIX-compliant. This makes learning Unix easier, and makes it possible to write portable shell scripts. (However, do take note of Chapter 14.)

Interestingly enough, POSIX wasn't the only Unix standardization effort. In particular, an initially European group of computer manufacturers, named X/Open, produced its own set of standards. The most popular was XPG4 (X/Open Portability Guide, Fourth Edition), which first appeared in 1988. There was also an XPG5, more widely known as the UNIX 98 standard, or as the "Single UNIX Specification." XPG5 largely included POSIX as a subset, and was also quite influential.[4]

The XPG standards were perhaps less rigorous in their language, but covered a broader base, formally documenting a wider range of existing practice among Unix systems. (The goal for POSIX was to make a standard formal enough to be used as a guide to implementation from scratch, even on non-Unix platforms. As a result, many features common on Unix systems were initially excluded from the POSIX standards.) The 2001 POSIX standard does double duty as XPG6 by including the X/Open System Interface Extension (or XSI, for short). This is a formal extension to the base POSIX standard, which documents attributes that make a system not only POSIX-compliant, but also XSI-compliant. Thus, there is now only one formal standards document that implementors and application writers need refer to. (Not surprisingly, this is called the Single Unix Standard.)

Throughout this book, we focus on the shell language and Unix utilities as defined by the POSIX standard. Where it's important, we'll include features that are XSI-specific as well, since it is likely that you'll be able to use them too.



[1] The name has changed at least once since then. We use the informal name "Bell Labs" from now on.

[2] I first heard this definition from Dan Forsyth sometime in the 1980s.

[3] The manual had two components: the reference manual and the user's manual. The latter consisted of tutorial papers on major parts of the system. While it was possible to learn Unix by reading all the documentation, and many people (including the authors) did exactly that, today's systems no longer come with printed documentation of this nature.

[4] The list of X/Open publications is available at http://www.opengroup.org/publications/catalog/.

Get Classic Shell Scripting now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.