Chapter 1. Introduction to Networking

History

The idea of networking is probably as old as telecommunications itself. Consider people living in the Stone Age, when drums may have been used to transmit messages between individuals. Suppose caveman A wants to invite caveman B over for a game of hurling rocks at each other, but they live too far apart for B to hear A banging his drum. What are A’s options? He could 1) walk over to B’s place, 2) get a bigger drum, or 3) ask C, who lives halfway between them, to forward the message. The last option is called networking.

Of course, we have come a long way from the primitive pursuits and devices of our forebears. Nowadays, we have computers talk to each other over vast assemblages of wires, fiber optics, microwaves, and the like, to make an appointment for Saturday’s soccer match.[1] In the following description, we will deal with the means and ways by which this is accomplished, but leave out the wires, as well as the soccer part.

We define a network as a collection of hosts that are able to communicate with each other, often by relying on the services of a number of dedicated hosts that relay data between the participants. Hosts are often computers, but need not be; one can also think of X terminals or intelligent printers as hosts. A collection of hosts is also called a site.

Communication is impossible without some sort of language or code. In computer networks, these languages are collectively referred to as protocols. However, you shouldn’t think of written protocols here, but rather of the highly formalized code of behavior observed when heads of state meet, for instance. In a very similar fashion, the protocols used in computer networks are nothing but very strict rules for the exchange of messages between two or more hosts.

TCP/IP Networks

Modern networking applications require a sophisticated approach to carry data from one machine to another. If you are managing a Linux machine that has many users, each of whom may wish to simultaneously connect to remote hosts on a network, you need a way of allowing them to share your network connection without interfering with each other. The approach that a large number of modern networking protocols use is called packet switching. A packet is a small chunk of data that is transferred from one machine to another across the network. The switching occurs as the datagram is carried across each link in the network. A packet-switched network shares a single network link among many users by alternately sending packets from one user to another across that link.

The solution that Unix systems, and subsequently many non-Unix systems, have adopted is known as TCP/IP. When learning about TCP/IP networks, you will hear the term datagram, which technically has a special meaning but is often used interchangeably with packet. In this section, we will have a look at underlying concepts of the TCP/IP protocols.

Introduction to TCP/IP Networks

TCP/IP traces its origins to a research project funded by the United States Defense Advanced Research Projects Agency (DARPA) in 1969. The ARPANET was an experimental network that was converted into an operational one in 1975 after it had proven to be a success.

In 1983, the new protocol suite TCP/IP was adopted as a standard, and all hosts on the network were required to use it. When ARPANET finally grew into the Internet (with ARPANET itself passing out of existence in 1990), the use of TCP/IP had spread to networks beyond the Internet itself. Many companies have now built corporate TCP/IP networks, and the Internet has become a mainstream consumer technology. It is difficult to read a newspaper or magazine now without seeing references to the Internet; almost everyone can use it now.

For something concrete to look at as we discuss TCP/IP throughout the following sections, we will consider Groucho Marx University (GMU), situated somewhere in Freedonia, as an example. Most departments run their own Local Area Networks, while some share one and others run several of them. They are all interconnected and hooked to the Internet through a single high-speed link.

Suppose your Linux box is connected to a LAN of Unix hosts at the mathematics department, and its name is erdos. To access a host at the physics department, say quark, you enter the following command:

$ ssh quark.school.edu

Enter password:
Last login: Wed Dec  3 18:21:25 2003 from 10.10.0.1
quark$

At the prompt, you enter your password. You are then given a shell[2] on quark, to which you can type as if you were sitting at the system’s console. After you exit the shell, you are returned to your own machine’s prompt. You have just used one of the instantaneous, interactive applications that use TCP/IP: secure shell.

While logged into quark, you might also want to run a graphical user interface application, like a word processing program, a graphics drawing program, or even a World Wide Web browser. The X Window System is a fully network-aware graphical user environment, and it is available for many different computing systems. To tell such an application that you want to have its windows displayed on your host’s screen, you will need to make sure that your SSH server and client are capable of tunneling X. To do this, you can check the sshd_config file on the system, which should contain a line like this:

X11Forwarding yes

If you now start your application, SSH will tunnel your X Window System traffic so that the application’s windows are displayed on your X server instead of quark’s. Of course, this requires that you have X11 running on erdos. The point here is that TCP/IP allows quark and erdos to send X11 packets back and forth to give you the illusion that you’re on a single system. The network is almost transparent here.

Of course, these are only examples of what you can do with TCP/IP networks. The possibilities are almost limitless, and we’ll introduce you to more as you read on through the book.

We will now have a closer look at the way TCP/IP works. This information will help you understand how and why you have to configure your machine. We will start by examining the hardware and slowly work our way up.

Ethernets

The most common type of LAN hardware is known as Ethernet. In its simplest form, it consists of a single cable with hosts attached to it through connectors, taps, or transceivers. Simple Ethernets are relatively inexpensive to install, which, together with net transfer rates of 10, 100, 1,000, and now even 10,000 megabits per second (Mbps), accounts for much of its popularity.

Ethernets come in many flavors: thick, thin, and twisted pair. Older Ethernet types, such as thin and thick Ethernet, are rarely in use today; each uses a coaxial cable, and they differ in diameter and in the way you attach a host to the cable. Thin Ethernet uses a T-shaped “BNC” connector, which you insert into the cable and twist onto a plug on the back of your computer. Thick Ethernet requires that you drill a small hole into the cable and attach a transceiver using a “vampire tap.” One or more hosts can then be connected to the transceiver. Thin and thick Ethernet cable can run for a maximum of 200 and 500 meters, respectively, and are also called 10base2 and 10base5. The “base” refers to “baseband modulation” and simply means that the data is directly fed onto the cable without any modem. The number at the start refers to the speed in megabits per second, and the number at the end to the maximum cable length in hundreds of meters. Twisted pair uses a cable made of two pairs of copper wires and usually requires additional hardware known as active hubs. Twisted pair is also known as 10baseT, the “T” meaning twisted pair. The 100 Mbps version is known as 100baseT, and not surprisingly, 1000 Mbps is called 1000baseT or gigabit.

To add a host to a thin Ethernet installation, you have to disrupt network service for at least a few minutes because you have to cut the cable to insert the connector. Although adding a host to a thick Ethernet system is a little complicated, it does not typically bring down the network. Twisted pair Ethernet is even simpler. It uses a device called a hub or switch that serves as an interconnection point. You can insert and remove hosts from a hub or switch without interrupting any other users at all.

Thick and thin Ethernet deployments are now somewhat difficult to find because they have been mostly replaced by twisted pair deployments. Twisted pair likely became the standard because of its cheap networking cards and cables, not to mention that it’s almost impossible to find an old BNC connector on a modern laptop machine.

Wireless LANs are also very popular. These are based on the 802.11a/b/g specifications and provide Ethernet over radio transmission. Offering similar functionality to its wired counterpart, wireless Ethernet has been subject to a number of security issues, mainly surrounding encryption. However, advances in the protocol specification combined with different encryption keying methods are quickly helping to alleviate some of the more serious security concerns. Wireless networking for Linux is discussed in detail in Chapter 18.

Ethernet works like a bus system, where a host may send packets (or frames) of up to 1,500 bytes to another host on the same Ethernet. A host is addressed by a 6-byte address hardcoded into the firmware of its Ethernet network interface card (NIC). These addresses are usually written as a sequence of two-digit hex numbers separated by colons, as in aa:bb:cc:dd:ee:ff.
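As a small illustration of this notation (a sketch, not code from the original text), the six raw address bytes map to the colon-separated form like so:

```python
# Format a 6-byte Ethernet (MAC) address as colon-separated hex pairs.
def format_mac(address_bytes):
    if len(address_bytes) != 6:
        raise ValueError("Ethernet addresses are exactly 6 bytes long")
    return ":".join("%02x" % b for b in address_bytes)

print(format_mac(bytes([0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF])))  # aa:bb:cc:dd:ee:ff
```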

A frame sent by one station is seen by all attached stations, but only the destination host actually picks it up and processes it. If two stations try to send at the same time, a collision occurs. Collisions on an Ethernet are detected very quickly by the electronics of the interface cards and are resolved by the two stations aborting the send, each waiting a random interval and re-attempting the transmission. You’ll hear lots of stories about collisions on Ethernet being a problem and that utilization of Ethernets is only about 30 percent of the available bandwidth because of them. Collisions on Ethernet are a normal phenomenon, and on a very busy Ethernet network you shouldn’t be surprised to see collision rates of up to about 30 percent. Ethernet networks need to be more realistically limited to about 60 percent before you need to start worrying about it.[3]
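The “random interval” each station waits is chosen by what the Ethernet standard calls truncated binary exponential backoff: after the nth consecutive collision, a station waits a random number of slot times between 0 and 2^min(n,10) − 1, so the window doubles with each retry up to a cap. A sketch of that rule (assumed from the standard algorithm, not from this text):

```python
import random

def backoff_slots(collision_count):
    """Pick a random delay, in slot times, after the nth consecutive
    collision, following truncated binary exponential backoff."""
    exponent = min(collision_count, 10)        # cap the window at 2**10 slots
    return random.randint(0, 2 ** exponent - 1)

# After the first collision the wait is 0 or 1 slots; after many
# collisions it can be anything up to 1023 slots.
```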

Other Types of Hardware

In larger installations, or in legacy corporate environments, Ethernet is usually not the only type of equipment used. There are many other data communications protocols available and in use. All of the protocols described below are supported by Linux, but due to space constraints we’ll describe them only briefly. Many of the protocols have HOWTO documents that describe them in detail, so you should refer to those if you’re interested in exploring those that we don’t describe in this book.

One older and quickly disappearing technology is IBM’s Token Ring network. Token Ring is used as an alternative to Ethernet in some LAN environments, and runs at lower speeds (4 Mbps or 16 Mbps). In Linux, Token Ring networking is configured in almost precisely the same way as Ethernet, so we don’t cover it specifically.

Many national networks operated by telecommunications companies support packet-switching protocols. Previously, the most popular of these was a standard named X.25. It defines a set of networking protocols that describes how data terminal equipment, such as a host, communicates with data communications equipment (an X.25 switch). X.25 requires a synchronous data link and therefore special synchronous serial port hardware. It is possible to use X.25 with normal serial ports if you use a special device called a Packet Assembler Disassembler (PAD). The PAD is a standalone device that provides asynchronous serial ports and a synchronous serial port. It manages the X.25 protocol so that simple terminal devices can make and accept X.25 connections. X.25 is often used to carry other network protocols, such as TCP/IP. Since IP datagrams cannot simply be mapped onto X.25 (or vice versa), they are encapsulated in X.25 packets and sent over the network. There is an implementation of the X.25 protocol available for Linux, but it will not be discussed in depth here.

A protocol commonly used by telecommunications companies is called Frame Relay. The Frame Relay protocol shares a number of technical features with the X.25 protocol, but is much more like the IP protocol in behavior. Like X.25, Frame Relay requires special synchronous serial hardware. Because of their similarities, many cards support both of these protocols. An alternative is available that requires no special internal hardware, again relying on an external device called a Frame Relay Access Device (FRAD) to manage the encapsulation of Ethernet packets into Frame Relay packets for transmission across a network. Frame Relay is ideal for carrying TCP/IP between sites. Linux provides drivers that support some types of internal Frame Relay devices.

If you need higher-speed networking that can carry many different types of data, such as digitized voice and video, alongside your usual data, Asynchronous Transfer Mode (ATM) is probably what you’ll be interested in. ATM is a new network technology that has been specifically designed to provide a manageable, high-speed, low-latency means of carrying data, and control over the Quality of Service (QoS). Many telecommunications companies are deploying ATM network infrastructure because it allows the convergence of a number of different network services into one platform, in the hope of achieving savings in management and support costs. ATM is often used to carry TCP/IP. The Networking HOWTO offers information on the Linux support available for ATM.

Frequently, radio amateurs use their radio equipment to network their computers; this is commonly called packet radio. One of the protocols used by amateur radio operators is called AX.25 and is loosely derived from X.25. Amateur radio operators use the AX.25 protocol to carry TCP/IP and other protocols, too. AX.25, like X.25, requires serial hardware capable of synchronous operation, or an external device called a Terminal Node Controller to convert packets transmitted via an asynchronous serial link into packets transmitted synchronously. There are a variety of different sorts of interface cards available to support packet radio operation; these cards are generally referred to as being “Z8530 SCC based,” named after the most popular type of communications controller used in the designs. Two of the other protocols that are commonly carried by AX.25 are the NetRom and Rose protocols, which are network layer protocols. Since these protocols run over AX.25, they have the same hardware requirements. Linux supports a fully featured implementation of the AX.25, NetRom, and Rose protocols. The AX25 HOWTO is a good source of information on the Linux implementation of these protocols.

Other types of Internet access involve dialing up a central system over slow but cheap serial lines (telephone, ISDN, and so on). These require yet another protocol for transmission of packets, such as SLIP or PPP, which will be described later.

The Internet Protocol

Of course, you wouldn’t want your networking to be limited to one Ethernet or one point-to-point data link. Ideally, you would want to be able to communicate with a host computer regardless of what type of physical network it is connected to. For example, in larger installations such as Groucho Marx University, you usually have a number of separate networks that have to be connected in some way. At GMU, the math department runs two Ethernets: one with fast machines for professors and graduates, and another with slow machines for students.

Interconnecting these networks is the job of a dedicated host called a gateway, which handles incoming and outgoing packets by copying them between the two Ethernets and the campus FDDI fiber optic backbone. For example, if you are at the math department and want to access quark on the physics department’s LAN from your Linux box, the networking software will not send packets to quark directly because it is not on the same Ethernet. Therefore, it has to rely on the gateway to act as a forwarder. The gateway (named sophus) then forwards these packets to its peer gateway niels at the physics department, using the backbone network, with niels delivering it to the destination machine. Data flow between erdos and quark is shown in Figure 1-1.

The three steps of sending a datagram from erdos to quark
Figure 1-1. The three steps of sending a datagram from erdos to quark

This scheme of directing data to a remote host is called routing, and packets are often referred to as datagrams in this context. To facilitate things, datagram exchange is governed by a single protocol that is independent of the hardware used: IP, or Internet Protocol. In Chapter 2, we will cover IP and the issues of routing in greater detail.

The main benefit of IP is that it turns physically dissimilar networks into one apparently homogeneous network. This is called internetworking, and the resulting “meta-network” is called an internet. Note the subtle difference here between an internet and the Internet. The latter is the official name of one particular global internet.

Of course, IP also requires a hardware-independent addressing scheme. This is achieved by assigning each host a unique 32-bit number called the IP address. An IP address is usually written as four decimal numbers, one for each 8-bit portion, separated by dots. For example, quark might have an IP address of 0x954C0C04, which would be written as 149.76.12.4. This format is also called dotted decimal notation and sometimes dotted quad notation. It is increasingly going under the name IPv4 (for Internet Protocol, Version 4) because a new standard called IPv6 offers much more flexible addressing, as well as other modern features. It will be at least a year after the release of this edition before IPv6 is in use.
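The conversion from the 32-bit value to dotted decimal notation is purely mechanical: each of the four bytes is printed as a decimal number. A small sketch:

```python
def dotted_quad(address):
    """Render a 32-bit IPv4 address as dotted decimal notation."""
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(dotted_quad(0x954C0C04))  # 149.76.12.4
```

Reading the hex value byte by byte: 0x95 is 149, 0x4C is 76, 0x0C is 12, and 0x04 is 4.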

You will notice that we now have three different types of addresses: first there is the host’s name, like quark, then there is an IP address, and finally, there is a hardware address, such as the 6-byte Ethernet address. All these addresses somehow have to match so that when you type ssh quark, the networking software can be given quark’s IP address; and when IP delivers any data to the physics department’s Ethernet, it somehow has to find out what Ethernet address corresponds to the IP address.

We will deal with these situations in Chapter 2. For now, it’s enough to remember that these steps of finding addresses are called hostname resolution, for mapping hostnames onto IP addresses, and address resolution, for mapping the latter to hardware addresses.
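Hostname resolution is available to programs through the system resolver library; a minimal sketch via Python’s wrapper around it (assuming localhost is defined in /etc/hosts, as it is on virtually every system):

```python
import socket

# Hostname resolution: map a hostname onto an IP address
# using the system's resolver.
address = socket.gethostbyname("localhost")
print(address)
```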

IP over Serial Lines

On serial lines, a “de facto” standard exists known as Serial Line IP (SLIP). A modification of SLIP, known as Compressed SLIP (CSLIP), performs compression of IP headers to make better use of the relatively low bandwidth provided by most serial links. Another serial protocol is Point-to-Point Protocol (PPP). PPP is more modern than SLIP and includes a number of features that make it more attractive. Its main advantage over SLIP is that it isn’t limited to transporting IP datagrams, but is designed to allow just about any protocol to be carried across it. This book discusses PPP in Chapter 6.

The Transmission Control Protocol

Sending datagrams from one host to another is not the whole story. If you log in to quark, you want to have a reliable connection between your ssh process on erdos and the shell process on quark. Thus, the information sent to and fro must be split into packets by the sender and reassembled into a character stream by the receiver. Trivial as it seems, this involves a number of complicated tasks.

A very important thing to know about IP is that, by intent, it is not reliable. Assume that 10 people on your Ethernet started downloading the latest release of the Mozilla web browser source code from GMU’s FTP server. The amount of traffic generated might be too much for the gateway to handle because it’s too slow and it’s tight on memory. Now if you happen to send a packet to quark, sophus might be out of buffer space for a moment and therefore unable to forward it. IP solves this problem by simply discarding it. The packet is irrevocably lost. It is therefore the responsibility of the communicating hosts to check the integrity and completeness of the data and retransmit it in case of error.

This process is performed by yet another protocol, Transmission Control Protocol (TCP), which builds a reliable service on top of IP. The essential property of TCP is that it uses IP to give you the illusion of a simple connection between the two processes on your host and the remote machine so that you don’t have to care about how and along which route your data actually travels. A TCP connection works essentially like a two-way pipe that both processes may write to and read from. Think of it as a telephone conversation.

TCP identifies the end points of such a connection by the IP addresses of the two hosts involved and the number of a port on each host. Ports may be viewed as attachment points for network connections. If we are to strain the telephone example a little more, and you imagine that cities are like hosts, one might compare IP addresses to area codes (where numbers map to cities), and port numbers to local codes (where numbers map to individual people’s telephones). An individual host may support many different services, each distinguished by its own port number.

In the ssh example, the client application (ssh) opens a port on erdos and connects to port 22 on quark, to which the sshd server is known to listen. This action establishes a TCP connection. Using this connection, sshd performs the authorization procedure and then spawns the shell. The shell’s standard input and output are redirected to the TCP connection so that anything you type to ssh on your machine will be passed through the TCP stream and be given to the shell as standard input.
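The same sequence of socket operations can be sketched in miniature (here in Python, on the loopback interface with a kernel-chosen free port instead of sshd’s port 22, and a toy greeting instead of a shell):

```python
import socket
import threading

# Server side: bind to a port, listen, accept one connection, send one line.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the kernel pick a free port
server.listen(1)
port = server.getsockname()[1]

def serve():
    conn, _addr = server.accept()
    conn.sendall(b"hello from the server\n")
    conn.close()

threading.Thread(target=serve).start()

# Client side: connect to the server's port and read its greeting.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
greeting = client.recv(1024)
client.close()
server.close()
print(greeting.decode().strip())
```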

The User Datagram Protocol

Of course, TCP isn’t the only user protocol in TCP/IP networking. Although suitable for applications like ssh, the overhead involved is prohibitive for applications like NFS, which instead uses a sibling protocol of TCP called User Datagram Protocol (UDP). Just like TCP, UDP allows an application to contact a service on a certain port of the remote machine, but it doesn’t establish a connection for this. Instead, you use it to send single packets to the destination service—hence its name.

Assume that you want to request a small amount of data from a database server. It takes at least three datagrams to establish a TCP connection, another three to send and confirm a small amount of data each way, and another three to close the connection. UDP provides us with a means of using only two datagrams to achieve almost the same result. UDP is said to be connectionless, and it doesn’t require us to establish and close a session. We simply put our data into a datagram and send it to the server; the server formulates its reply, puts the data into a datagram addressed back to us, and transmits it back. While this is both faster and more efficient than TCP for simple transactions, UDP was not designed to deal with datagram loss. It is up to the application, a nameserver, for example, to take care of this.
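The two-datagram exchange described above can be sketched the same way: no connection is established, and each datagram is addressed individually (a hypothetical toy query service, not from the text):

```python
import socket

# "Server" socket: bound to a kernel-chosen port; no listen/accept needed.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server_port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Datagram 1: the request.
client.sendto(b"time?", ("127.0.0.1", server_port))
request, client_addr = server.recvfrom(1024)

# Datagram 2: the reply, addressed straight back to the sender.
server.sendto(b"12:00", client_addr)
reply, _ = client.recvfrom(1024)

client.close()
server.close()
print(reply.decode())
```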

More on Ports

Ports may be viewed as attachment points for network connections. If an application wants to offer a certain service, it attaches itself to a port and waits for clients (this is also called listening on the port). A client who wants to use this service allocates a port on its local host and connects to the server’s port on the remote host. The same port may be open on many different machines, but on each machine only one process can open a port at any one time.

An important property of ports is that once a connection has been established between the client and the server, another copy of the server may attach to the server port and listen for more clients. This property permits, for instance, several concurrent remote logins to the same host, all using the same port 22. TCP is able to tell these connections from one another because they all come from different ports or hosts. For example, if you log in twice to quark from erdos, the first ssh client may use the local port 6464, and the second one could use port 4235. Both, however, will connect to the same port 22 on quark. The two connections will be distinguished by use of the port numbers used at erdos.

This example shows the use of ports as rendezvous points, where a client contacts a specific port to obtain a specific service. In order for a client to know the proper port number, an agreement has to be reached between the administrators of both systems on the assignment of these numbers. For services that are widely used, such as ssh, these numbers have to be administered centrally. This is done by the Internet Assigned Numbers Authority (IANA), whose assignments were regularly released in an RFC titled Assigned Numbers (RFC-1700). It describes, among other things, the port numbers assigned to well-known services. Linux uses a file called /etc/services that maps service names to numbers.
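The format of /etc/services is simple: a service name, a port/protocol pair, optional aliases, and comments introduced by #. A hedged sketch of a parser, run over a few sample lines (the sample data is invented for illustration, though the ssh entry matches the well-known assignment):

```python
def parse_services(text):
    """Parse /etc/services-style lines into a {(name, proto): port} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # strip comments and whitespace
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        table[(name, proto)] = int(port)
        for alias in fields[2:]:               # aliases share the same port
            table[(alias, proto)] = int(port)
    return table

sample = """
# service  port/protocol  aliases
ssh        22/tcp
domain     53/udp
www        80/tcp         http
"""
services = parse_services(sample)
print(services[("ssh", "tcp")])   # 22
print(services[("http", "tcp")])  # 80
```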

It is worth noting that, although both TCP and UDP connections rely on ports, these numbers do not conflict. This means that TCP port 22, for example, is different from UDP port 22.

The Socket Library

In Unix operating systems, the software performing all the tasks and protocols described above is usually part of the kernel, and so it is in Linux. The programming interface most common in the Unix world is the Berkeley Socket Library. Its name derives from a popular analogy that views ports as sockets and connecting to a port as plugging in. It provides the bind call to attach a socket to a local address, transport protocol, and service port, and the connect, listen, and accept calls with which a program establishes or awaits connections. The socket library is somewhat more general in that it provides not only a class of TCP/IP-based sockets (the AF_INET sockets), but also a class that handles connections local to the machine (the AF_UNIX class). Some implementations can also handle other classes, like the Xerox Networking System (XNS) protocol or X.25.

In Linux, the socket library is part of the standard libc C library. It supports the AF_INET and AF_INET6 sockets for TCP/IP and AF_UNIX for Unix domain sockets. It also supports AF_IPX for Novell’s network protocols, AF_X25 for the X.25 network protocol, AF_ATMPVC and AF_ATMSVC for the ATM network protocol, and AF_AX25, AF_NETROM, and AF_ROSE sockets for Amateur Radio protocol support. Other protocol families are being developed and will be added in time.
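The AF_UNIX class behaves much like a TCP connection that never leaves the machine. A sketch using socketpair, which hands back two already-connected Unix domain stream sockets:

```python
import socket

# A connected pair of AF_UNIX stream sockets: a local, in-machine "pipe"
# with the same two-way read/write semantics as a TCP connection.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

left.sendall(b"ping")
data = right.recv(1024)

right.sendall(b"pong")
answer = left.recv(1024)

left.close()
right.close()
print(data.decode(), answer.decode())  # ping pong
```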

Linux Networking

As it is the result of a concerted effort of programmers around the world, Linux wouldn’t have been possible without the global network. So it’s not surprising that in the early stages of development, several people started to work on providing it with network capabilities. A UUCP implementation was running on Linux almost from the very beginning, and work on TCP/IP-based networking started around autumn 1992, when Ross Biro and others created what has now become known as Net-1.

After Ross quit active development in May 1993, Fred van Kempen began to work on a new implementation, rewriting major parts of the code. This project was known as Net-2. The first public release, Net-2d, was made in the summer of 1993 (as part of the 0.99.10 kernel), and has since been maintained and expanded by several people, most notably Alan Cox. Alan’s original work was known as Net-2Debugged. After heavy debugging and numerous improvements to the code, he changed its name to Net-3 after Linux 1.0 was released. The Net-3 code was further developed for Linux 1.2 and Linux 2.0. The 2.2 and later kernels use the Net-4 version network support, which remains the standard official offering today.

The Net-4 Linux Network code offers a wide variety of device drivers and advanced features. Standard Net-4 protocols include SLIP and PPP (for sending network traffic over serial lines), PLIP (for parallel lines), IPX (for Novell compatible networks), Appletalk (for Apple networks) and AX.25, NetRom, and Rose (for amateur radio networks). Other standard Net-4 features include IP firewalling (discussed in Chapter 7), IP accounting (Chapter 8), and IP Masquerade (Chapter 9). IP tunneling in a couple of different flavors and advanced policy routing are supported. A very large variety of Ethernet devices are supported, in addition to support for some FDDI, Token Ring, Frame Relay, ISDN, and ATM cards.

Additionally, there are a number of other features that greatly enhance the flexibility of Linux. These features include interoperability with the Microsoft Windows network environment, in a project called Samba, discussed in Chapter 16, and an implementation of the Novell NCP (NetWare Core Protocol).[4]

Different Streaks of Development

There have been, at various times, varying network development efforts active for Linux.

Fred continued development after Net-2Debugged was made the official network implementation. This development led to Net-2e, which featured a much-revised design of the networking layer. Fred was working toward a standardized Device Driver Interface (DDI), but the Net-2e work has ended now.

Yet another implementation of TCP/IP networking came from Matthias Urlichs, who wrote an ISDN driver for Linux and FreeBSD. For this driver, he integrated some of the BSD networking code in the Linux kernel. That project, too, is no longer being worked on.

There has been a lot of rapid change in the Linux kernel networking implementation, and change is still the watchword as development continues. Sometimes this means that changes also have to occur in other software, such as the network configuration tools. While this is no longer as large a problem as it once was, you may still find that upgrading your kernel to a later version means that you must upgrade your network configuration tools, too. Fortunately, with the large number of Linux distributions available today, this is a quite simple task.

The Net-4 network implementation is now a standard and is in use at a very large number of sites around the world. Much work has been done on improving the performance of the Net-4 implementation, and it now competes with the best implementations available for the same hardware platforms. Linux is proliferating in the Internet Service Provider environment, and is often used to build cheap and reliable World Wide Web servers, mail servers, and news servers for these sorts of organizations. There is now sufficient development interest in Linux that it is managing to keep abreast of networking technology as it changes, and current releases of the Linux kernel offer the next generation of the IP protocol, IPv6, as a standard offering, which will be discussed in greater detail in Chapter 13.

Where to Get the Code

It seems odd now to remember that in the early days of the Linux network code development, the standard kernel required a huge patch kit to add the networking support to it. Today, network development occurs as part of the mainstream Linux kernel development process. The latest stable Linux kernels can be found on ftp://ftp.kernel.org in /pub/linux/kernel/v2.x/, where x is an even number. The latest experimental Linux kernels can be found on ftp://ftp.kernel.org in /pub/linux/kernel/v2.y/, where y is an odd number. The kernel.org distributions can also be accessed via HTTP at http://www.kernel.org. There are Linux kernel source mirrors all over the world.

Maintaining Your System

Throughout this book, we will mainly deal with installation and configuration issues. Administration is, however, much more than that—after setting up a service, you have to keep it running, too. For most services, only a little attendance will be necessary, while some, such as mail, require that you perform routine tasks to keep your system up to date. We will discuss these tasks in later chapters.

The absolute minimum in maintenance is to check system and per-application logfiles regularly for error conditions and unusual events. Often, you will want to do this by writing a couple of administrative shell scripts and periodically running them from cron. The source distributions of some major applications contain such scripts. You only have to tailor them to suit your needs and preferences.
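As a minimal sketch of such a script (the logfile path, search pattern, and script name are illustrative assumptions, not taken from any particular distribution), the following reports suspicious entries from the system log; when run from cron, any output it produces is mailed to the job's owner automatically:

```shell
#!/bin/sh
# check-logs: print log entries that look like error conditions.
# Illustrative only; tailor the logfile path and pattern to your system.
LOGFILE=/var/log/messages

# Case-insensitive match for common trouble indicators.
grep -iE 'error|fail(ed|ure)|refused|denied' "$LOGFILE"
```

A crontab entry such as `0 6 * * * /usr/local/sbin/check-logs` would then run the check every morning at six.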

The output from any of your cron jobs should be mailed to an administrative account. By default, many applications will send error reports, usage statistics, or logfile summaries to the root account. This makes sense only if you log in as root frequently; a much better idea is to forward root’s mail to your personal account by setting up a mail alias as described in Chapter 11 and Chapter 12.
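With sendmail-style aliasing, such a forward is a one-line entry in /etc/aliases (the username here is purely illustrative); remember to run newaliases afterward so the change takes effect:

```
# /etc/aliases -- forward root's mail to a personal account
root: janet
```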

However carefully you have configured your site, Murphy’s Law guarantees that some problem will surface eventually. Therefore, maintaining a system also means being available for complaints. Usually, people expect that the system administrator can at least be reached via email as root, but there are also other addresses that are commonly used to reach the person responsible for a specific aspect of maintenance. For instance, complaints about a malfunctioning mail configuration will usually be addressed to postmaster, and problems with the news system may be reported to newsmaster or usenet. Mail to hostmaster should be redirected to the person in charge of the host’s basic network services, and the DNS name service if you run a nameserver.

System Security

Another very important aspect of system administration in a network environment is protecting your system and users from intruders. Carelessly managed systems offer malicious people many targets. Attacks range from password guessing to Ethernet snooping, and the damage caused may range from faked mail messages to data loss or violation of your users’ privacy. We will mention some particular problems when discussing the context in which they may occur and some common defenses against them.

This section will discuss a few examples and basic techniques for dealing with system security. Of course, the topics covered cannot treat all security issues in detail; they merely serve to illustrate the problems that may arise. Therefore, reading a good book on security is an absolute must, especially in a networked system.

System security starts with good system administration. This includes checking the ownership and permissions of all vital files and directories and monitoring use of privileged accounts. The COPS program, for instance, will check your filesystem and common configuration files for unusual permissions or other anomalies. Another tool, Bastille Linux, developed by Jay Beale and found at http://www.bastille-linux.org, contains a number of scripts and programs that can be used to lock down a Linux system. It is also wise to use a password suite that enforces certain rules on the users’ passwords that make them hard to guess. The shadow password suite, now a default, requires a password to have at least five letters and to contain both upper- and lowercase letters, as well as nonalphabetic characters.

When making a service accessible to the network, make sure to give it “least privilege”; don’t permit it to do things that aren’t required for it to work as designed. For example, you should make programs setuid to root or some other privileged account only when necessary. Also, if you want to use a service for only a very limited application, don’t hesitate to configure it as restrictively as your special application allows. For instance, if you want to allow diskless hosts to boot from your machine, you must provide Trivial File Transfer Protocol (TFTP) so that they can download basic configuration files from the /boot directory. However, when used unrestrictively, TFTP allows users anywhere in the world to download any world-readable file from your system. If this is not what you want, restrict TFTP service to the /boot directory (we’ll come back to this in Chapter 10). You might also want to restrict certain services to users from certain hosts, say from your local network. In Chapter 10, we introduce tcpd, which does this for a variety of network applications. More sophisticated methods of restricting access to particular hosts or services will be explored in Chapter 7.
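On systems that start tftpd from inetd, the restriction can be expressed by passing the allowed directory to in.tftpd in /etc/inetd.conf; a typical entry (exact paths may differ on your distribution) looks like this:

```
# /etc/inetd.conf: run in.tftpd as an unprivileged user, wrapped by
# tcpd, and serving files from /boot only.
tftp  dgram  udp  wait  nobody  /usr/sbin/tcpd  in.tftpd /boot
```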

Another important point is to avoid “dangerous” software. Of course, any software you use can be dangerous because software may have bugs that clever people might exploit to gain access to your system. Things like this happen, and there’s no complete protection against it. This problem affects free software and commercial products alike.[5] However, programs that require special privilege are inherently more dangerous than others because any loophole can have drastic consequences.[6] If you install a setuid program for network purposes, be doubly careful to check the documentation so that you don’t create a security breach by accident.

Another source of concern should be programs that enable login or command execution with limited authentication. The rlogin, rsh, and rexec commands are all very useful, but offer very limited authentication of the calling party. Authentication is based on trust of the calling hostname obtained from a nameserver (we’ll talk about these later), which can be faked. Today it should be standard practice to disable the r commands completely and replace them with the ssh suite of tools. The ssh tools use a much more reliable authentication method and provide other services, such as encryption and compression, as well.

You can never rule out the possibility that your precautions might fail, regardless of how careful you have been. You should therefore make sure that you detect intruders early. Checking the system logfiles is a good starting point, but the intruder is probably clever enough to anticipate this action and will delete any obvious traces he or she left. However, there are tools like tripwire, written by Gene Kim and Gene Spafford, that allow you to check vital system files to see if their contents or permissions have been changed. tripwire computes various strong checksums over these files and stores them in a database. During subsequent runs, the checksums are recomputed and compared to the stored ones to detect any modifications.
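The core idea behind tripwire can be sketched with standard tools (the file list and database path here are illustrative assumptions; real tripwire uses its own database format and several checksum algorithms at once):

```shell
# Build a baseline database of checksums for vital system files.
sha256sum /etc/passwd /etc/fstab > /var/lib/integrity/checksums.db

# On subsequent runs, recompute and compare: sha256sum -c prints "OK"
# for each unchanged file and exits nonzero if any file was modified.
sha256sum -c /var/lib/integrity/checksums.db
```

For this to be trustworthy, the checksum database itself must be protected against tampering, for instance by keeping it on read-only media.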

Finally, it’s always important to be proactive about security. Monitoring the mailing lists for updates and fixes to the applications that you use is critical in keeping current with new releases. Failing to update something such as Apache or OpenSSL can lead directly to system compromise. One fairly recent example of this was found with the Linux Slapper worm, which propagated using an OpenSSL vulnerability. While keeping up to date can seem a daunting and time-consuming effort, administrators who were quick to react and upgrade their OpenSSL implementations ended up saving a great deal of time because they did not have to restore compromised systems!



[1] The original spirit of which (see above) still shows on some occasions in Europe.

[2] The shell is a command-line interface to the Unix operating system. It’s similar to the DOS prompt in a Microsoft Windows environment, albeit much more powerful.

[3] The Ethernet FAQ at http://www.faqs.org/faqs/LANs/ethernet-faq/talks about this issue, and a wealth of detailed historical and technical information is available at Charles Spurgeon’s Ethernet web site at http://www.ethermanage.com/ethernet/ethernet.htm/.

[4] NCP is the protocol on which Novell file and print services are based.

[5] There have been commercial Unix systems (that you have to pay lots of money for) that came with a setuid root shell script, which allowed users to gain root privilege using a simple standard trick.

[6] In 1988, the RTM worm brought much of the Internet to a grinding halt, partly by exploiting a gaping hole in some programs, including the sendmail program. This hole has long since been fixed.
