Exim: The Mail Transfer Agent

By Philip Hazel

Table of Contents

Chapter 1: Introduction
Exim is a mail transfer agent (MTA) that can be run as an alternative to Sendmail on Unix systems. Exim is open-source software that is distributed under the GNU General Public License (GPL), and it runs on all the most popular flavors of Unix and many more besides. A number of Unix distributions now include Exim as their default MTA.
I wrote Exim for use on medium-sized servers with permanent Internet connections in a university environment, but it is now used in a wide variety of different situations, from single-user machines on dial-up connections to clusters of servers supporting millions of customers at some large ISP sites. The code is small (between 500 KB and 1.2 MB on most hardware, depending on the compiler and which optional modules are included), and its performance scales well.
The job of a mail transfer agent is to receive messages from different sources and to deliver them to their destinations, potentially in a number of different ways. Exim can accept messages from remote hosts using SMTP over TCP/IP, as well as from local processes. It handles local deliveries to mailbox files or to pipes attached to commands, as well as remote SMTP deliveries to other hosts. Exim includes support for the new IPv6 protocol in its TCP/IP functions, as well as for the current IPv4 protocol. It does not directly support UUCP, though it can be interfaced to other software that does, provided that UUCP "bang path" addressing is not required, because Exim supports only Internet-style, domain-based addressing.
Exim's configuration is flexible and can be set up to deal with a wide variety of requirements, including virtual domains and the expansion of mailing lists. Once you have grasped the general principles of how Exim works, you will find that the runtime configuration is straightforward and simple to set up. The configuration consists of a single file that is divided into a number of sections, with the entries in each section taking the form of keyword/value pairs. Regular expressions, compatible with Perl 5, are available for use in a number of options.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: How Internet Mail Works
The programs that users use to send and receive mail (often just called "mailers") are formally called mail user agents (MUAs). They are concerned with providing a convenient mail interface for users. They display incoming mail that is in users' mailboxes, assist the user in constructing messages for sending, and provide facilities for managing folders of saved messages. They are the "front end" of the mail system. Many different user agents can be installed, and can be simultaneously operational on a single computer, thereby providing a choice of different user interfaces. However, when an MUA sends a message, it does not take on the work of actually delivering it to the recipients. Instead, it sends it to a mail transfer agent (MTA), which may be running on the same host or on some local server.
Mail transfer agents do the job of transferring messages from one host to another, and, after they reach their destination hosts, of delivering them into user mailboxes or to processes that are managing user mailboxes. This job is complicated, and it would not be sensible for every MUA to contain all the necessary apparatus. The flow of data from a message's sender to its recipient is as shown in Figure 2-1. However, when an application program or script needs to send a mail message as part of some automatic activity, it normally calls the MTA directly without involving an MUA.
Figure 2-1: Message data flow
Only one MTA can be fully operational on a host at once, because only one program can be designated to receive incoming messages from other hosts. It has to be a privileged program in order to listen for incoming TCP/IP connections on the SMTP port and to be able to write to users' mailboxes. The choice of which MTA to run is made by the system administrator, whereas the choice of which MUA to run is made by the end user.
An MTA must be capable of handling many messages simultaneously. If it cannot deliver a message, it must send an error report back to the sender. An MTA must be able to cope with messages that cannot be immediately delivered, storing such messages on its local disk, and retrying periodically until it succeeds in delivering them or some configurable timeout expires. The most common causes of such delays are network connectivity problems and hosts that are down.
Additional content appearing in this section has been removed.
Different Types of MTA
The framework for mail delivery described earlier in this chapter is very general, and in practice there are many different kinds of MTA configuration that operate within it. At the simplest level, there are single hosts running in small offices or homes, each handling a few mailboxes in one domain, receiving incoming external messages from one ISP's mail server only, and sending all outgoing messages to the ISP for onward delivery. Many such hosts are not permanently connected to the Internet, but instead dial up from time to time to exchange mail with the server. In such an environment, the MTA does not have to be capable of doing full mail routing or complicated queue management.
Hosts that are permanently connected need not send everything via the same server, but can make use of the DNS to route outgoing messages more directly toward their final destinations. A single outgoing message may have several recipients, thus requiring copies to be sent to more than one remote server. This means that the MTA has to cope with messages where some of the addresses cannot be immediately delivered, and it must implement suitable retrying mechanisms for use with multiple servers. For incoming mail, the domain can be configured so that mail comes direct from anywhere on the Internet, without having to pass through an intermediate server.
An organization may not want to have all its local mailboxes on the same host. Even a small organization with just one domain may have users running their own desktop systems who want their mail delivered to them. The host running the "corporate" MTA has now become a hub, receiving mail from the world, and distributing it by user within its local network. It is common in such configurations for all outgoing mail from the network to pass through the hub. For security reasons, it is also common to configure the network router so that direct SMTP connections between the world and the workstations are not permitted.
Single organizations may support more than one domain, but the MTAs that support very large numbers of domains are usually those run by ISPs, and there are two common ways in which these are handled:
Additional content appearing in this section has been removed.
Internet Message Standards
Electronic mail messages on the Internet are formatted according to RFC 822, which defines the format of a message as it is transferred between hosts, but not the protocol that is used for the exchange. The Simple Mail Transfer Protocol (SMTP) is used to transfer messages between hosts. This is defined in RFC 821, with additional material in RFC 1123 and several other RFCs that describe extensions. The SMTP address syntax is more restrictive than that of RFC 822, and requires that components of domain names consist only of letters, digits, and hyphens. Since any message may need to be transported using SMTP if its destination is not on the originating host, the format of all addresses is normally restricted to what RFC 821 permits.
All these RFCs are now very old, and revised versions are nearing completion at the time of writing (February, 2001). The revisions consolidate the material from the earlier RFCs, and incorporate current Internet practice.
Additional content appearing in this section has been removed.
RFC 822 Message Format
A message consists of lines of text, and when it is in transit between hosts, each line is terminated by the character carriage return (ASCII code 13) immediately followed by linefeed (ASCII code 10), a sequence that is commonly written as CRLF. Within a host, messages are normally stored for convenience in RFC 822 format. Many applications use the local operating system's convention for line termination when doing this, but some use CRLF. The normal Unix convention is to terminate lines with a single linefeed character, without a preceding carriage return.
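As a small illustration of the two line-ending conventions, the following Python sketch converts between the Unix form and the form used in transit. The function names to_wire and from_wire are invented for this example and have nothing to do with Exim's own code.

```python
def to_wire(text):
    """Convert locally stored text (LF or CRLF endings) to CRLF for transit."""
    # Normalize any existing CRLF first, so no line ends up with a doubled CR.
    normalized = text.replace("\r\n", "\n")
    return normalized.replace("\n", "\r\n").encode("ascii")


def from_wire(data):
    """Convert CRLF-terminated data to the Unix single-linefeed convention."""
    return data.decode("ascii").replace("\r\n", "\n")


wire = to_wire("Subject: test\n\nHello\n")
print(wire)
```

The sketch assumes pure ASCII content; anything else would need an encoding decision that is outside the scope of this section.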
A message consists of a header and a body. The header contains a number of lines that are structured in specific ways as defined by RFC 822. The following examples are the header lines that are commonly shown to someone who is composing a message, and will be familiar to any email user:
From: Philip Hazel <ph10@exim.example>
To: My Readers <all@exim.book.example>,
    My Loyal Fans <fans@exim.example>
Cc: My Personal Assistant <cwbaft@exim.example>
Subject: How electronic mail works
An individual header line can be continued over several actual lines by starting the continuations with whitespace. The entire header section is terminated by a blank line. The body of the message then follows. In its simplest form, the body is unstructured text, but later RFCs (MIME, RFC 1521) define additional header lines that allow the body to be split up into several different parts. Each part can be in a different encoding, and there are standard ways of translating binary data into printable characters so that it can be transmitted using SMTP. This is the mechanism that is used for message "attachments."
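The two rules just described (continuation lines start with whitespace, and a blank line ends the header section) are enough to write a very small parser. The following Python sketch is only an illustration; split_message is a hypothetical name, and it assumes the message is held with Unix line endings. In practice a program would more likely use a full parser such as Python's standard email package.

```python
def split_message(text):
    """Split a message into (headers, body).

    headers is a list of (name, value) pairs in which continuation
    lines have been joined onto the header line they belong to.
    """
    # The first blank line terminates the header section.
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # A line starting with whitespace continues the previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name, value.strip()))
    return headers, body


sample = (
    "From: Philip Hazel <ph10@exim.example>\n"
    "To: My Readers <all@exim.book.example>,\n"
    "    My Loyal Fans <fans@exim.example>\n"
    "Subject: How electronic mail works\n"
    "\n"
    "Hello,\n"
)
headers, body = split_message(sample)
for name, value in headers:
    print(name, "=>", value)
```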
RFC 822 permits many variations for addresses that appear in message header lines. For example:
To: caesar@rome.example.com
To: Julius Caesar <caesar@rome.example.com>
To: caesar@rome.example.com (Julius Caesar)
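All three forms carry the same mailbox address; only the placement of the human-readable name differs (in the third form it is held in a parenthesized comment). As a hedged illustration, Python's standard email.utils.parseaddr function can be used to separate the two parts:

```python
from email.utils import parseaddr

forms = [
    "caesar@rome.example.com",
    "Julius Caesar <caesar@rome.example.com>",
    "caesar@rome.example.com (Julius Caesar)",
]

# parseaddr returns a (display name, address) pair for each form.
parsed = [parseaddr(form) for form in forms]
for name, address in parsed:
    print(repr(name), address)
```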
Text in parentheses anywhere in the line is a comment. This applies to all header lines whose structure is constrained by the RFC, not just those header lines that contain addresses. For example, in the following:
Additional content appearing in this section has been removed.
The Message "On the Wire"
A message that is transmitted between MTAs has several things added to it over and above what the composing user sees. In addition to the header section and the body, another piece of data called the envelope is transmitted immediately before the RFC 822 data, using the SMTP commands MAIL and RCPT. The envelope contains the sender address and one or more recipient addresses. These addresses are of the form <user@domain> without the additional textual information, such as the user's full name, that may appear in message header lines.
The deliveries done by the receiving MTA (either to local mailboxes or by passing the message on to other hosts) are based on the recipients listed in the envelope, not on the To: or Cc: header lines in the message. If any delivery fails, it is to the envelope sender address that the failure report is sent, not the address in the From: or Reply-to: header line.
The need for a separate envelope becomes obvious when considering a message with multiple recipients, whose mailboxes may be on several different hosts. The RFC 822 header lines normally list all the recipients, but in order to be delivered, the message has to be cloned into separate copies, one for each receiving host, and in each copy the envelope contains just those recipients whose mailboxes are on that host.
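The cloning step can be sketched in Python as follows. This is a simplification made for illustration: it groups recipients by the domain of their address, whereas a real MTA groups them by the hosts that routing produces. The function name split_envelope and the dictionary layout are assumptions of this example.

```python
from collections import defaultdict


def split_envelope(sender, recipients):
    """Return one envelope per recipient domain.

    Each envelope keeps the same sender but lists only the
    recipients whose addresses are in that domain.
    """
    by_domain = defaultdict(list)
    for rcpt in recipients:
        domain = rcpt.rpartition("@")[2].lower()
        by_domain[domain].append(rcpt)
    return {
        domain: {"sender": sender, "recipients": rcpts}
        for domain, rcpts in by_domain.items()
    }


envelopes = split_envelope(
    "ph10@exim.example",
    ["all@exim.book.example", "fans@exim.example", "cwbaft@exim.example"],
)
for domain, envelope in envelopes.items():
    print(domain, envelope["recipients"])
```

The header lines of every copy are identical and still name all the recipients; only the envelopes differ.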
As well as an envelope, additional header lines are added by both the MUA and MTA before a message is transmitted to another host. Here is an example of a message "in transit," where the envelope lists only two of the three recipients. This example shows just the SMTP commands and data that the client sends, without the responses from the server:
MAIL FROM:<ph10@exim.example>
RCPT TO:<fans@exim.example>
RCPT TO:<cwbaft@exim.example>
DATA
Received: from ph10 by draco.exim.example with local (Exim 3.22 #1)
        id 14Tli0-000501-00;
        Fri, 16 Feb 2001 14:18:05 +0000
From: Philip Hazel <ph10@exim.example>
To: My Readers <all@exim.book.example>,
    My Loyal Fans <fans@exim.example>
Cc: My Personal Assistant <cwbaft@exim.example>
Subject: How electronic mail works
Date: Fri, 16 Feb 2001 14:18:05 +0000
Message-ID: <Pine.SOL.3.96.990117111343.19032A-100000@
  draco.exim.example>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Hello,
  If you want to know about Internet mail, look at chapter 2.
.
Additional content appearing in this section has been removed.
Summary of the SMTP Protocol
SMTP is a simple command-reply protocol. The client host sends a command to the server, and then waits for a reply before proceeding to the next command. Replies always start with a three-digit decimal number; for example:
250 Message accepted
The number encodes the type of response, whereas the text is usually information intended for human interpretation, though there are some exceptions. The first digit of the number is the most important, and is always one of those shown in Table 2-1.
Table 2-1: SMTP Response Codes
Code  Meaning
2xx   The command was successful
3xx   Additional data is required for the command
4xx   The command suffered a temporary error
5xx   The command suffered a permanent error
The second and third digits give additional information about the response, but an MTA need not pay any attention to them. Exim, for example, operates entirely on the first digit of SMTP response codes. Replies may consist of several lines of text. For all but the last of them, the code is followed by a hyphen; in the last line it is followed by whitespace. For example:
550-Host is not on relay list
550 Relaying prohibited by administrator
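A client that follows these rules might read a reply along the lines of the following Python sketch. It is illustrative only: parse_reply and classify are invented names, and the category strings simply paraphrase Table 2-1.

```python
def parse_reply(lines):
    """Parse one SMTP reply, which may span several lines.

    Returns (code, text_lines). Every line but the last has a hyphen
    after the three-digit code; the last line has whitespace instead.
    """
    code = None
    text = []
    for line in lines:
        this_code = int(line[:3])
        if code is None:
            code = this_code
        elif this_code != code:
            raise ValueError("reply code changed within one reply")
        text.append(line[4:])
        if line[3:4] != "-":
            break  # no hyphen after the code: this is the final line
    if code is None:
        raise ValueError("empty reply")
    return code, text


CLASSES = {
    2: "success",
    3: "additional data required",
    4: "temporary error",
    5: "permanent error",
}


def classify(code):
    """Classify a reply using only the first digit of its code."""
    return CLASSES.get(code // 100, "unknown")


code, text = parse_reply([
    "550-Host is not on relay list",
    "550 Relaying prohibited by administrator",
])
print(code, classify(code), text)
```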
When a client connects to a server's SMTP port (port 25), it must wait for an initial success response before proceeding. Some servers include the identity of the software they are running (and maybe other information) in the response, but none of this is actually required. Others send a minimal response such as:
220 ESMTP Ready
The client initializes the session by sending an EHLO (extended hello) command, which gives its own name. For example:
EHLO client.example.com
Unfortunately, there are many MTAs in use that are misconfigured, either accidentally or deliberately, such that they do not give their correct name in the
Additional content appearing in this section has been removed.
Forgery
It is trivial to forge unencrypted mail. In general, MTAs are "strangers" to each other, so there is no way a receiving MTA can authenticate the contents of the envelope or the message itself. All it can do is log the IP address of the sending host, and include it in the Received: line that it adds to the message.
Unsolicited junk mail (spam) usually contains some forged header lines. You need to be aware of this if you ever have to investigate the origin of such mail. If a message contains a header line such as:
Received: from foobar.com.example ([10.9.8.7])
        by podunk.edu.example (8.9.1/8.9.1) with SMTP id DAA00447;
        Tue, 6 Mar 2001 03:21:43 -0500 (EST)
it does not mean that the FooBar company or the University of Podunk are necessarily involved at all; the header may simply have been inserted by the spam perpetrator to mislead. The only Received: headers you can count on are those at the top of the message that were added by MTAs running on hosts whose administrators you trust. Once you pass these Received: headers, those below them, even if they appear to relate to a reputable organization such as an ISP, may be forged.
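The rule of thumb can be expressed as a short Python sketch: walk the Received: lines from the top and stop at the first one that was not written by a host you trust. The function name, the regular expression, and the host mail.home.example are all assumptions made for this illustration, and the sketch expects each header to have been unfolded onto a single line.

```python
import re

# The host that added a Received: line names itself after the word "by".
BY_HOST = re.compile(r"\bby\s+(\S+)")


def trusted_prefix(received_lines, trusted_hosts):
    """Return the leading Received: lines that were added by trusted hosts.

    received_lines must be in message order (topmost first).
    """
    trusted = []
    for line in received_lines:
        match = BY_HOST.search(line)
        if not match or match.group(1).lower() not in trusted_hosts:
            break  # everything from here down could have been forged
        trusted.append(line)
    return trusted


received = [
    "Received: from mx.isp.example ([192.168.1.1]) by mail.home.example "
    "with esmtp; Tue, 6 Mar 2001 08:21:50 +0000",
    "Received: from foobar.com.example ([10.9.8.7]) by podunk.edu.example "
    "(8.9.1/8.9.1) with SMTP id DAA00447; Tue, 6 Mar 2001 03:21:43 -0500 (EST)",
]
reliable = trusted_prefix(received, {"mail.home.example"})
print(len(reliable), "of", len(received), "Received: lines can be relied on")
```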
Additional content appearing in this section has been removed.
Authentication and Encryption
The original SMTP protocol had no facilities for authenticating clients, nor for encrypting messages as they were transmitted between hosts. As the Internet expanded, it became clear that these features were needed, and the protocol has been extended to allow for them. However, the vast majority of Internet mail is still transmitted between unauthenticated hosts, over unencrypted connections. For this reason, we won't go into any details in this introductory chapter, but there is some discussion in Chapter 15, regarding the way Exim handles these features.
Additional content appearing in this section has been removed.
Routing a Message
The most fundamental part of any MTA is the apparatus for deciding where to send a message. There may be many recipients, both local and remote. This means that a number of different copies may need to be made and sent to different destinations. Some domains may be known to the local host and processed specially; the remainder normally causes copies of the message to be sent to remote hosts, which may either be the final destinations or intermediate hosts.
There are two distinct types of address: those for which the local part is used when deciding how to deliver the message, and those for which only the domain is relevant. Typically, when a domain refers to a remote host, the local part of the address plays no part in the routing process, but if the domain is the name of the local host, the local part is all-important. The steps that an MTA has to perform in order to handle a message are as follows, though they are not necessarily done in this order:
  • First, it has to decide what deliveries to do for each recipient address. In order to do this, it must:
    • Process addresses that contain domains for which this host is the ultimate destination. These are often called "local addresses." Processing may involve expanding aliases into lists of replacement addresses, handling users' .forward files, dealing with mailing lists, and checking that the remaining local parts refer to existing local user mailboxes.
    • Process the nonlocal addresses for which there is local routing knowledge (for example, domains for which the host is a mail hub or firewall) to determine which of its clients' hosts these addresses should be sent to.
    • For the remaining addresses, those for which there is no local knowledge, look up destination hosts in the DNS. The details of how this is done are given in Section 2.11 later in this chapter. Successful routing produces a list of one or more remote hosts for each address.
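The three cases in the list above can be sketched as a small classifier in Python. The function name, the result labels, and the example domains are hypothetical; a real MTA drives these decisions from its configuration rather than from hardwired sets.

```python
def classify_address(address, local_domains, routed_domains):
    """Decide which kind of processing an address needs.

    local_domains:  domains for which this host is the final destination
    routed_domains: mapping of domain -> client host, for domains about
                    which there is local routing knowledge
    """
    local_part, _, domain = address.rpartition("@")
    domain = domain.lower()
    if domain in local_domains:
        # Only here does the local part matter (aliases, .forward, mailboxes).
        return ("local", local_part)
    if domain in routed_domains:
        return ("routed", routed_domains[domain])
    # No local knowledge: the destination hosts must come from the DNS.
    return ("dns", domain)


local = {"hub.example"}
routed = {"dept.hub.example": "client1.hub.example"}
for addr in ["alice@hub.example", "bob@dept.hub.example", "caesar@rome.example.com"]:
    print(addr, classify_address(addr, local, routed))
```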
Additional content appearing in this section has been removed.
Checking Incoming Mail
Some MTAs check the validity of local addresses during the SMTP transaction. If an incoming message has an incorrect local part, the RCPT command that transfers that part of the envelope is rejected by giving an error response. This means that the sending MTA retains control of the message for that recipient, and is the one that generates the bounce message that goes back to the sender. The benefit of doing this checking is that it stops such undeliverable messages from ever getting into the local host. However, receiving a bounce message from an MTA that is not at the site they were mailing to confuses some users, and makes them think that something is broken. "How can the local mailer daemon know that this is an invalid address at the remote site?" they ask.
The alternative approach that is adopted by some MTAs is to accept messages without checking the recipient addresses, and do the checking later. This has the benefit of minimizing the duration of the SMTP transaction, and for invalid addresses, the bounce messages are what the users intuitively expect, and they can be made to contain helpful information about finding correct mail addresses. The disadvantage is that undeliverable messages whose envelope senders are also invalid give rise to undeliverable bounce messages that have to be sorted out by the postmaster. Sadly, many spam messages are sent out with invalid envelope senders, leading to more and more administrators configuring their MTAs to implement the former behavior.
Exim can be configured to behave in either of these two ways, and the behavior can be made conditional on the domain of the sender address. For example, all addresses from within a local environment can be accepted, and unknown ones passed to a program that sends back a helpful message, while unknown addresses from the outside can be rejected in the SMTP protocol.
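The policy in this example can be sketched in Python as a decision made when each RCPT command arrives. This is not Exim configuration, just an illustration of the logic; the function name, the parameter names, and the reply texts are all invented for the purpose.

```python
def rcpt_response(sender, recipient, known_local_parts, lenient_sender_domains):
    """Choose the SMTP reply for one RCPT command.

    Unknown local parts are rejected during the SMTP transaction unless
    the envelope sender is in a domain that is treated leniently, in which
    case the message is accepted and the bad address is dealt with later.
    """
    local_part = recipient.rpartition("@")[0].lower()
    if local_part in known_local_parts:
        return "250 Accepted"
    sender_domain = sender.rpartition("@")[2].lower()
    if sender_domain in lenient_sender_domains:
        # Accept now; a helpful message is sent back to the sender afterwards.
        return "250 Accepted"
    return "550 Unknown user"


users = {"ph10", "cwbaft"}
inside = {"exim.example"}
print(rcpt_response("fans@exim.example", "nobody@exim.example", users, inside))
print(rcpt_response("caesar@rome.example.com", "nobody@exim.example", users, inside))
```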
Not all MTAs check the validity of envelope sender addresses. These can be invalid for a number of reasons, such as:
Additional content appearing in this section has been removed.
Overview of the DNS
The DNS is a worldwide, distributed database that holds various kinds of data indexed by keys that are called domain names. Here is a very brief summary of the facilities that are relevant to mail handling. The data is held in units called records, each containing a number of items, of which the following are relevant to applications that use the DNS:
<domain name>  <record type>  <type-specific data>
For example, for the record:
www.web.example.  A  10.8.6.4
the domain name is www.web.example, the record type is "A" (for "address"), and the data is 10.8.6.4. Address records like this are used for finding the IP addresses of hosts from their names, and are probably the most common type of DNS record.
In the world of the DNS, a complete, fully qualified domain name is always shown with a terminating dot, as in the previous example. Incomplete domain names, without the trailing dot, are relative to some superior domain. Unfortunately, there is confusion because some applications that interact with the DNS do not show or require the trailing dot. In particular, domains in email addresses must not include it, because that is contrary to RFC 821/822 syntax.
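To make the record layout and the trailing-dot convention concrete, here is a minimal Python sketch. It handles only the simplified three-item textual form used in this section, not real zone file syntax, and both function names are invented for the example.

```python
def parse_record(line):
    """Split a textual record into (domain name, record type, data)."""
    name, rtype, data = line.strip().split(None, 2)
    return name, rtype, data


def mail_domain(fqdn):
    """Drop the terminating dot, as required for domains in email addresses."""
    return fqdn[:-1] if fqdn.endswith(".") else fqdn


name, rtype, data = parse_record("www.web.example.  A  10.8.6.4")
print(name, rtype, data)
print(mail_domain(name))
```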
The present Internet addressing scheme, which uses 32-bit addresses and is known as IPv4, is going to be replaced by a new scheme called IPv6, which uses 128-bit addresses. Support for IPv6 is gradually beginning to appear in operating systems and application software. Two different DNS record types are currently used for recording IPv6 addresses, which are normally written in hexadecimal, using colon separators. The AAAA record, which is a direct analogue to the A record, was defined first. For example:
ipv6.example.  AAAA  5f03:1200:836f:0a00:000a:0800:200a:c031
However, it has been realized that a more flexible scheme, in which prefix portions of IPv6 addresses can be held separately, is preferable, because it makes aggregation and renumbering easier. For this reason, another record type, A6, has been defined and is expected in due course to supersede the AAAA type. The previous example could be converted into a single A6 record such as this:
Additional content appearing in this section has been removed.
DNS Records Used for Mail Routing
The domain in a mail address need not correspond to a hostname. For example, an organization might use the domain plc.example.com for all its email, but handle it with hosts called mail-1.plc.example.com and mail-2.plc.example.com. This kind of flexibility is obtained by making use of mail exchange (MX) records in the DNS. An MX record maps a mail domain to a host that is registered as handling mail for that domain, with a preference value. There may be any number of MX records for a domain, and when a name server is queried, it returns all of them. For example:
hermes.example.com.  MX  5  green.csi.example.com.
hermes.example.com.  MX  7  sw3.example.com.
hermes.example.com.  MX  7  sw4.example.com.
shows three hosts that handle mail for hermes.example.com. The preference values can be thought of as distances from the target; the smaller the value, the more preferable the corresponding host, so in this example, green.csi.example.com is the most preferred. An MTA that is delivering mail for hermes.example.com first tries to deliver to green.csi.example.com; if that fails, it tries the less preferred hosts in order of their preference values. It is only the numerical order of the preferences that is used; the absolute values do not matter. When there are MX records with identical preference values (as in the previous example), they are ordered randomly before they are used.
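One way to obtain this ordering is to shuffle the records and then sort them by preference; because the sort is stable, hosts with equal preference values keep the random order that the shuffle gave them. The Python sketch below illustrates the idea; order_mx is a hypothetical name, and this is not a description of how any particular MTA implements it.

```python
import random


def order_mx(records, rng=random):
    """Return hostnames in the order in which delivery should be tried.

    records is a list of (preference, hostname) pairs.
    """
    shuffled = list(records)
    rng.shuffle(shuffled)                    # random order among equals
    shuffled.sort(key=lambda rec: rec[0])    # stable sort by preference
    return [host for _, host in shuffled]


mx = [
    (5, "green.csi.example.com."),
    (7, "sw3.example.com."),
    (7, "sw4.example.com."),
]
hosts = order_mx(mx)
print(hosts)
```

The first host printed is always green.csi.example.com.; the other two may appear in either order from one run to the next.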
Before an MTA can make use of the list of hosts it has obtained from MX records, it first has to find the IP addresses for the hosts. It does this by looking up the corresponding address records (A records for IPv4, and AAAA or A6 records for IPv6). For the previous example, there might be the following address records:
green.csi.example.com.  A  192.168.8.57
sw3.example.com.        A  192.168.8.38
sw4.example.com.        A  192.168.8.44
In practice, if a name server already has an address record for any host in an MX list that it is returning, it sends the address record along with the MX records. In many cases, this saves an additional DNS query.
Additional content appearing in this section has been removed.
Related DNS Records
Two other kinds of DNS records are useful in connection with mail. PTR ("pointer") records map IP addresses to names via special zones called in-addr.arpa for IPv4 addresses, and ip6.int or ip6.arpa for IPv6 addresses. PTR records allow the reverse of a normal host lookup: given an IP address, PTR records allow you to find out the corresponding hostname. The name of a PTR record consists of the IP address followed by one of the special domains. However, for the in-addr.arpa and ip6.int domains, the components of the address are reversed to allow for DNS delegation of parts of an IP network. For the address 192.168.8.57, the PTR record would be as follows:
57.8.168.192.in-addr.arpa.  PTR  green.csi.example.com.
This registers that the name of the host that has the IP address 192.168.8.57 is green.csi.example.com. For IPv6 addresses in the ip6.int domain, the components that are reversed are the hexadecimal digits. For the address:
5f03:1200:836f:0a00:000a:0800:200a:c031
the name of the PTR record is:
1.3.0.c.a.0.0.2.0.0.8.0.a.0.0.0.0.0.a.0.f.6.3.8.0.0.2.1.3.0.f.5.ip6.int.
Not only is this rather clumsy to notate, it also has the disadvantage that DNS zone breaks are not possible at arbitrary points in the 128-bit address. For this reason, at the time A6 records were introduced for name-to-address lookups, an alternative format for IPv6 PTR records was defined for use with the domain ip6.arpa. In this form, part of the domain name is a binary number with an implied component break between each binary digit. For convenience, in textual versions of the record, the number is given in conventional notation without having to be reversed. In this new formulation, the name of the PTR record for the previous IPv6 address is:
\[x5f031200836f0a00000a0800200ac031].ip6.arpa.
where the backslash and brackets indicate an encoding of a binary value.
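The reversed forms of PTR record name shown earlier in this section (decimal components for in-addr.arpa, hexadecimal digits for ip6.int) are mechanical to construct. The following Python sketch does so using the standard ipaddress module to validate and expand the addresses; the two function names are invented for the example, and the binary-label form used with ip6.arpa is not covered.

```python
import ipaddress


def ptr_name_v4(address):
    """Build the in-addr.arpa name: the four components, reversed."""
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa."


def ptr_name_v6_nibbles(address, zone="ip6.int."):
    """Build the reversed hexadecimal-digit name for an IPv6 address."""
    # exploded gives all 32 hexadecimal digits, including leading zeros.
    digits = ipaddress.IPv6Address(address).exploded.replace(":", "")
    return ".".join(reversed(digits)) + "." + zone


print(ptr_name_v4("192.168.8.57"))
print(ptr_name_v6_nibbles("5f03:1200:836f:0a00:000a:0800:200a:c031"))
```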
PTR records do not have to match the corresponding address record. In the example in the previous section, the address record:
Additional content appearing in this section has been removed.
Common DNS Errors
There are a number of common mistakes made by DNS administrators (who are usually known as "hostmasters"), shown in the following list. All except the first prevent mail from being delivered:
  • MX records point to aliases instead of canonical names. That is, the domains on the righthand side of MX records are the names of CNAME records instead of A, A6, or AAAA records. This should not prevent mail from working, but it is inefficient, and not strictly correct.
  • MX records point to nonexistent hosts; that is, to names that have no corresponding A, A6, or AAAA record.
  • MX records contain IP addresses on the righthand side instead of hostnames. This error is unfortunately becoming more widespread, abetted by the fact that some MTAs, in violation of RFC 1034, support the usage. Exim does not do so by default, but does have an option to enable this unrecommended, nonstandard behavior.
  • MX records do not contain preference values.
Some broken name servers give a server error when asked for a nonexistent MX record. This prevents mail from being delivered because an MTA is permitted to search for an address record only if it is sure there are no MX records. In the case of a server error, the MTA does not know this. Similar server errors have been seen in cases where a preference value has been omitted from an MX record. More robust name servers check records when loading their zones, and generate an error if any contain bad data such as this.
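The reason a server error is so damaging becomes clearer when the decision is written out. In the Python sketch below, the three outcomes of an MX lookup lead to three different actions; the enumeration, the function name, and the action strings are assumptions of this illustration, including the choice to describe the third action as trying again later.

```python
from enum import Enum


class MxLookup(Enum):
    FOUND = "MX records returned"
    NO_RECORDS = "definitely no MX records"
    SERVER_ERROR = "name server reported an error"


def next_step(result):
    """Decide what to do after looking up MX records for a domain."""
    if result is MxLookup.FOUND:
        return "use the hosts from the MX records"
    if result is MxLookup.NO_RECORDS:
        # Only when it is certain that there are no MX records may
        # the MTA go on to search for an address record.
        return "look for an address record"
    # After a server error the MTA cannot be sure, so it must not fall back.
    return "do not deliver now; try again later"


for outcome in MxLookup:
    print(outcome.name, "->", next_step(outcome))
```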
Occasionally, the DNS appears to be giving different answers to identical queries. In the context of mail, this causes some messages to be rejected with "unknown domain" errors, whereas other messages to the same domain are delivered normally. The most common cause of this kind of behavior is that the name servers for the zone are out of step. If you suspect this, you can check by directing a DNS query to a specific name server. The first step is to find the relevant name servers by looking for the zone's NS records. To find the name servers for the zone
Additional content appearing in this section has been removed.
Role of the Postmaster
Postmaster is the name given to the person who is in charge of administering an MTA. He or she should be familiar with the software and its configuration, and should regularly monitor its behavior. If there are local users of the system, they should be able to contact the postmaster about any mail problems. If the MTA sends or receives mail to or from the Internet at large, people on other hosts must also be able to contact the postmaster.
The traditional way that this is done is by maintaining an alias address postmaster@your.domain, which redirects to the person who is currently performing the postmaster role. Indeed, the RFCs state that postmaster must always be supported as a case-insensitive local name.
Additional content appearing in this section has been removed.
Chapter 3: Exim Overview
In the previous chapter, the job of an MTA is described in general terms. In this chapter, we explain how Exim is organized to do this job, and the overall way in which it operates. Then in the next chapter, we cover the basics of Exim administration before launching into more details about the configuration.
All operations are performed by a single Exim binary, which operates in different ways, depending on the arguments with which it is called. Although receiving and delivering messages are treated as entirely separate operations, the code for determining how to deliver to a specific address is needed in both cases, because during message reception, addresses are verified by checking whether it would be possible to deliver to them. For example, Exim verifies a remote sender address by looking up the domain in the DNS in exactly the same way as when setting up a delivery to that address.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Exim Philosophy
Exim is designed for use on a network where most messages can be delivered at the first attempt. This is true for most of the time over a large part of the Internet. Measurements taken in the author's environment (a British university) indicate that well over 90 percent of messages are delivered almost immediately under normal conditions. This means that there is no need for an elaborate centralized queuing mechanism through which all messages pass. When a message arrives, an immediate delivery attempt is likely to be successful; only for a small number of messages is it necessary to implement a holding and retrying mechanism.
Therefore, although it is possible to configure Exim otherwise, the normal action is to try an immediate delivery as soon as a message has been received. In many cases this is successful, and nothing more is needed to process the message. Nevertheless, some precautions must be taken to avoid system overload in times of stress. For example, if the system load rises above some threshold, or if there are a large number of simultaneous incoming SMTP connections, immediate delivery may be temporarily disabled. In such cases, incoming messages wait on Exim's queue and are delivered later.
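The decision described above can be pictured with a small Python sketch. It is only an illustration of the idea, not Exim's code: the class, the function, and both threshold names are hypothetical, and the default values are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class LoadPolicy:
    # Hypothetical thresholds chosen for illustration; they are not Exim options.
    max_load_for_immediate_delivery: float = 8.0
    max_connections_for_immediate_delivery: int = 20


def deliver_immediately(policy, load_average, active_smtp_connections):
    """Return True if a delivery attempt should start right after reception."""
    if load_average > policy.max_load_for_immediate_delivery:
        return False  # system under stress: leave the message on the queue
    if active_smtp_connections > policy.max_connections_for_immediate_delivery:
        return False  # too many simultaneous incoming SMTP connections
    return True
```

A message for which this function returns False is not lost; in this model it simply waits on the queue until a later delivery attempt picks it up.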
All operations are performed by a single Exim binary, which operates in different ways, depending on the arguments with which it is called. Although receiving and delivering messages are treated as entirely separate operations, the code for determining how to deliver to a specific address is needed in both cases, because during message reception, addresses are verified by checking whether it would be possible to deliver to them. For example, Exim verifies a remote sender address by looking up the domain in the DNS in exactly the same way as when setting up a delivery to that address.
Exim's Queue
The word queue is used for the set of messages that Exim has under its control at any one time, because this word is common in the context of mail transfer. However, Exim's queue is normally treated as a collection of messages with no implied ordering, more like a "pool" than a "queue." Furthermore, Exim does not maintain separate queues for different domains or different remote hosts.
There is just a single collection of messages awaiting delivery, each of which may have several recipients. You can list the messages on the queue by running the command:
exim -bp
assuming that your path is set up to contain the directory where the Exim binary is located. Messages that are not delivered immediately on arrival are picked up later by queue runner processes that scan the entire queue and start a delivery process for each message in turn. A queue runner process waits for each delivery process to complete before starting the next one.
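As a rough model of a queue runner, the following Python sketch treats the queue as an unordered pool and handles one message at a time. The function name and the deliver callback are assumptions made for illustration; in Exim each delivery runs in its own process, which the sketch models with an ordinary function call that must return before the next message is considered.

```python
import random


def run_queue(message_ids, deliver, rng=None):
    """One queue-runner pass: visit every waiting message once, one at a time."""
    rng = rng or random.Random()
    pending = list(message_ids)
    rng.shuffle(pending)  # no implied ordering: the queue behaves like a pool
    still_waiting = []
    for message_id in pending:
        # deliver() stands in for a separate delivery process; calling it
        # synchronously models waiting for that process to complete.
        if not deliver(message_id):
            still_waiting.append(message_id)
    return still_waiting
```

Messages returned by run_queue remain on the queue for a later pass.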
Receiving and Delivering Messages
Message reception and message delivery are two entirely separate operations in Exim, and their only connection is that Exim normally tries to deliver a message as soon as it has received it. Receiving a message consists of writing it to local spool files ("putting it on the queue") and checking that the files have been successfully written before acknowledging reception to the sending host or local process. There is only one copy of each message, however many recipients it has, and the collection of spool files is the queue; there are no additional files or in-memory lists of messages.
A delivery operation gets all its data from the spool files. Each attempt at delivering a message processes every undelivered recipient address afresh. Exim does not normally retain previous alias, forwarding, or mailing list expansions from one delivery attempt to another.
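The reception rule (write the spool files, confirm they were written, and only then acknowledge) can be sketched in Python as follows. This is a simplified illustration: the function names and the .envelope and .body file suffixes are hypothetical and are not Exim's actual spool layout.

```python
import os
import tempfile


def receive_message(spool_dir, message_id, recipients, body):
    """Write a message to spool files; return True only if reception may be acknowledged."""
    contents = {
        message_id + ".envelope": "\n".join(recipients) + "\n",
        message_id + ".body": body,
    }
    try:
        for name, text in contents.items():
            with open(os.path.join(spool_dir, name), "w", encoding="utf-8") as handle:
                handle.write(text)
                handle.flush()
                os.fsync(handle.fileno())  # make sure the data reached the disk
    except OSError:
        return False  # do not acknowledge; the sender keeps responsibility
    return True


def messages_on_queue(spool_dir):
    """The spool files are the queue: there is no separate index to consult."""
    return sorted({name.rsplit(".", 1)[0] for name in os.listdir(spool_dir)})
```

Note that one set of files holds the message however many recipients it has, and that listing the queue is nothing more than looking at the spool directory.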
Exim Processes
Parallelism is obtained by the use of multiple processes, but one important aspect of Exim's design is that there is no central process that has overall responsibility for coordinating Exim's actions, and therefore there is no concept of starting or stopping Exim as a whole. Exim processes can be started at any time by other processes; for example, user agents are always able to start Exim processes in order to send messages. Such processes perform a single task and then exit. Most processes are therefore short-lived, but Exim does make use of long-running daemon processes for two purposes:
  1. To listen on the SMTP port for incoming TCP/IP connections. On receiving such a connection, the listener forks a new process to deal with it. An upper limit to the number of simultaneously active reception processes can be set. When the limit is reached, additional SMTP connections are refused.
  2. To start up queue runner processes at fixed intervals. These scan the pool of waiting messages (by default in an arbitrary order) and initiate fresh delivery attempts. A message may be on the queue because a previous delivery attempt failed, or because no delivery attempt was initiated when the message was received. Each delivery attempt processes a single message and runs in its own process, and the queue runner waits for it to complete before moving on to the next message. A limit may be set for the number of simultaneously active queue runner processes run by a daemon.
A single daemon process can be used to perform both these functions, and this is the most common configuration. However, it is possible to run Exim without using a daemon at all; inetd can be used to accept incoming SMTP calls and start up an Exim process for each one, and queue runner processes can be started by cron or some other means. However, in these cases Exim has no control over how many such processes are run, so if you are worried about system overload, you must control the number of processes yourself.
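The limit on simultaneous reception processes that a listening daemon can enforce amounts to a simple counter. The Python sketch below is a hypothetical model (the class and method names are invented for illustration) and does no real networking or forking.

```python
class Listener:
    """Models a listening daemon that caps the number of active receivers."""

    def __init__(self, max_receivers):
        self.max_receivers = max_receivers
        self.active = 0

    def on_connection(self):
        """Called for each incoming SMTP connection."""
        if self.active >= self.max_receivers:
            return "refused"  # limit reached: refuse the additional connection
        self.active += 1      # stands in for forking a reception process
        return "accepted"

    def on_receiver_exit(self):
        """Called when a reception process finishes."""
        self.active = max(0, self.active - 1)
```

When inetd or cron starts the processes instead, nothing plays the role of this counter, which is why the number of processes must then be controlled by other means.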
Coordination Between Processes
Processes for receiving and delivering messages are for the most part entirely independent. The small amount of coordination that is needed is achieved by sharing files. Minimizing synchronization and serialization requirements between processes helps Exim to scale well. Apart from the messages themselves, the shared data consists of a number of files containing "hints" about mail delivery. For example, if a remote host cannot be contacted, the time of the failure and the suggested next time to try that host are recorded. Any delivery process that has a message for that host will read the hint and refrain from trying the delivery if the retry time has not been reached. This does not affect delivery of the same message to other hosts when there is more than one recipient address.
Because the coordinating data is treated as a collection of hints, it is not a major disaster if any or all of it is lost; there may be a period of less optimal mail delivery, but that is all. Consequently, the code that maintains the hints can be quite simple because it does not have to be made robust against unusual circumstances.
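A minimal Python sketch of the retry-hint idea is shown here. The class and its methods are hypothetical, times are plain numbers of seconds, and the data is held in memory, whereas Exim keeps its hints in shared files.

```python
class RetryHints:
    """Per-host hints: when did a host fail, and when is it worth trying again?"""

    def __init__(self):
        self._next_try = {}

    def record_failure(self, host, failed_at, retry_after):
        """Remember the earliest time at which the host should be tried again."""
        self._next_try[host] = failed_at + retry_after

    def may_try(self, host, now):
        """A host with no hint, or whose retry time has passed, may be tried."""
        return now >= self._next_try.get(host, 0)

    def lose_all(self):
        """Simulate losing the hints data."""
        self._next_try.clear()
```

If the hints are lost, may_try simply answers True for every host. The worst outcome is a few wasted connection attempts, which is why the data can be treated as advisory.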
How Exim Is Configured
Configuration information, supplied by the administrator, is used at two different times: one configuration file is used when building the Exim binary, and another is read whenever the binary is run. Most options can be specified in only one of these files; that is, they either control how the binary is built, or they modify its behavior at runtime, but there are a few build-time options that set defaults for runtime behavior. The sources of Exim's configuration information are shown in Figure 3-1.
Figure 3-1: Exim configuration
The build-time options are of three kinds:
  • Those that specify the inclusion of optional code; for example, to support specific database lookups such as LDAP, or to support IPv6.
  • Those that specify fixed values that cannot be changed at runtime; for example, the mode of message files in Exim's spool directory.
  • Those that specify default values for certain runtime options; for example, the location of Exim's log files.
The process of building Exim from source is described in detail in Chapter 22. Here, we consider the runtime configuration. This is controlled by a single text file, often called something like /etc/exim.conf. You can find out the actual name by running the following command:
exim -bP configure_file
On a system where Exim is fully installed as a replacement for Sendmail, one or both of the paths /usr/lib/sendmail and /usr/sbin/sendmail is a symbolic link to the Exim binary. Therefore, any MUA, program, or script that attempts to send a message by calling Sendmail actually calls Exim.
Whenever Exim is executed, it starts by reading its runtime configuration file. A large number of settings can be present, but for any one installation only a few are normally used. The data from the file is held in main memory while an Exim process is running. For this reason, if you change the file, you have to tell the Exim daemon to reload it. This is done by sending the daemon a
How Exim Delivers Messages
Exim's configuration determines how it processes addresses; this processing involves finding information about the destinations of a message and how to transport it to those destinations. In this and the following sections, we discuss how the configuration that you set up controls what happens.
There are many different ways an address can be processed. For example, looking up a domain in the DNS involves a completely different way of processing from looking up a local part in an alias file, and delivering a message using SMTP over TCP/IP has very little in common with appending it to a mailbox file. There are separate blocks of code in Exim for doing the different kinds of processing, and each is separately and independently configurable. The word driver is used as the general term for one of these code blocks. In many cases, when you specify that a particular driver is to be used, you need only give one or two parameters for it. However, most drivers have a number of other options whose defaults can be changed to vary their behavior.
There are four different kinds of drivers. Three of them are concerned with handling addresses and delivering messages, and are called directors, routers, and transports. The fourth kind of driver handles SMTP authentication and is described in Chapter 15.
Transports are the components of Exim that actually deliver messages by writing them to files, or to pipes, or over SMTP connections. Directors and routers are very similar in that their job is to process addresses and decide what deliveries are to take place. The difference between them is in the kinds of address that they handle; directors handle local addresses and routers handle remote addresses. As Exim has evolved, the original differences in concept between directors and routers have diminished, and it may come about that they are merged in some future release. For the moment, however, a distinction remains.
Before going into more detail, we take a brief look at the way drivers are used as a message makes its way through the system. Exim has to decide whether each address is to be delivered on the local host or to a remote one, then it has to choose the right form of transport for each address (appending to a user's mailbox, for instance, or connecting to another host via SMTP), and finally it has to invoke those transports. For example, in a typical configuration, a message addressed to
Local and Remote Addresses
There are two distinct types of mail address: those for which the local part is used when deciding how to deliver the message, and those for which only the domain is relevant. Typically, when a domain refers to a remote host, the local part of the address plays no part in the routing process, but if the domain is the name of the local host, the local part is usually used in determining where to deliver the message. This is not a hard and fast rule (a small company might accept mail for any local part in a single mailbox), but it forms the basis of the distinction between directors and routers.
The first thing Exim does when processing an address is to determine whether it should be handled by the directors or by the routers. An Exim configuration normally contains definitions of a number of directors and at least one router, though there may be any number of either. If the domain is listed in the configuration as a local domain, the address is processed by the directors and is called a local address. Otherwise it is processed by the routers and is called a remote address.
Exim decides whether a domain is local by checking the local_domains option, which contains a colon-separated list of patterns. If it is not set, the name of the local host is used as the only local domain. Otherwise, it may contain various types of patterns, of which the most common are shown in this example:
local_domains = tiber.rivers.example:\
                *.cities.example:\
                dbm;/usr/exim/domains
The first item in the list is a single domain name, tiber.rivers.example, while the second is a simple pattern, matching all domains that end in .cities.example. The third item is a reference to an external file, /usr/exim/domains, which is a DBM-keyed file. This type of item is useful when a host is handling a very large number of local domains. We discuss DBM files and this kind of lookup item in more detail later.
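To make the three kinds of list item concrete, here is a Python sketch of how such a list might be tested against a domain. It is an approximation for illustration only: the function is hypothetical, the DBM file is modeled as an in-memory set supplied by the caller, and only the pattern types shown in the example are handled.

```python
def is_local_domain(domain, patterns, lookup_tables, local_host):
    """Decide whether a domain matches a local_domains-style list of patterns."""
    domain = domain.lower()
    if not patterns:
        # Option not set: the local host name is the only local domain.
        return domain == local_host.lower()
    for pattern in patterns:
        if ";" in pattern:
            # Lookup item such as "dbm;/usr/exim/domains"; the file is
            # modeled here as a set of domain names keyed by its path.
            _search_type, path = pattern.split(";", 1)
            if domain in lookup_tables.get(path, ()):
                return True
        elif pattern.startswith("*"):
            # Simple wildcard: compare the rest of the pattern with the
            # end of the domain.
            if domain.endswith(pattern[1:].lower()):
                return True
        elif domain == pattern.lower():
            return True
    return False
```

With the list from the example, rome.cities.example would match the second item, while a domain found only in the modeled DBM table would match the third.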
Notice the use of backslashes for continuing the option value over several lines. This is a general feature of Exim's configuration file; any line can be continued in this way. Whitespace at the start of continuation lines is ignored.
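The continuation rule can be illustrated with a short Python sketch that joins physical lines into logical ones. The function is hypothetical and deliberately simplified; for instance, it does not consider comments or whitespace that follows a trailing backslash.

```python
def join_continuations(lines):
    """Join backslash-continued lines, dropping leading whitespace on continuations."""
    logical = []
    current = ""
    continuing = False
    for raw in lines:
        line = raw.rstrip("\n")
        if continuing:
            line = line.lstrip()  # whitespace at the start of a continuation is ignored
        if line.endswith("\\"):
            current += line[:-1]  # drop the backslash and keep accumulating
            continuing = True
        else:
            logical.append(current + line)
            current = ""
            continuing = False
    if continuing:
        logical.append(current)  # the input ended in the middle of a continuation
    return logical
```

Applied to the three physical lines of the local_domains example, the sketch yields a single logical line whose value is the complete colon-separated list.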
Processing an Address
After it has decided whether an address is local or remote, Exim offers it to each configured director or router (as appropriate) in turn, in the order in which they are defined, until one of them is able to deal with it. The order in which directors and routers are defined in the configuration file is therefore important. The process of directing a local address is illustrated in Figure 3-2; a similar process happens using the routers for a remote address.
Figure 3-2: Directing a local address
A director that successfully handles an address may add that address to a queue for a particular transport. Alternatively, it may generate one or more "child" addresses that are added to the message's address list and processed in their own right, with the original address no longer playing any part. This is what happens when a local part matches an entry in an alias list, or when a user's .forward file is activated.
A successful router, on the other hand, can only add the address to a queue for a transport, or modify the domain and pass it on to the next router. It cannot generate "child" addresses. When a director or a router cannot handle an address, it is said to decline. If every director or router declines, the address cannot be handled at all, and delivery fails.
Figure 3-3: Routing and directing
The way addresses are handled by directors and routers is illustrated in Figure 3-3. (The line labeled "local after all" is a special case that is discussed in Section 3.11.4, later in this chapter.) All the addresses in a message, and any that are generated from them (for example, by aliasing), are processed by the directors and routers before any deliveries take place from the transport queues. Any router or director can queue an address for any transport; directors are not restricted to local transports, nor routers to remote ones.
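The offer-in-order scheme can be sketched in Python as shown below. Everything here is a hypothetical model rather than Exim's implementation: drivers are plain functions that return None to decline, the transport names are arbitrary labels, addresses are assumed to be fully qualified, and there is no protection against alias loops.

```python
def process_addresses(recipients, local_domains, directors, routers):
    """Offer each address to the drivers in order; return (queued, failed)."""
    pending = list(recipients)
    queued = {}  # transport name -> addresses queued for that transport
    failed = []
    while pending:
        address = pending.pop(0)
        domain = address.rsplit("@", 1)[1].lower()
        drivers = directors if domain in local_domains else routers
        for driver in drivers:
            result = driver(address)
            if result is None:  # this driver declined; try the next one
                continue
            kind, value = result
            if kind == "children":
                pending.extend(value)  # child addresses are processed in their own right
            else:
                queued.setdefault(value, []).append(address)
            break
        else:
            failed.append(address)  # every driver declined
    return queued, failed


def make_alias_director(aliases):
    """Director that replaces an aliased local part with its child addresses."""
    def director(address):
        return ("children", aliases[local]) if (local := address.rsplit("@", 1)[0]) in aliases else None
    return director


def make_user_director(users, transport):
    """Director that queues known local users for a local transport."""
    def director(address):
        return ("transport", transport) if address.rsplit("@", 1)[0] in users else None
    return director


def make_router(transport):
    """Router that queues every remote address for one transport."""
    return lambda address: ("transport", transport)
```

Because the directors are tried in the order given, placing the alias director first means that an alias takes precedence over a user of the same name, which is one reason the order of definition matters.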
A Simple Example
To help clarify the mechanisms described earlier, an example of a simple message delivery is presented here. The scenario is a host called simple.example, where the hostname is the only local mail domain. The host is using a simple Exim configuration file that supports aliases, user-forward files, delivery to local users' mailboxes, and remote SMTP delivery. The relevant portions of the configuration are quoted here. Suppose a user of this host has sent a message addressed to one local and one remote recipient:
postmaster@simple.example
friend@another.example
At the start of delivery, Exim's list of addresses to process is initialized with the two original recipients, and its first job is to work through this list, deciding what to do for each address. For postmaster@simple.example, the domain is local, so it is passed to the first defined director, whose configuration is as follows:
system_aliases:
  driver = aliasfile
  file = /etc/aliases
  search_type = lsearch
The first line, terminated by a colon, is the name for this particular director instance, chosen by the system administrator. Each driver of a particular type (director, router, or transport) must have a distinct name. However, names of driver instances can be the same as the names of the drivers themselves; you can have the following:
aliasfile:
  driver = aliasfile
  file = /etc/aliases
  search_type = lsearch
if you want to, but some people find this usage confusing. The second configuration line specifies which kind of director this is (or, to put it another way, it chooses which block of director code to run), and the remaining two lines are options for the director.
The aliasfile director handles an address by looking up the local part in an alias list, and the options control how the lookup is done. In this case, the list is in the file /etc/aliases, and a linear search ("lsearch") is required. This expects each line of the file to contain an alias name, optionally terminated by a colon, followed by the list of replacement addresses for the alias, which may be continued onto subsequent lines by starting them with whitespace. A comma is used to separate addresses in the list. For example:
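As an illustration of the file format just described, the following Python sketch parses alias data into a dictionary. It is not Exim's lookup code: the function is hypothetical, blank lines are skipped as an assumption, and the addresses used with it below are invented.

```python
import re

# An alias name, an optional colon, then the start of the address list.
_ENTRY = re.compile(r"^([^\s:]+):?\s*(.*)$")


def parse_alias_file(text):
    """Return a dict mapping each alias name to its list of replacement addresses."""
    aliases = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines carry no data
        if line[0].isspace():
            if current is None:
                continue  # continuation with nothing to continue
            data = line  # continuation line: more addresses for the current alias
        else:
            match = _ENTRY.match(line)
            if match is None:
                current = None
                continue
            current, data = match.group(1), match.group(2)
            aliases.setdefault(current, [])
        # Addresses in the list are separated by commas.
        aliases[current].extend(a.strip() for a in data.split(",") if a.strip())
    return aliases
```

For instance, an entry for postmaster that lists one address and continues on an indented second line with another would produce a single key with two addresses.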
Complications While Directing and Routing