BUY THIS BOOK
Add to Cart

Print Book $74.99


Add to Cart

Print+PDF $97.49

Add to Cart

PDF $55.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £53.50

What is this?

Looking to Reprint or License this content?


Secure Programming Cookbook for C and C++
Secure Programming Cookbook for C and C++ Recipes for Cryptography, Authentication, Input Validation & More By John Viega, Matt Messier
July 2003
Pages: 790

Cover | Table of Contents | Colophon


Table of Contents

Chapter 1: Safe Initialization
Robust initialization of a program is important from a security standpoint, because the more parts of the program's environment that can be validated (e.g., input, privileges, system parameters) before any critical code runs, the better you can minimize the risks of many types of exploits. In addition, setting a variety of operating parameters to a known state will help thwart attackers who run a program in a hostile environment, hoping to exploit some assumption in the program regarding an external resource that the program accesses (either directly or indirectly). This chapter outlines some of these potential problems, and suggests solutions that work towards reducing the associated risks.
Attackers can often control the value of important environment variables, sometimes even remotely—for example, in CGI scripts, where invocation data is passed through environment variables.
You need to make sure that an attacker does not set environment variables to malicious values.
Many programs and libraries, including the shared library loader on both Unix and Windows systems, depend on environment variable settings. Because environment variables are inherited from the parent process when a program is executed, an attacker can easily sabotage variables, causing your program to behave in an unexpected and even insecure manner.
Typically, Unix systems are considerably more dependent on environment variables than are Windows systems. In fact, the only scenario common to both Unix and Windows is that there is an environment variable defining the path that the system should search to find an executable or shared library (although differently named variables are used on each platform). On Windows, one environment variable controls the search path for finding both executables and shared libraries. On Unix, these are controlled by separate environment variables. Generally, you should not specify a filename and then rely on these variables for determining the full path. Instead, you should always use absolute paths to known locations.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Sanitizing the Environment
Attackers can often control the value of important environment variables, sometimes even remotely—for example, in CGI scripts, where invocation data is passed through environment variables.
You need to make sure that an attacker does not set environment variables to malicious values.
Many programs and libraries, including the shared library loader on both Unix and Windows systems, depend on environment variable settings. Because environment variables are inherited from the parent process when a program is executed, an attacker can easily sabotage variables, causing your program to behave in an unexpected and even insecure manner.
Typically, Unix systems are considerably more dependent on environment variables than are Windows systems. In fact, the only scenario common to both Unix and Windows is that there is an environment variable defining the path that the system should search to find an executable or shared library (although differently named variables are used on each platform). On Windows, one environment variable controls the search path for finding both executables and shared libraries. On Unix, these are controlled by separate environment variables. Generally, you should not specify a filename and then rely on these variables for determining the full path. Instead, you should always use absolute paths to known locations.
Certain variables expected to be present in the environment can cause insecure program behavior if they are missing or improperly set. Make sure, therefore, that you never fully purge the environment and leave it empty. Instead, variables that should exist should be forced to sane values or, at the very least, treated as highly suspect and examined closely before they're used. Remove any unknown variables from the environment altogether.
The standard C runtime library defines a global variable,
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Restricting Privileges on Windows
Your Windows program runs with elevated privileges, such as Administrator or Local System, but it does not require all the privileges granted to the user account under which it's running. Your program never needs to perform certain actions that may be dangerous if users with elevated privileges run it and an attacker manages to compromise the program.
When a user logs into the system or the service control manager starts a service, a token is created that contains information about the user logging in or the user under which the service is running. The token contains a list of all of the groups to which the user belongs (the user and each group in the list is represented by a Security ID or SID), as well as a set of privileges that any thread running with the token has. The set of privileges is initialized from the privileges assigned by the system administrator to the user and the groups to which the user belongs.
Beginning with Windows 2000, it is possible to create a restricted token and force threads to run using that token. Once a restricted token has been applied to a running thread, any restrictions imposed by the restricted token cannot be lifted; however, it is possible to revert the thread back to its original unrestricted token. With restricted tokens, it's possible to remove privileges, restrict the SIDs that are used in access checking, and deny SIDs access. The use of restricted tokens is more useful when combined with the CreateProcessAsUser( ) API to create a new process with a restricted token that cannot be reverted to a more permissive token.
Beginning with Windows .NET Server 2003, it is possible to permanently remove privileges from a process's token. Once the privileges have been removed, they cannot be added back. Any new processes created by a process running with a modified token will inherit the modified token; therefore, the same restrictions imposed upon the parent process are also imposed upon the child process. Note that modifying a token is quite different from creating a restricted token. In particular, only privileges can be removed; SIDs can be neither restricted nor denied.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Dropping Privileges in setuid Programs
Your program runs setuid or setgid (see Section 1.3.3 for definitions), thus providing your program with extra privileges when it is executed. After the work requiring the extra privileges is done, those privileges need to be dropped so that an attacker cannot leverage your program during an attack that results in privilege elevation.
If your program must run setuid or setgid, make sure to use the privileges properly so that an attacker cannot exploit other possible vulnerabilities in your program and gain these additional privileges. You should perform whatever work requires the additional privileges as early in the program as possible, and you should drop the extra privileges immediately after that work is done.
While many programmers may be aware of the need to drop privileges, many more are not. Worse, those who do know to drop privileges rarely know how to do so properly and securely. Dropping privileges is tricky business because the semantics of the system calls to manipulate IDs for setuid/setgid vary from one Unix variant to another—sometimes only slightly, but often just enough to make the code that works on one system fail on another.
On modern Unix systems, the extra privileges resulting from using the setuid or setgid bits on an executable can be dropped either temporarily or permanently. It is best if your program can do what it needs to with elevated privileges, then drop those privileges permanently, but that's not always possible. If you must be able to restore the extra privileges, you will need to be especially careful in your program to do everything possible to prevent an attacker from being able to take control of those privileges. We strongly advise against dropping privileges only temporarily. You should do everything possible to design your program such that it can drop privileges permanently as quickly as possible. We do recognize that it's not always possible to do—the Unix passwd command is a perfect example: the last thing it does is use its extra privileges to write the new password to the password file, and it cannot do it any sooner.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Limiting Risk with Privilege Separation
Your process runs with extra privileges granted by the setuid or setgid bits on the executable. Because it requires those privileges at various times throughout its lifetime, it can't permanently drop the extra privileges. You would like to limit the risk of those extra privileges being compromised in the event of an attack.
When your program first initializes, create a Unix domain socket pair using socketpair( ), which will create two endpoints of a connected unnamed socket. Fork the process using fork( ) , drop the extra privileges in the child process, and keep them in the parent process. Establish communication between the parent and child processes. Whenever the child process needs to perform an operation that requires the extra privileges held by the parent process, defer the operation to the parent.
The result is that the child performs the bulk of the program's work. The parent retains the extra privileges and does nothing except communicate with the child and perform privileged operations on its behalf.
If the privileged process opens files on behalf of the unprivileged process, you will need to use a Unix domain socket, as opposed to an anonymous pipe or some other other interprocess communication mechanism. The reason is that only Unix domain sockets provide a means by which file descriptors can be exchanged between the processes after the initial fork( ).
In Recipe 1.3, we discussed setuid, setgid, and the importance of permanently dropping the extra privileges resulting from their use as quickly as possible to minimize the window of vulnerability to a privilege escalation attack. In many cases, the extra privileges are necessary for performing some initialization or other small amount of work, such as binding a socket to a privileged port. In other cases, however, the work requiring extra privileges cannot always be restricted to the beginning of the program, thus requiring that the extra privileges be dropped only temporarily so that they can later be restored when they're needed. Unfortunately, this means that an attacker who compromises the program can also restore those privileges.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Managing File Descriptors Safely
When your program starts up, you want to make sure that only the standard stdin , stdout, and stderr file descriptors are open, thus avoiding denial of service attacks and avoiding having an attacker place untrusted files on special hardcoded file descriptors.
On Unix, use the function getdtablesize( ) to obtain the size of the process's file descriptor table. For each file descriptor in the process's table, close the descriptors that are not stdin, stdout, or stderr, which are always 0, 1, and 2, respectively. Test stdin, stdout, and stderr to ensure that they're open using fstat( ) for each descriptor. If any one is not open, open /dev/null and associate with the descriptor. If the program is running setuid, stdin, stdout, and stderr should also be closed if they're not associated with a tty, and reopened using /dev/null.
On Windows, there is no way to determine what file handles are open, but the same issue with open descriptors does not exist on Windows as it does on Unix.
Normally, when a process is started, it inherits all open file descriptors from its parent. This can be a problem because the size of the file descriptor table on Unix is typically a fixed size. The parent process could therefore fill the file descriptor table with bogus files to deny your program any file handles for opening its own files. The result is essentially a denial of service for your program.
When a new file is opened, a descriptor is assigned using the first available entry in the process's file descriptor table. If stdin is not open, for example, the first file opened is assigned a file descriptor of 0, which is normally reserved for stdin. Similarly, if stdout is not open, file descriptor 1 is assigned next, followed by stderr's file descriptor of 2 if it is not open.
The only file descriptors that should remain open when your program starts are the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Creating a Child Process Securely
Your program needs to create a child process either to perform work within the same program or, more frequently, to execute another program.
On Unix, creating a child process is done by calling fork( ) . When fork( ) completes successfully, a nearly identical copy of the calling process is created as a new process. Most frequently, a new program is immediately executed using one of the exec*( ) family of functions (see Recipe 1.7). However, especially in the days before threading, it was common to use fork( ) to create separate "threads" of execution within a program.
If the newly created process is going to continue running the same program, any pseudo-random number generators (PRNGs) must be reseeded so that the two processes will each yield different random data as they continue to execute. In addition, any inherited file descriptors that are not needed should be closed; they remain open in the other process because the new process only has a copy of them.
Finally, if the original process had extra privileges from being executed as setuid or setgid, those privileges will be inherited by the new process, and they should be dropped immediately if they are not needed. In particular, if the new process is going to be used to execute a new program, privileges should always be dropped so that the new program does not inherit privileges that it should not have.
When fork( ) is used to create a new process, the new process is a nearly identical copy of the original process. The only differences in the processes are the process ID, the parent process ID, and the resource utilization counters, which are reset to zero in the new process. Execution in both processes continues immediately after the return from fork( ). Each process can determine whether it is the parent or the child by checking the return value from
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Executing External Programs Securely
Your Unix program needs to execute another program.
On Unix, one of the exec*( ) family of functions is used to replace the current program within a process with another program. Typically, when you're executing another program, the original program continues to run while the new program is executed, thus requiring two processes to achieve the desired effect. The exec*( ) functions do not create a new process. Instead, you must first use fork( ) to create a new process, and then use one of the exec*( ) functions in the new process to run the new program. See Recipe 1.6 for a discussion of using fork( ) securely.
execve( ) is the system call used to load and begin execution of a new program. The other functions in the exec*( ) family are wrappers around the execve( ) system call, and they are implemented in user space in the standard C runtime library. When a new program is loaded and executed with execve( ), the new program replaces the old program within the same process. As part of the process of loading the new program, the old program's address space is replaced with a new address space. File descriptors that are marked to close on execute are closed; the new program inherits all others. All other system-level properties are tied to the process, so the new program inherits them from the old program. Such properties include the process ID, user IDs, group IDs, working and root directories, and signal mask.
Table 1-2 lists the various exec*( ) wrappers around the execve( ) system call. Note that many of these wrappers should not be used in secure code. In particular, never use the wrappers that are named with a "p" suffix because they will search the environment to locate the file to be executed. When executing external programs, you should always specify the full path to the file that you want to execute. If the PATH environment variable is used to locate the file, the file that is found to execute may not be the expected one.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Executing External Programs Securely
Your Windows program needs to execute another program.
On Windows, use the CreateProcess( ) API function to load and execute a new program. Alternatively, use the CreateProcessAsUser( ) API function to load and execute a new program with a primary access token other than the one in use by the current program.
The Win32 API provides several functions for executing new programs. In the days of the Win16 API, the proper way to execute a new program was to call WinExec( ) . While this function still exists in the Win32 API as a wrapper around CreateProcess( ) for compatibility reasons, its use is deprecated, and new programs should call CreateProcess( ) directly instead.
A powerful but extremely dangerous API function that is popular among developers is ShellExecute( ) . This function is implemented as a wrapper around CreateProcess( ), and it does exactly what we're about to advise against doing with CreateProcess( )—but we're getting a bit ahead of ourselves.
One of the reasons ShellExecute( ) is so popular is that virtually anything can be executed with the API. If the file to execute as passed to ShellExecute( ) is not actually executable, the API will search the registry looking for the right application to launch the file. For example, if you pass it a filename with a .TXT extension, the filename will probably start Notepad with the specified file loaded. While this can be an incredibly handy feature, it's also a disaster waiting to happen. Users can configure their own file associations, and there is no guarantee that you'll get the expected behavior when you execute a program this way. Another problem is that because users can configure their own file associations, an attacker can do so as well, causing your program to end up doing something completely unexpected and potentially disastrous.
The safest way to execute a new program is to use either
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Disabling Memory Dumps in the Event of a Crash
Your application stores potentially sensitive data in memory, and you want to prevent this data from being written to disk if the program crashes, because local attackers might be able to examine a core dump and use that information nefariously.
On Unix systems, use setrlimit( ) to set the RLIMIT_CORE resource to zero, which will prevent the operating system from leaving behind a core file. On Windows, it is not possible to disable such behavior, but there is equally no guarantee that a memory dump will be performed. A system-wide setting that cannot be altered on a per-application basis controls what action Windows takes when an application crashes.
A Windows feature called Dr. Watson, which is enabled by default, may cause the contents of a process's address space to be written to disk in the event of a crash. If Microsoft Visual Studio is installed, the settings that normally cause Dr. Watson to run are changed to run the Microsoft Visual Studio debugger instead, and no dump will be generated. Other programs do similar things, so from system to system, there's no telling what might happen if an application crashes.
Unfortunately, there is no way to prevent memory dumps on a per-application basis on Windows. The settings for how to handle an application crash are system-wide, stored in the registry under HKEY_LOCAL_MACHINE, and they require Administrator access to change them. Even if you're reasonably certain Dr. Watson will be the handler on systems on which your program will be running, there is no way you can disable its functionality on a per-application basis. On the other hand, any dump that may be created by Dr. Watson is properly protected by ACLs that prevent any other user from accessing them.
On most Unix systems, a program that crashes will " dump core." The action of dumping core causes an image of the program's committed memory at the time of the crash to be written out to a file on disk, which can later be used for post-mortem debugging.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: Access Control
Access control is a major issue for application developers. An application must always be sure to protect its resources from unauthorized access. This requires properly setting permissions on created files, allowing only authorized hosts to connect to any network ports, and properly handling privilege elevation and surrendering. Applications must also defend against race conditions that may occur when opening files—for example, the Time of Check, Time of Use (TOCTOU) condition. The proper approach to access control is a consistent, careful use of all APIs that access external resources. You must minimize the time a program runs with privileges and perform only the bare minimum of operations at a privileged level. When sensitive data is involved, it is your application's duty to protect the user's data from unauthorized access; keep this in mind during all stages of development.
You want to understand how access control works on Unix systems.
Unix traditionally uses a user ID-based access control system. Some newer variants implement additional access control mechanisms, such as Linux's implementation of POSIX capabilities. Because additional access control mechanisms vary greatly from system to system, we will discuss only the basic user ID system in this recipe.
Every process running on a Unix system has a user ID assigned to it. In reality, every process actually has three user IDs assigned to it: an effective user ID, a real user ID, and a saved user ID. The effective user ID is the user ID used for most permission checks. The real user and saved user IDs are used primarily for determining whether a process can legally change its effective user ID (see Recipe 1.3).
In addition to user IDs, each process also has a group ID. As with user IDs, there are actually three group IDs: an effective group ID, a real group ID, and a saved group ID. Processes may belong to more than a single group. The operating system maintains a list of groups to which a process belongs for each process. Group-based permission checks check the effective group ID as well as the process's group list.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Understanding the Unix Access Control Model
You want to understand how access control works on Unix systems.
Unix traditionally uses a user ID-based access control system. Some newer variants implement additional access control mechanisms, such as Linux's implementation of POSIX capabilities. Because additional access control mechanisms vary greatly from system to system, we will discuss only the basic user ID system in this recipe.
Every process running on a Unix system has a user ID assigned to it. In reality, every process actually has three user IDs assigned to it: an effective user ID, a real user ID, and a saved user ID. The effective user ID is the user ID used for most permission checks. The real user and saved user IDs are used primarily for determining whether a process can legally change its effective user ID (see Recipe 1.3).
In addition to user IDs, each process also has a group ID. As with user IDs, there are actually three group IDs: an effective group ID, a real group ID, and a saved group ID. Processes may belong to more than a single group. The operating system maintains a list of groups to which a process belongs for each process. Group-based permission checks check the effective group ID as well as the process's group list.
The operating system performs a series of tests to determine whether a process has permission to access a particular file on the filesystem or some other resource (such as a semaphore or shared memory segment). By far, the most common permission check performed is for file access.
When a process creates a file or some other resource, the operating system assigns a user ID and a group ID as the owner of the file or resource. The user ID is assigned the process's effective user ID, and the group ID is assigned the process's effective group ID.
To define the accessibility of a file or resource, each file or resource has three sets of three permission bits assigned to it. For the owning user, the owning group, and everyone else (often referred to as "world" or "other"), read, write, and execute permissions are stored.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Understanding the Windows Access Control Model
You want to understand how access control works on Windows systems.
Versions of Windows before Windows NT have no access control whatsoever. Windows 95, Windows 98, and Windows ME are all intended to be single-user desktop operating systems and thus have no need for access control. Windows NT, Windows 2000, Windows XP, and Windows Server 2003 all use a system of access control lists (ACLs).
Most users do not understand the Windows access control model and generally regard it as being overly complex. However, it is actually rather straightforward and easy to understand. Unfortunately, from a programmer's perspective, the API for dealing with ACLs is not so easy to deal with.
In Section 2.2.3, we describe the Windows access control model from a high level. We do not provide examples of using the API here, but other recipes throughout the book do provide such examples.
All Windows resources, including files, the registry, synchronization primitives (e.g., mutexes and events), and IPC mechanisms (e.g., pipes and mailslots), are accessed through objects, which may be secured using ACLs. Every ACL contains a discretionary access control list (DACL) and a system access control list (SACL). DACLs determine access rights to an object, and SACLs determine auditing (e.g., logging) policy. In this recipe, we are concerned only with access rights, so we will discuss only DACLs.
A DACL contains zero or more access control entries (ACEs). A DACL with no ACEs, said to be a NULL DACL , is essentially the equivalent of granting full access to everyone, which is never a good idea. A NULL DACL means anyone can do anything to the object. Not only does full access imply the ability to read from or write to the object, it also implies the ability to take ownership of the object or modify its DACL. In the hands of an attacker, the ability to take ownership of the object and modify its DACL can result in denial of service attacks because the object should be accessible but no longer is.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Determining Whether a User Has Access to a File on Unix
Your program is running with extra permissions because its executable has the setuid or setgid bit set. You need to determine whether the user running the program will be able to access a file without the extra privileges granted by setuid or setgid.
Temporarily drop privileges to the user and group for which access is to be checked. With the process's privileges lowered, perform the access check, then restore privileges to what they were before the check. See Recipe 1.3 for additional discussion of elevated privileges and how to drop and restore them.
It is always best to allow the operating system to do the bulk of the work of performing access checks. The only way to do so is to manipulate the privileges under which the process is running. Recipe 1.3 provides implementations for functions that temporarily drop privileges and then restore them again.
When performing access checks on files, you need to be careful to avoid the types of race conditions known as Time of Check, Time of Use (TOCTOU), which are illustrated in Figure 2-1 and Figure 2-2. These race conditions occur when access is checked before opening a file. The most common way for this to occur is to use the access( ) system call to verify access to a file, and then to use open( ) or fopen( ) to open the file if the return from access( ) indicates that access will be granted.
The problem is that between the time the access check via access( ) completes and the time open( ) begins (both system calls are atomic within the operating system kernel), there is a window of vulnerability where an attacker can replace the file that is being operated upon. Let's say that a program uses access( ) to check to see whether an attacker has write permissions to a particular file, as shown in Figure 2-1. If that file is a symbolic link, access( ) will follow it, and report that the attacker does indeed have write permissions for the underlying file. If the attacker can change the symbolic link after the check occurs, but before the program starts using the file, pointing it to a file he couldn't otherwise access, the privileged program will end up opening a file that it shouldn't, as shown in Figure 2-2. The problem is that the program can manipulate either file, and it gets tricked into opening one on behalf of the user that it shouldn't have.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Determining Whether a Directory Is Secure
Your application needs to store sensitive information on disk, and you want to ensure that the directory used cannot be modified by any other entity on the system besides the current user and the administrator. That is, you would like a directory where you can modify the contents at will, without having to worry about future permission checks.
Check the entire directory tree above the one you intend to use for unsafe permissions. Specifically, you are looking for the ability for users other than the owner and the superuser (the Administrator account on Windows) to modify the directory. On Windows, the required directory traversal cannot be done without introducing race conditions and a significant amount of complex path processing. The best advice we can offer, therefore, is to consider home directories (typically x:\Documents and Settings\User, where x is the boot drive and User is the user's account name) the safest directories. Never consider using temporary directories to store files that may contain sensitive data.
Storing sensitive data in files requires extra levels of protection to ensure that the data is not compromised. An often overlooked aspect of protection is ensuring that the directories that contain files (which, in turn, contain sensitive data) are safe from modification.
This may appear to be a simple matter of ensuring that the directory is protected against any other users writing to it, but that is not enough. All the directories in the path must also be protected against any other users writing to them. This means that the same user who will own the file containing the sensitive data also owns the directories, and that the directories are all protected against other users modifying them.
The reason for this is that when a directory is writable by a particular user, that user is able to rename directories and files that reside within that directory. For example, suppose that you want to store sensitive data in a file that will be placed into the directory
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Erasing Files Securely
You want to erase a file securely, preventing recovery of any data via "undelete" tools or any inspection of the disk for data that has been left behind.
Write over the data in the file multiple times, varying the data written each time. You should write both random and patterned data for maximum effectiveness.
It is extremely difficult, if not outright impossible, to guarantee that the contents of a file are completely unrecoverable on modern operating systems that offer logging filesystems, virtual memory, and other such features.
Securely deleting files from disk is not as simple as issuing a system call to delete the file from the filesystem. The first problem is that most delete operations do not do anything to the data; they merely delete any underlying metadata that the filesystem uses to associate the file contents with the filename. The storage space where the actual data is stored is then marked free and will be reclaimed whenever the filesystem needs that space.
The result is that to truly erase the data, you need to overwrite it with nonsense before the filesystem delete operation is performed. Many times, this overwriting is implemented by simply zeroing all the bytes in the file. While this will certainly erase the file from the perspective of most conventional utilities, the fact that most data is stored on magnetic media makes this more complicated.
More sophisticated tools can analyze the actual media and reveal the data that was previously stored on it. This type of data recovery has a limit, however. If the data is sufficiently overwritten on the media, it does become unrecoverable, masked by the new data that has overwritten it. A variety of factors, such as the type of data written and the characteristics of the media, determine the point at which the interesting data becomes unrecoverable.
A technique developed by Peter Gutmann provides an algorithm involving multiple passes of data written to the disk to delete a file securely. The passes involve both specific patterns and random data written to the disk. The paper detailing this technique is available from
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Accessing File Information Securely
Y ou need to access information about a file, such as its size or last modification date. In doing so, you want to avoid the possibility of race conditions.
Use a secure directory, as described in Recipe 2.4. Alternatively, open the file and query the needed information using the file handle. Do not use functions that operate on the name of the file, especially if multiple queries are required for the same file or if you intend to open it based on the information obtained from queries. Operating on filenames introduces the possibility of race conditions because filenames can change between calls.
On Unix, use the fstat( ) function instead of the stat( ) function. Both functions return the same information, but fstat( ) uses an open file descriptor while stat( ) uses a filename. Doing so removes the possibility of a race condition, because the file to which the file descriptor points can never change unless you reopen the file descriptor. When operating on just the filename, there is no guarantee that the underlying file pointed to by the filename remains the same after the call to stat( ).
On Windows, use the function GetFileInformationByHandle( ) instead of functions like FindFirstFile( ) or FindFirstFileEx( ). As with fstat( ) versus stat( ) on Unix (which are also available on Windows if you're using the C runtime API), the primary difference between these functions is that one uses a file handle while the others use filenames. If the only information you need is the size of the file, you can use GetFileSize( ) instead of GetFileInformationByHandle( ).
Accessing file information using filenames can lead to race conditions, particularly if multiple queries are necessary or if you intend to open the file depending on information previously obtained. In particular, if symbolic links are involved, an attacker could potentially change the file to which the link points between queries or between the time information is queried and the time the file is actually opened. This type of race condition, known as a Time of Check, Time of Use (TOCTOU) race condition, was also discussed in Recipe 2.3.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Restricting Access Permissions for New Files on Unix
You want to restrict the initial access permissions assigned to a file created by your program.
On Unix, the operating system stores a value known as the umask for each process it uses when creating new files on behalf of the process. The umask is used to disable permission bits that may be specified by the system call used to create files.
Remember that umasks apply only on file or directory creation. Calls to chmod( ) and fchmod( ) are not modified by umask settings.
When a process creates a new file, it specifies the access permissions to assign the new file as a parameter to the system call that creates the file. The operating system modifies the access permissions by computing the intersection of the inverse of the umask and the permissions requested by the process. The access permission bits that remain after the intersection is computed are what the operating system actually uses for the new file. In other words, in the following example code, if the variable requested_permissions contained the permissions passed to the operating system to create a new file, the variable actual_permissions would be the actual permissions that the operating system would use to create the file.
requested_permissions = 0666;
actual_permissions = requested_permissions & ~umask(  );
A process inherits the value of its umask from its parent process when the process is created. Normally, the shell sets a default umask of either 022 (disable group- and world-writable bits) or 02 (disable world-writable bits) when a user logs in, but users have free reign to change the umask as they want. Many users are not even aware of the existence of umasks, never mind how to set them appropriately. Therefore, the umask value as set by the user should never be trusted to be appropriate.
When using the open( ) system call to create a new file, you can force more restrictive permissions to be used than what the user's umask might allow, but the only way to create a file with less restrictive permissions is either to modify the umask before creating the file or to use
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Locking Files
You want to lock files (or portions of them) to prevent two or more processes from accessing them simultaneously.
Two basic types of locks exist: advisory and mandatory. Unix supports both advisory and, to an extremely limited extent, mandatory locks, while Windows supports only mandatory locks.
In the following sections, we will look at the different issues for Unix and Windows.

Section 2.8.3.1: Locking files on Unix

All modern Unix variants support advisory locks. An advisory lock is a lock in which the operating system does not enforce the lock. Instead, programs sharing the same file must cooperate with each other to ensure that locks are properly observed. From a security perspective, advisory locks are of little use because any program is free to perform any action on a file regardless of the state of any advisory locks that other programs may hold on the file.
Support for mandatory locks varies greatly from one Unix variant to another. Both Linux and Solaris support mandatory locks, but Darwin, FreeBSD, NetBSD, and OpenBSD do not, even though they export the interface used by Linux and Solaris to support them. On such systems, this interface creates advisory locks.
Support for mandatory locking does not extend to NFS. In other words, both Linux and Solaris are capable only of using mandatory locks on local filesystems. Further, Linux requires that filesystems be mounted with support for mandatory locking, which is disabled by default. In the end, Solaris is really the only Unix variant on which you can reasonably expect mandatory locking to work, and even then, relying on mandatory locks is like playing with fire.
As if the story for mandatory locking on Unix were not bad enough already, it gets worse. To be able to use mandatory locks on a file, the file must have the setgid bit enabled and the group execute bit disabled in its permissions. Even if a process holds a mandatory lock on a file, another process may remove the setgid bit from the file's permissions, which effectively turns the mandatory lock into an advisory lock!
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Synchronizing Resource Access Across Processes on Unix
You want to ensure that two processes cannot simultaneously access the same resource, such as a segment of shared memory.
Use a lock file to signal that you are accessing the resource.
Using a lock file to synchronize access to shared resources is not as simple as it sounds. Suppose that your program creates a lock file and then crashes. If this happens, the lock file will remain, and your program (as well as any other program that attempted to obtain the lock) will fail until someone manually removes the lock file. Obviously, this is undesirable. The solution is to store the process ID of the process holding the lock in the lock file. Other processes attempting to obtain the lock can then test to see whether the process holding the lock still exists. If it does not, the lock file is stale, it is safe to remove, and you can make another attempt to obtain the lock.
Unfortunately, this solution is still not a perfect one. What happens if another process is assigned the same ID as the one stored in the stale lock file? The answer to this question is simply that no process can obtain the lock until the process with the stale ID terminates or someone manually removes the lock file. Fortunately, this case should not be encountered frequently.
As a result of solving the stale lock problem, a new problem arises: there is now a race condition between the time the check for the existence of the process holding the lock is performed and the time the lock file is removed. The solution to this problem is to attempt to reopen the lock file after writing the new one to make sure that the process ID in the lock file is the same as the locking process's ID. If it is, the lock is successfully obtained.
The function presented below, spc_lock_file( ) , requires a single argument: the name of the file to be used as the lock file. You must store the lock file in a "safe" directory (see Recipe 2.4) on a local filesystem. Network filesystems—versions of NFS older than Version 3 in particular—may not necessarily support the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Synchronizing Resource Access Across Processes on Windows
You want to ensure that two processes cannot simultaneously access the same resource.
Use a named mutex (mutually exclusive lock) to synchronize access to the resource.
Coordinating access to a shared resource between multiple processes on Windows is much simpler and much more elegant than it is on Unix. For maximum portability on Unix, you must use a lock file and make sure to avoid a number of possible race conditions to make lock files work properly. On Windows, however, the use of named mutexes solves all the problems Unix has without introducing new ones.
A named mutex is a synchronization object that works by allowing only a single thread to acquire a lock at any given time. Mutexes can also exist without a name, in which case they are considered anonymous. Access to an anonymous mutex can only be obtained by somehow acquiring a handle to the object from the thread that created it. Anonymous mutexes are of no use to us in this recipe, so we won't discuss them further.
Mutexes have a namespace much like that of a filesystem. The mutex namespace is separate from namespaces used by all other objects. If two or more applications agree on a name for a mutex, access to the mutex can always be obtained to use it for synchronizing access to a shared resource.
A mutex is created with a call to the CreateMutex( ) function. You will find it particularly useful in this recipe that the mutex is created and a handle returned, or, if the mutex already exists, a handle to the existing mutex is returned.
Once we have a handle to the mutex that will be used for synchronization, using it is a simple matter of waiting for the mutex to enter the signaled state. When it does, we obtain the lock, and other processes wait for us to release it. When we are finished using the resource, we simply release the lock, which places the mutex into the signaled state.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Creating Files for Temporary Use
You need to create a file to use as scratch space that may contain sensitive data.
Generate a random filename and attempt to create the file, failing if the file already exists. If the file cannot be created because it already exists, repeat the process until it succeeds. If creating the file fails for any other reason, abort the process.
When creating temporary files, you should consider using a known-safe directory to store them, as described in Recipe 2.4.
The need for temporary files is common. More often than not, other processes have no need to access the temporary files you create, and especially if the files contain sensitive data, it is best to do everything possible to ensure that other processes cannot access them. It is also important that temporary files do not remain on the filesystem any longer than necessary. If the program creating temporary files terminates unexpectedly before it cleans up the files, temporary directories often become littered with files of no interest or value to anyone or anything. Worse, if the temporary files contain sensitive data, they are suddenly both interesting and valuable to an attacker.

Section 2.11.3.1: Temporary files on Unix

The best solution for creating a temporary file on Unix is to use the mkstemp( ) function in the standard C runtime library. This function generates a random filename, attempts to create it, and repeats the whole process until it is successful, thus guaranteeing that a unique file is created. The file created by mkstemp( ) will be readable and writable by the owner, but not by anyone else.
To help further ensure that the file cannot be accessed by any other process, and to be sure that the file will not be left behind by your program if it should terminate unexpectedly before being able to delete it, the file can be deleted by name while it is open immediately after
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Restricting Filesystem Access on Unix
You want to restrict your program's ability to access important parts of the filesystem.
Unix systems provide a system call known as chroot( ) that will restrict the process's access to the filesystem. Specifically, chroot( ) alters a process's perception of the filesystem by changing its root directory, which effectively prevents the process from accessing any part of the filesystem above the new root directory.
Normally, a process's root directory is the actual system root directory, which allows the process to access any part of the filesystem. However, by using the chroot( ) system call, a process can alter its view of the filesystem by changing its root directory to another directory within the filesystem. Once the process's root directory has been changed once, it can only be made more restrictive. It is not possible to change the process's root directory to another directory outside of its current view of the filesystem.
Using chroot( ) is a simple way to increase security for processes that do not require access to the filesystem outside of a directory or hierarchy of directories containing its data files. If an attacker is somehow able to compromise the program and gain access to the filesystem, the potential for damage (whether it is reading sensitive data or destroying data) is localized to the restricted directory hierarchy imposed by altering the process's root directory.
Unfortunately, one often overlooked caveat applies to using chroot( ). The first time that chroot( ) is called, it does not necessarily alter the process's current directory, which means that until the current directory is forcibly changed, it may still be possible to access areas of the filesystem outside the new root directory structure. It is therefore imperative that the process calling chroot( ) immediately change its current directory to a directory within the new root directory structure. This is easily accomplished as follows:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Restricting Filesystem and Network Access on FreeBSD
Your program runs primarily (if not exclusively) on FreeBSD, and you want to impose restrictions on your program's filesystem and network capabilities that are above and beyond what chroot( ) can do. (See Recipe 2.12.)
FreeBSD implements a system call known as jail( ) , which will "imprison" a process and its descendants. It does all that chroot( ) does and more.
Ordinarily, a jail is constructed on FreeBSD by the system administrator using the jail program, which is essentially a wrapper around the jail( ) system call. (Discounting comments and blank lines, the code is a mere 35 lines.) However, it is possible to use the jail( ) system call in your own programs.
The FreeBSD jail does everything that chroot( ) does, and then some. It restricts much of the superuser's normal abilities, and it restricts the IP address that programs running inside the jail may use.
Creating a jail is as simple as filling in a data structure with the appropriate information and calling jail( ). The same caveats that apply to chroot( ) also apply to jail( ) because jail( ) calls chroot( ) internally. In particular, only the superuser may create a jail successfully.
Presently, the jail configuration structure contains only four fields: version, path, hostname, and ip_number. The version field must be set to 0, and the path field is treated the same as chroot( )'s argument is. The hostname field sets the hostname of the jail; however, it is possible to change it from within the jail.
The ip_number field is the IP address to which processes running within the jail are restricted. Processes within the jail will only be able to bind to this address regardless of what other IP addresses are assigned to the system. In addition, all IP traffic emanating from processes within the jail will be forced to use this address as its source.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Input Validation
Eavesdropping attacks are often easy to launch, but most people don't worry about them in their applications. Instead, they tend to worry about what malicious things can be done on the machine on which the application is running. Most people are far more worried about active attacks than they about passive attacks.
Pretty much every active attack out there is the result of some kind of input from an attacker. Secure programming is largely about making sure that inputs from bad people do not do bad things. Indeed, most of this book addresses how to deal with malicious inputs. For example, cryptography and a strong authentication protocol can help prevent attackers from capturing someone else's login credentials and sending those credentials as input to the program.
If this entire book focuses primarily on preventing malicious inputs, why do we have a chapter specifically devoted to this topic? It's because this chapter is about one important class of defensive techniques: input validation.
In this chapter, we assume that people are connected to our software, and that some of them may send malicious data (even if we think there is a trusted client on the other end). One question we really care about is this: "What does our application do with that data?" In particular, does the program take data that should be untrusted and do something potentially security-critical with it? More importantly, can any untrusted data be used to manipulate the application or the underlying system in a way that has security implications?
You have data coming into your application, and you would like to filter or reject data that might be malicious.
Perform data validation at all levels whenever possible. At the very least, make sure data is filtered on input.
Match constructs that are known to be valid and harmless. Reject anything else.
In addition, be sure to be skeptical about any data coming from a potentially insecure channel. In a client-server architecture, for example, even if you wrote the client, the server should never assume it is talking to a trusted client.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Understanding Basic Data Validation Techniques
You have data coming into your application, and you would like to filter or reject data that might be malicious.
Perform data validation at all levels whenever possible. At the very least, make sure data is filtered on input.
Match constructs that are known to be valid and harmless. Reject anything else.
In addition, be sure to be skeptical about any data coming from a potentially insecure channel. In a client-server architecture, for example, even if you wrote the client, the server should never assume it is talking to a trusted client.
Applications should not trust any external input. We have often seen situations in which people had a custom client-server application and the application developer assumed that, because the client was written in house by trusted, strong coders, there was nothing to worry about in terms of malicious data being injected.