
Linux Device Drivers
Linux Device Drivers, Second Edition

By Alessandro Rubini, Jonathan Corbet



Table of Contents

Chapter 1: An Introduction to Device Drivers
As the popularity of the Linux system continues to grow, the interest in writing Linux device drivers steadily increases. Most of Linux is independent of the hardware it runs on, and most users can be (happily) unaware of hardware issues. But, for each piece of hardware supported by Linux, somebody somewhere has written a driver to make it work with the system. Without device drivers, there is no functioning system.
Device drivers take on a special role in the Linux kernel. They are distinct "black boxes" that make a particular piece of hardware respond to a well-defined internal programming interface; they completely hide the details of how the device works. User activities are performed by means of a set of standardized calls that are independent of the specific driver; mapping those calls to device-specific operations that act on real hardware is then the role of the device driver. This programming interface is such that drivers can be built separately from the rest of the kernel, and "plugged in" at runtime when needed. This modularity makes Linux drivers easy to write, to the point that there are now hundreds of them available.
There are a number of reasons to be interested in the writing of Linux device drivers. The rate at which new hardware becomes available (and obsolete!) alone guarantees that driver writers will be busy for the foreseeable future. Individuals may need to know about drivers in order to gain access to a particular device that is of interest to them. Hardware vendors, by making a Linux driver available for their products, can add the large and growing Linux user base to their potential markets. And the open source nature of the Linux system means that if the driver writer wishes, the source to a driver can be quickly disseminated to millions of users.
This book will teach you how to write your own drivers and how to hack around in related parts of the kernel. We have taken a device-independent approach; the programming techniques and interfaces are presented, whenever possible, without being tied to any specific device. Each driver is different; as a driver writer, you will need to understand your specific device well. But most of the principles and basic techniques are the same for all drivers. This book cannot teach you about your device, but it will give you a handle on the background you need to make your device work.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Role of the Device Driver
As a programmer, you will be able to make your own choices about your driver, choosing an acceptable trade-off between the programming time required and the flexibility of the result. Though it may appear strange to say that a driver is "flexible," we like this word because it emphasizes that the role of a device driver is providing mechanism, not policy.
The distinction between mechanism and policy is one of the best ideas behind the Unix design. Most programming problems can indeed be split into two parts: "what capabilities are to be provided" (the mechanism) and "how those capabilities can be used" (the policy). If the two issues are addressed by different parts of the program, or even by different programs altogether, the software package is much easier to develop and to adapt to particular needs.
For example, Unix management of the graphic display is split between the X server, which knows the hardware and offers a unified interface to user programs, and the window and session managers, which implement a particular policy without knowing anything about the hardware. People can use the same window manager on different hardware, and different users can run different configurations on the same workstation. Even completely different desktop environments, such as KDE and GNOME, can coexist on the same system. Another example is the layered structure of TCP/IP networking: the operating system offers the socket abstraction, which implements no policy regarding the data to be transferred, while different servers are in charge of the services (and their associated policies). Moreover, a server like ftpd provides the file transfer mechanism, while users can use whatever client they prefer; both command-line and graphic clients exist, and anyone can write a new user interface to transfer files.
Where drivers are concerned, the same separation of mechanism and policy applies. The floppy driver is policy free—its role is only to show the diskette as a continuous array of data blocks. Higher levels of the system provide policies, such as who may access the floppy drive, whether the drive is accessed directly or via a filesystem, and whether users may mount filesystems on the drive. Since different environments usually need to use hardware in different ways, it's important to be as policy free as possible.
Additional content appearing in this section has been removed.
Splitting the Kernel
In a Unix system, several concurrent processes attend to different tasks. Each process asks for system resources, be it computing power, memory, network connectivity, or some other resource. The kernel is the big chunk of executable code in charge of handling all such requests. Though the distinction between the different kernel tasks isn't always clearly marked, the kernel's role can be split, as shown in Figure 1-1, into the following parts:
Figure 1-1: A split view of the kernel
Process management
The kernel is in charge of creating and destroying processes and handling their connection to the outside world (input and output). Communication among different processes (through signals, pipes, or interprocess communication primitives) is basic to the overall system functionality and is also handled by the kernel. In addition, the scheduler, which controls how processes share the CPU, is part of process management. More generally, the kernel's process management activity implements the abstraction of several processes on top of a single CPU or a few of them.
Memory management
The computer's memory is a major resource, and the policy used to deal with it is a critical one for system performance. The kernel builds up a virtual addressing space for any and all processes on top of the limited available resources. The different parts of the kernel interact with the memory-management subsystem through a set of function calls, ranging from the simple malloc/free pair to much more exotic functionalities.
Filesystems
Additional content appearing in this section has been removed.
Classes of Devices and Modules
The Unix way of looking at devices distinguishes between three device types. Each module usually implements one of these types, and thus is classifiable as a char module, a block module, or a network module. This division of modules into different types, or classes, is not a rigid one; the programmer can choose to build huge modules implementing different drivers in a single chunk of code. Good programmers, nonetheless, usually create a different module for each new functionality they implement, because decomposition is a key element of scalability and extensibility.
The three classes are the following:
Character devices
A character (char) device is one that can be accessed as a stream of bytes (like a file); a char driver is in charge of implementing this behavior. Such a driver usually implements at least the open, close, read, and write system calls. The text console (/dev/console) and the serial ports (/dev/ttyS0 and friends) are examples of char devices, as they are well represented by the stream abstraction. Char devices are accessed by means of filesystem nodes, such as /dev/tty1 and /dev/lp0. The only relevant difference between a char device and a regular file is that you can always move back and forth in the regular file, whereas most char devices are just data channels, which you can only access sequentially. There exist, nonetheless, char devices that look like data areas, and you can move back and forth in them; for instance, this usually applies to frame grabbers, where the applications can access the whole acquired image using mmap or lseek.
Block devices
Like char devices, block devices are accessed by filesystem nodes in the
Additional content appearing in this section has been removed.
Security Issues
Security is an increasingly important concern in modern times. We will discuss security-related issues as they come up throughout the book. There are a few general concepts, however, that are worth mentioning now.
Security has two faces, which can be called deliberate and incidental. One security problem is the damage a user can cause through the misuse of existing programs, or by incidentally exploiting bugs; a different issue is what kind of (mis)functionality a programmer can deliberately implement. The programmer has, obviously, much more power than a plain user. In other words, it's as dangerous to run a program you got from somebody else from the root account as it is to give him or her a root shell now and then. Although having access to a compiler is not a security hole per se, the hole can appear when compiled code is actually executed; everyone should be careful with modules, because a kernel module can do anything. A module is just as powerful as a superuser shell.
Any security check in the system is enforced by kernel code. If the kernel has security holes, then the system has holes. In the official kernel distribution, only an authorized user can load modules; the system call create_module checks if the invoking process is authorized to load a module into the kernel. Thus, when running an official kernel, only the superuser, or an intruder who has succeeded in becoming privileged, can exploit the power of privileged code.
When possible, driver writers should avoid encoding security policy in their code. Security is a policy issue that is often best handled at higher levels within the kernel, under the control of the system administrator. There are always exceptions, however. As a device driver writer, you should be aware of situations in which some types of device access could adversely affect the system as a whole, and should provide adequate controls. For example, device operations that affect global resources (such as setting an interrupt line) or that could affect other users (such as setting a default block size on a tape drive) are usually only available to sufficiently privileged users, and this check must be made in the driver itself.
Additional content appearing in this section has been removed.
Version Numbering
Before digging into programming, we'd like to comment on the version numbering scheme used in Linux and which versions are covered by this book.
First of all, note that every software package used in a Linux system has its own release number, and there are often interdependencies across them: you need a particular version of one package to run a particular version of another package. The creators of Linux distributions usually handle the messy problem of matching packages, and the user who installs from a prepackaged distribution doesn't need to deal with version numbers. Those who replace and upgrade system software, on the other hand, are on their own. Fortunately, almost all modern distributions support the upgrade of single packages by checking interpackage dependencies; the distribution's package manager generally will not allow an upgrade until the dependencies are satisfied.
To run the examples we introduce during the discussion, you won't need particular versions of any tool but the kernel; any recent Linux distribution can be used to run our examples. We won't detail specific requirements, because the file Documentation/Changes in your kernel sources is the best source of such information if you experience any problem.
As far as the kernel is concerned, the even-numbered kernel versions (i.e., 2.2.x and 2.4.x) are the stable ones that are intended for general distribution. The odd versions (such as 2.3.x), on the contrary, are development snapshots and are quite ephemeral; the latest of them represents the current status of development, but becomes obsolete in a few days or so.
This book covers versions 2.0 through 2.4 of the kernel. Our focus has been to show all the features available to device driver writers in 2.4, the current version at the time we are writing. We also try to cover 2.2 thoroughly, in those areas where the features differ between 2.2 and 2.4. We also note features that are not available in 2.0, and offer workarounds where space permits. In general, the code we show is designed to compile and run on a wide range of kernel versions; in particular, it has all been tested with version 2.4.4, and, where applicable, with 2.2.18 and 2.0.38 as well.
Additional content appearing in this section has been removed.
License Terms
Linux is licensed with the GNU General Public License (GPL), a document devised for the GNU project by the Free Software Foundation. The GPL allows anybody to redistribute, and even sell, a product covered by the GPL, as long as the recipient is allowed to rebuild an exact copy of the binary files from source. Additionally, any software product derived from a product covered by the GPL must, if it is redistributed at all, be released under the GPL.
The main goal of such a license is to allow the growth of knowledge by permitting everybody to modify programs at will; at the same time, people selling software to the public can still do their job. Despite this simple objective, there's a never-ending discussion about the GPL and its use. If you want to read the license, you can find it in several places in your system, including the directory /usr/src/linux, as a file called COPYING.
Third-party and custom modules are not part of the Linux kernel, and thus you're not forced to license them under the GPL. A module uses the kernel through a well-defined interface, but is not part of it, similar to the way user programs use the kernel through system calls. Note that the exemption to GPL licensing applies only to modules that use only the published module interface. Modules that dig deeper into the kernel must adhere to the "derived work" terms of the GPL.
In brief, if your code goes in the kernel, you must use the GPL as soon as you release the code. Although personal use of your changes doesn't force the GPL on you, if you distribute your code you must include the source code in the distribution—people acquiring your package must be allowed to rebuild the binary at will. If you write a module, on the other hand, you are allowed to distribute it in binary form. However, this is not always practical, as modules should in general be recompiled for each kernel version that they will be linked with (as explained in Chapter 2, in Section 2.2.1, and Chapter 11, in Section 11.3). New kernel releases—even minor stable releases—often break compiled modules, requiring a recompile. Linus Torvalds has stated publicly that he has no problem with this behavior, and that binary modules should be expected to work only with the kernel under which they were compiled. As a module writer, you will generally serve your users better by making source available.
Additional content appearing in this section has been removed.
Joining the Kernel Development Community
As you get into writing modules for the Linux kernel, you become part of a larger community of developers. Within that community, you can find not only people engaged in similar work, but also a group of highly committed engineers working toward making Linux a better system. These people can be a source of help, of ideas, and of critical review as well—they will be the first people you will likely turn to when you are looking for testers for a new driver.
The central gathering point for Linux kernel developers is the linux-kernel mailing list. All major kernel developers, from Linus Torvalds on down, subscribe to this list. Please note that the list is not for the faint of heart: traffic as of this writing can run up to 200 messages per day or more. Nonetheless, following this list is essential for those who are interested in kernel development; it also can be a top-quality resource for those in need of kernel development help.
To join the linux-kernel list, follow the instructions found in the linux-kernel mailing list FAQ: http://www.tux.org/lkml. Please read the rest of the FAQ while you are at it; there is a great deal of useful information there. Linux kernel developers are busy people, and they are much more inclined to help people who have clearly done their homework first.
Additional content appearing in this section has been removed.
Overview of the Book
From here on, we enter the world of kernel programming. Chapter 2 introduces modularization, explaining the secrets of the art and showing the code for running modules. Chapter 3 talks about char drivers and shows the complete code for a memory-based device driver that can be read and written for fun. Using memory as the hardware base for the device allows anyone to run the sample code without the need to acquire special hardware.
Debugging techniques are vital tools for the programmer and are introduced in Chapter 4. Then, with our new debugging skills, we move to advanced features of char drivers, such as blocking operations, the use of select, and the important ioctl call; these topics are the subject of Chapter 5.
Before dealing with hardware management, we dissect a few more of the kernel's software interfaces: Chapter 6 shows how time is managed in the kernel, and Chapter 7 explains memory allocation.
Next we focus on hardware. Chapter 8 describes the management of I/O ports and memory buffers that live on the device; after that comes interrupt handling, in Chapter 9. Unfortunately, not everyone will be able to run the sample code for these chapters, because some hardware support is actually needed to test the software interface to interrupts. We've tried our best to keep required hardware support to a minimum, but you still need to put your hands on the soldering iron to build your hardware "device." The device is a single jumper wire that plugs into the parallel port, so we hope this is not a problem.
Chapter 10 offers some additional suggestions about writing kernel software and about portability issues.
In the second part of this book, we get more ambitious; thus, Chapter 11 starts over with modularization issues, going deeper into the topic.
Chapter 12 then describes how block drivers are implemented, outlining the aspects that differentiate them from char drivers. Following that, Chapter 13 explains what we left out from the previous treatment of memory management:
Additional content appearing in this section has been removed.
Chapter 2: Building and Running Modules
It's high time now to begin programming. This chapter introduces all the essential concepts about modules and kernel programming. In these few pages, we build and run a complete module. Developing such expertise is an essential foundation for any kind of modularized driver. To avoid throwing in too many concepts at once, this chapter talks only about modules, without referring to any specific device class.
All the kernel items (functions, variables, header files, and macros) that are introduced here are described in a reference section at the end of the chapter.
For the impatient reader, the following code is a complete "Hello, World" module (which does nothing in particular). This code will compile and run under Linux kernel versions 2.0 through 2.4.
 


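#define MODULE
#include <linux/module.h>

/* Entry point run when the module is loaded; returning 0 reports success. */
int init_module(void)  { printk("<1>Hello, world\n"); return 0; }
/* Entry point run just before the module is unloaded. */
void cleanup_module(void) { printk("<1>Goodbye cruel world\n"); }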

The printk function is defined in the Linux kernel and behaves similarly to the standard C library function printf. The kernel needs its own printing function because it runs by itself, without the help of the C library. The module can call printk because, after insmod has loaded it, the module is linked to the kernel and can access the kernel's public symbols (functions and variables, as detailed in the next section). The string <1> is the priority of the message. We've specified a high priority (low cardinal number) in this module because a message with the default priority might not show on the console, depending on the kernel version you are running, the version of the klogd daemon, and your configuration. You can ignore this issue for now; we'll explain it in Section 4.1.1 in Chapter 4.
You can test the module by calling insmod and rmmod, as shown in the screen dump in the following paragraph. Note that only the superuser can load and unload a module.
The source file shown earlier can be loaded and unloaded as shown only if the running kernel has module version support disabled; however, most distributions preinstall versioned kernels (versioning is discussed in Section 11.3 in Chapter 11). Although older
Additional content appearing in this section has been removed.
Kernel Modules Versus Applications
Before we go further, it's worth underlining the various differences between a kernel module and an application.
Whereas an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its "main" function terminates immediately. In other words, the task of the function init_module (the module's entry point) is to prepare for later invocation of the module's functions; it's as though the module were saying, "Here I am, and this is what I can do." The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, "I'm not there anymore; don't ask me to do anything else." The ability to unload a module is one of the features of modularization that you'll most appreciate, because it helps cut down development time; you can test successive versions of your new driver without going through the lengthy shutdown/reboot cycle each time.
As a programmer, you know that an application can call functions it doesn't define: the linking stage resolves external references using the appropriate library of functions. printf is one of those callable functions and is defined in libc. A module, on the other hand, is linked only to the kernel, and the only functions it can call are the ones exported by the kernel; there are no libraries to link to. The printk function used in hello.c earlier, for example, is the version of printf defined within the kernel and exported to modules. It behaves similarly to the original function, with a few minor differences, the main one being lack of floating-point support.
Figure 2-1 shows how function calls and function pointers are used in a module to add new functionality to a running kernel.
Figure 2-1: Linking a module to the kernel
Because no library is linked to modules, source files should
Additional content appearing in this section has been removed.
Compiling and Loading
The rest of this chapter is devoted to writing a complete, though typeless, module. That is, the module will not belong to any of the classes listed in Section 1.3 in Chapter 1. The sample driver shown in this chapter is called skull, short for Simple Kernel Utility for Loading Localities. You can reuse the skull source to load your own local code to the kernel, after removing the sample functionality it offers.
Before we deal with the roles of init_module and cleanup_module, however, we'll write a makefile that builds object code that the kernel can load.
First, we need to define the __KERNEL__ symbol in the preprocessor before we include any headers. As mentioned earlier, much of the kernel-specific content in the kernel headers is unavailable without this symbol.
Another important symbol is MODULE, which must be defined before including <linux/module.h> (except for drivers that are linked directly into the kernel). This book does not cover directly linked modules; thus, the MODULE symbol is always defined in our examples.
If you are compiling for an SMP machine, you also need to define __SMP__ before including the kernel headers. In version 2.2, the "multiprocessor or uniprocessor" choice was promoted to a proper configuration item, so using these lines as the very first lines of your modules will do the task:
 #include <linux/config.h>
 #ifdef CONFIG_SMP
 # define __SMP__
 #endif
A module writer must also specify the -O flag to the compiler, because many functions are declared as inline in the header files. gcc doesn't expand inline functions unless optimization is enabled, but it can accept both the -g and -O options, allowing you to debug code that uses inline functions. Because the kernel makes extensive use of inline functions, it is important that they be expanded properly.
You may also need to check that the compiler you are running matches the kernel you are compiling against, referring to the file
Additional content appearing in this section has been removed.
The Kernel Symbol Table
We've seen how insmod resolves undefined symbols against the table of public kernel symbols. The table contains the addresses of global kernel items—functions and variables—that are needed to implement modularized drivers. The public symbol table can be read in text form from the file /proc/ksyms (assuming, of course, that your kernel has support for the /proc filesystem—which it really should).
When a module is loaded, any symbol exported by the module becomes part of the kernel symbol table, and you can see it appear in /proc/ksyms or in the output of the ksyms command.
New modules can use symbols exported by your module, and you can stack new modules on top of other modules. Module stacking is implemented in the mainstream kernel sources as well: the msdos filesystem relies on symbols exported by the fat module, and each input USB device module stacks on the usbcore and input modules.
Module stacking is useful in complex projects. If a new abstraction is implemented in the form of a device driver, it might offer a plug for hardware-specific implementations. For example, the video-for-linux set of drivers is split into a generic module that exports symbols used by lower-level device drivers for specific hardware. According to your setup, you load the generic video module and the specific module for your installed hardware. Support for parallel ports and the wide variety of attachable devices is handled in the same way, as is the USB kernel subsystem. Stacking in the parallel port subsystem is shown in Figure 2-2; the arrows show the communications between the modules (with some example functions and data structures) and with the kernel programming interface.
Figure 2-2: Stacking of parallel port driver modules
Additional content appearing in this section has been removed.
Initialization and Shutdown
As already mentioned, init_module registers any facility offered by the module. By facility, we mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.
Modules can register many different types of facilities; for each facility, there is a specific kernel function that accomplishes this registration. The arguments passed to the kernel registration functions are usually a pointer to a data structure describing the new facility and the name of the facility being registered. The data structure usually embeds pointers to module functions, which is how functions in the module body get called.
The items that can be registered exceed the list of device types mentioned in Chapter 1. They include serial ports, miscellaneous devices, /proc files, executable domains, and line disciplines. Many of those registrable items support functions that aren't directly related to hardware but remain in the "software abstractions" field. Those items can be registered because they are integrated into the driver's functionality anyway (like /proc files and line disciplines, for example).
There are other facilities that can be registered as add-ons for certain drivers, but their use is so specific that it's not worth talking about them; they use the stacking technique, as described earlier in Section 2.3. If you want to probe further, you can grep for EXPORT_SYMBOL in the kernel sources and find the entry points offered by different drivers. Most registration functions are prefixed with register_, so another possible way to find them is to grep for register_ in /proc/ksyms.
If any errors occur when you register utilities, you must undo any registration activities performed before the failure. An error can happen, for example, if there isn't enough memory in the system to allocate a new data structure or because a resource being requested is already being used by other drivers. Though unlikely, it might happen, and good program code must be prepared to handle this event.
Additional content appearing in this section has been removed.
Using Resources
A module can't accomplish its task without using system resources such as memory, I/O ports, I/O memory, and interrupt lines, as well as DMA channels if you use old-fashioned DMA controllers like the Industry Standard Architecture (ISA) one.
As a programmer, you are already accustomed to managing memory allocation; writing kernel code is no different in this regard. Your program obtains a memory area using kmalloc and releases it using kfree. These functions behave like malloc and free, except that kmalloc takes an additional argument, the priority. Usually, a priority of GFP_KERNEL or GFP_USER will do. The GFP acronym stands for "get free page." (Memory allocation is covered in detail in Chapter 7.)
Beginning driver programmers may initially be surprised at the need to allocate I/O ports, I/O memory, and interrupt lines explicitly. After all, it is possible for a kernel module to simply access these resources without telling the operating system about it. Although system memory is anonymous and may be allocated from anywhere, I/O memory, ports, and interrupts have very specific roles. For instance, a driver needs to be able to allocate the exact ports it needs, not just some ports. But drivers cannot just go about making use of these system resources without first ensuring that they are not already in use elsewhere.
The job of a typical driver is, for the most part, writing and reading I/O ports and I/O memory. Access to I/O ports and I/O memory (collectively called I/O regions) happens both at initialization time and during normal operations.
Unfortunately, not all bus architectures offer a clean way to identify I/O regions belonging to each device, and sometimes the driver must guess where its I/O regions live, or even probe for the devices by reading and writing to "possible" address ranges. This problem is especially acute on the ISA bus, which is still used for simple devices that plug into a personal computer and is very popular in the industrial world in its PC/104 implementation (see Section 15.3 in Chapter 15).
Automatic and Manual Configuration
Several parameters that a driver needs to know can change from system to system. For instance, the driver must know the hardware's actual I/O addresses, or memory range (this is not a problem with well-designed bus interfaces and only applies to ISA devices). Sometimes you'll need to pass parameters to a driver to help it in finding its own device or to enable/disable specific features.
Depending on the device, there may be other parameters in addition to the I/O address that affect the driver's behavior, such as device brand and release number. It's essential for the driver to know the value of these parameters in order to work correctly. Setting up the driver with the correct values (i.e., configuring it) is one of the tricky tasks that need to be performed during driver initialization.
Basically, there are two ways to obtain the correct values: either the user specifies them explicitly or the driver autodetects them. Although autodetection is undoubtedly the best approach to driver configuration, user configuration is much easier to implement. A suitable trade-off for a driver writer is to implement automatic configuration whenever possible, while allowing user configuration as an option to override autodetection. An additional advantage of this approach to configuration is that the initial development can be done without autodetection, by specifying the parameters at load time, and autodetection can be implemented later.
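The trade-off just described (autodetect when possible, but let the user override) can be sketched as a small decision function. This is an illustration only: choose_io_base, the probe callback type, and fake_probe with its 0x300 result are hypothetical names and values, and a value of 0 is assumed to mean "not specified by the user."

```c
#include <stddef.h>

/* Hypothetical probe callback: returns the detected I/O base,
 * or 0 if no device was found. */
typedef unsigned long (*probe_fn)(void);

/* Pick the I/O base for a device. A nonzero user_base (for example,
 * a value given at load time) always wins; otherwise fall back to
 * autodetection. Returns 0 if neither source yields an address. */
unsigned long choose_io_base(unsigned long user_base, probe_fn probe)
{
    if (user_base != 0)
        return user_base;      /* explicit configuration overrides */
    if (probe != NULL)
        return probe();        /* autodetection */
    return 0;                  /* nothing known */
}

/* Stand-in for real probing code; a real driver would poke at the
 * hardware here. The address is made up for the example. */
unsigned long fake_probe(void)
{
    return 0x300;
}
```

Writing the code this way also supports the development order suggested above: the probe argument can be left NULL at first and filled in once autodetection is implemented.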
Many drivers also have configuration options that control other aspects of their operation. For example, drivers for SCSI adapters often have options controlling the use of tagged command queuing, and the Integrated Device Electronics (IDE) drivers allow user control of DMA operations. Thus, even if your driver relies entirely on autodetection to locate hardware, you may want to make other configuration options available to the user.
Parameter values can be assigned at load time by insmod or modprobe
Doing It in User Space
A Unix programmer who's addressing kernel issues for the first time might well be nervous about writing a module. Writing a user program that reads and writes directly to the device ports is much easier.
Indeed, there are some arguments in favor of user-space programming, and sometimes writing a so-called user-space device driver is a wise alternative to kernel hacking.
The advantages of user-space drivers can be summarized as follows:
  • The full C library can be linked in. The driver can perform many exotic tasks without resorting to external programs (the utility programs implementing usage policies that are usually distributed along with the driver itself).
  • The programmer can run a conventional debugger on the driver code without having to go through contortions to debug a running kernel.
  • If a user-space driver hangs, you can simply kill it. Problems with the driver are unlikely to hang the entire system, unless the hardware being controlled is really misbehaving.
  • User memory is swappable, unlike kernel memory. An infrequently used device with a huge driver won't occupy RAM that other programs could be using, except when it is actually in use.
  • A well-designed driver program can still allow concurrent access to a device.
An example of a user-space driver is the X server: it knows exactly what the hardware can do and what it can't, and it offers the graphic resources to all X clients. Note, however, that there is a slow but steady drift toward frame-buffer-based graphics environments, where the X server acts only as a server based on a real kernel-space device driver for actual graphic manipulation.
Usually, the writer of a user-space driver implements a server process, taking over from the kernel the task of being the single agent in charge of hardware control. Client applications can then connect to the server to perform actual communication with the device; a smart driver process can thus allow concurrent access to the device. This is exactly how the X server works.
Backward Compatibility
The Linux kernel is a moving target—many things change over time as new features are developed. The interface that we have described in this chapter is that provided by the 2.4 kernel; if your code needs to work on older releases, you will need to take various steps to make that happen.
This is the first of many "backward compatibility" sections in this book. At the end of each chapter we'll cover the things that have changed since version 2.0 of the kernel, and what needs to be done to make your code portable.
For starters, the KERNEL_VERSION macro was introduced in kernel 2.1.90. The sysdep.h header file contains a replacement for kernels that need it.
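As a rough sketch of how such a replacement can be used, the fragment below supplies KERNEL_VERSION when it is missing and then tests the running version at compile time. The fallback packs the three numbers the same way <linux/version.h> does. Two things are assumptions made only to keep the fragment self-contained: LINUX_VERSION_CODE is set by hand here (a real module gets it from <linux/version.h>), and HAVE_NEW_RESOURCE_API is a hypothetical flag name, not something defined by sysdep.h.

```c
/* Fallback for headers that lack the macro: major in bits 16 and up,
 * minor in bits 8..15, patch level in bits 0..7. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Normally provided by <linux/version.h>; hard-coded to 2.2.14 here
 * purely so the sketch compiles outside a kernel tree. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 2, 14)
#endif

/* Hypothetical feature flag chosen at compile time. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 4, 0)
#define HAVE_NEW_RESOURCE_API 0
#else
#define HAVE_NEW_RESOURCE_API 1
#endif
```

Because the comparison happens in the preprocessor, code for the unsupported case is never compiled, so it may freely refer to functions that do not exist in the kernel being targeted.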
The new resource management scheme brings in a few portability problems if you want to write a driver that can run with kernel versions older than 2.4. This section discusses the portability problems you'll encounter and how the sysdep.h header tries to hide them.
The most apparent change brought about by the new resource management code is the addition of request_mem_region and related functions. Their role is limited to accessing the I/O memory database, without performing specific operations on any hardware. With earlier kernels, therefore, you can simply not call the functions. The sysdep.h header accomplishes that by defining the functions as macros that return 0 for kernels earlier than 2.4.
Another difference between 2.4 and earlier kernel versions is in the actual prototypes of request_region and related functions.
Kernels earlier than 2.4 declared both request_region and release_region as functions returning void (thus forcing the use of check_region beforehand). The new implementation, more correctly, has functions that return a pointer value so that an error condition can be signaled (thus making
Quick Reference
This section summarizes the kernel functions, variables, macros, and /proc files that we've touched on in this chapter. It is meant to act as a reference. Each item is listed after the relevant header file, if any. A similar section appears at the end of every chapter from here on, summarizing the new symbols introduced in the chapter.
__KERNEL__
MODULE
Preprocessor symbols, which must both be defined to compile modularized kernel code.
__SMP__
A preprocessor symbol that must be defined when compiling modules for symmetric multiprocessor systems.
int init_module(void);
void cleanup_module(void);
Module entry points, which must be defined in the module object file.
#include <linux/init.h>
module_init(init_function);
module_exit(cleanup_function);
The modern mechanism for marking a module's initialization and cleanup functions.
#include <linux/module.h>
Required header. It must be included by a module source.
MOD_INC_USE_COUNT;
Chapter 3: Char Drivers
The goal of this chapter is to write a complete char device driver. We'll develop a character driver because this class is suitable for most simple hardware devices. Char drivers are also easier to understand than, for example, block drivers or network drivers. Our ultimate aim is to write a modularized char driver, but we won't talk about modularization issues in this chapter.
Throughout the chapter, we'll present code fragments extracted from a real device driver: scull, short for Simple Character Utility for Loading Localities. scull is a char driver that acts on a memory area as though it were a device. A side effect of this behavior is that, as far as scull is concerned, the word device can be used interchangeably with "the memory area used by scull."
The advantage of scull is that it isn't hardware dependent, since every computer has memory. scull just acts on some memory, allocated using kmalloc. Anyone can compile and run scull, and scull is portable across the computer architectures on which Linux runs. On the other hand, the device doesn't do anything "useful" other than demonstrating the interface between the kernel and char drivers and allowing the user to run some tests.
The Design of scull
The first step of driver writing is defining the capabilities (the mechanism) the driver will offer to user programs. Since our "device" is part of the computer's memory, we're free to do what we want with it. It can be a sequential or random-access device, one device or many, and so on.
To make scull useful as a template for writing real drivers for real devices, we'll show you how to implement several device abstractions on top of the computer memory, each with a different personality.
The scull source implements the following devices. Each kind of device implemented by the module is referred to as a type:
scull0 to scull3
Four devices each consisting of a memory area that is both global and persistent. Global means that if the device is opened multiple times, the data contained within the device is shared by all the file descriptors that opened it. Persistent means that if the device is closed and reopened, data isn't lost. This device can be fun to work with, because it can be accessed and tested using conventional commands such as cp, cat, and shell I/O redirection; we'll examine its internals in this chapter.
scullpipe0 to scullpipe3
Four FIFO (first-in-first-out) devices, which act like pipes. One process reads what another process writes. If multiple processes read the same device, they contend for data. The internals of scullpipe will show how blocking and nonblocking read and write can be implemented without having to resort to interrupts. Although real drivers synchronize with their devices using hardware interrupts, the topic of blocking and nonblocking operations is an important one and is separate from interrupt handling (covered in Chapter 9).
Major and Minor Numbers
Char devices are accessed through names in the filesystem. Those names are called special files or device files or simply nodes of the filesystem tree; they are conventionally located in the /dev directory. Special files for char drivers are identified by a "c" in the first column of the output of ls -l. Block devices appear in /dev as well, but they are identified by a "b." The focus of this chapter is on char devices, but much of the following information applies to block devices as well.
If you issue the ls -l command, you'll see two numbers (separated by a comma) in the device file entries before the date of last modification, where the file length normally appears. These numbers are the major device number and minor device number for the particular device. The following listing shows a few devices as they appear on a typical system. Their major numbers are 1, 4, 7, and 10, while the minors are 1, 3, 5, 64, 65, and 129.
 crw-rw-rw- 1 root   root    1, 3   Feb 23 1999  null
 crw------- 1 root   root   10, 1   Feb 23 1999  psaux
 crw------- 1 rubini tty     4, 1   Aug 16 22:22 tty1
 crw-rw-rw- 1 root   dialout 4, 64  Jun 30 11:19 ttyS0
 crw-rw-rw- 1 root   dialout 4, 65  Aug 16 00:00 ttyS1
 crw------- 1 root   sys     7, 1   Feb 23 1999  vcs1
 crw------- 1 root   sys     7, 129 Feb 23 1999  vcsa1
 crw-rw-rw- 1 root   root    1, 5   Feb 23 1999  zero
The major number identifies the driver associated with the device. For example, /dev/null and /dev/zero are both managed by driver 1, whereas virtual consoles and serial terminals are managed by driver 4; similarly, both vcs1 and vcsa1 devices are managed by driver 7. The kernel uses the major number at open time to dispatch execution to the appropriate driver.
The minor number is used only by the driver specified by the major number; other parts of the kernel don't use it, and merely pass it along to the driver. It is common for a driver to control several devices (as shown in the listing); the minor number provides a way for the driver to differentiate among them.
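The same two numbers that ls -l prints can be read from a program. The sketch below works from user space with stat and the C library's major and minor macros; these are not the kernel's own macros, and the helper name device_numbers is invented for the example.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major() and minor() in the C library */

/* Look up the device numbers of a special file.
 * Returns 0 and fills *maj and *mnr if path names a char or block
 * device; returns -1 if stat fails or path is not a device node. */
int device_numbers(const char *path, unsigned int *maj, unsigned int *mnr)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
        return -1;
    /* st_rdev holds the numbers of the device the node refers to. */
    *maj = major(st.st_rdev);
    *mnr = minor(st.st_rdev);
    return 0;
}
```

Called on /dev/null on a typical Linux system, the helper should report major 1 and minor 3, matching the first line of the listing above.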
File Operations
In the next few sections, we'll look at the various operations a driver can perform on the devices it manages. An open device is identified internally by a file structure, and the kernel uses the file_operations structure to access the driver's functions. The structure, defined in <linux/fs.h>, is essentially a table of function pointers. Each file is associated with its own set of functions (by including a field called f_op that points to a file_operations structure). The operations are mostly in charge of implementing the system calls and are thus named open, read, and so on. We can consider the file to be an "object" and the functions operating on it to be its "methods," using object-oriented programming terminology to denote actions declared by an object to act on itself. This is the first sign of object-oriented programming we see in the Linux kernel, and we'll see more in later chapters.
Conventionally, a file_operations structure or a pointer to one is called fops (or some variation thereof); we've already seen one such pointer as an argument to the register_chrdev call. Each field in the structure must point to the function in the driver that implements a specific operation, or be left NULL for unsupported operations. The exact behavior of the kernel when a NULL pointer is specified is different for each function, as the list later in this section shows.
The file_operations structure has been slowly getting bigger as new functionality is added to the kernel. The addition of new operations can, of course, create portability problems for device drivers. Instantiations of the structure in each driver used to be declared using standard C syntax, and new operations were normally added to the end of the structure; a simple recompilation of the drivers would place a NULL value for that operation, thus selecting the default behavior, usually what you wanted.
Since then, kernel developers have switched to a "tagged" initialization format that allows initialization of structure fields by name, thus circumventing most problems with changed data structures. The tagged initialization, however, is not standard C but a (useful) extension specific to the GNU compiler. We will look at an example of tagged structure initialization shortly.
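To see why naming the fields helps, consider this miniature user-space model. It is not the kernel's file_operations: struct toy_ops, its four fields, their signatures, and the toy_ functions are all hypothetical. The sketch uses the .field = spelling that C99 later standardized; kernel sources of the 2.4 era typically wrote the older GNU form field: value, which means the same thing.

```c
#include <stddef.h>

/* Hypothetical miniature of an operations table. The real structure
 * in <linux/fs.h> has many more fields and different signatures. */
struct toy_ops {
    long (*read)(char *buf, unsigned long count);
    long (*write)(const char *buf, unsigned long count);
    int  (*open)(void);
    int  (*release)(void);
};

/* Fill the buffer with zero bytes and report how many were "read". */
long toy_read(char *buf, unsigned long count)
{
    unsigned long i;

    for (i = 0; i < count; i++)
        buf[i] = 0;
    return (long)count;
}

int toy_open(void)
{
    return 0;
}

/* Fields are initialized by name, so their order here is free and
 * every field that is not mentioned is left NULL. */
struct toy_ops toy_fops = {
    .open = toy_open,
    .read = toy_read,
};

/* Dispatch a write through the table. In this sketch a NULL method
 * simply means "not supported" and yields -1. */
long toy_do_write(const struct toy_ops *ops, const char *buf,
                  unsigned long count)
{
    if (ops->write == NULL)
        return -1;
    return ops->write(buf, count);
}
```

If a new pointer were later added in the middle of struct toy_ops, the initializer for toy_fops would still compile and still mean the same thing, which is the portability benefit described above.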
The file Structure
struct file, defined in <linux/fs.h>, is the second most important data structure used in device drivers. Note that a file has nothing to do with the FILEs of user-space programs. A FILE is defined in the C library and never appears in kernel code. A struct file, on the other hand, is a kernel structure that never appears in user programs.
The file structure represents an open file. (It is not specific to device drivers; every open file in the system has an associated struct file in kernel space.) It is created by the kernel on open and is passed to any function that operates on the file, until the last close. After all instances of the file are closed, the kernel releases the data structure. An open file is different from a disk file, represented by struct inode.
In the kernel sources, a pointer to struct file is usually called either file or filp ("file pointer"). We'll consistently call the pointer filp to prevent ambiguities with the structure itself. Thus, file refers to the structure and filp to a pointer to the structure.
The most important fields of struct file are shown here. As in the previous section, the list can be skipped on a first reading. In the next section though, when we face some real C code, we'll discuss some of the fields, so they are here for you to refer to.
mode_t f_mode;
The file mode identifies the file as either readable or writable (or both), by means of the bits FMODE_READ and FMODE_WRITE. You might want to check this field for read/write permission in your ioctl function, but you don't need to check permissions for read and write because the kernel checks before invoking your method. An attempt to write without permission, for example, is rejected without the driver even knowing about it.
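A check of this kind might look like the following user-space sketch. TOY_FMODE_READ and TOY_FMODE_WRITE are local stand-ins for the kernel's FMODE_READ and FMODE_WRITE bits, defined here only so the fragment compiles outside the kernel, and toy_ioctl_check_write is a hypothetical helper rather than part of any driver.

```c
#include <errno.h>

/* Local stand-ins for the kernel's FMODE_READ / FMODE_WRITE bits. */
#define TOY_FMODE_READ  1
#define TOY_FMODE_WRITE 2

/* Hypothetical guard for an ioctl command that changes device state:
 * allow it only if the file was opened for writing.
 * Returns 0 when permitted, -EPERM otherwise. */
int toy_ioctl_check_write(unsigned int f_mode)
{
    if (f_mode & TOY_FMODE_WRITE)
        return 0;
    return -EPERM;
}
```

A driver's ioctl method would run a test like this before acting on any command that modifies the device, since the kernel performs no such check on its behalf for ioctl.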
open and release
Now that we've taken a quick look at the fields, we'll start using them in real scull functions.
The open method is provided for a driver to do any initialization in preparation for later operations. In addition, open usually increments the usage count for the device so that the module won't be unloaded before the file is closed. The count, described in Section 2.4.2 in Chapter 2, is then decremented by the release method.
In most drivers, open should perform the following tasks:
  • Increment the usage count
  • Check for device-specific errors (such as device-not-ready or similar hardware problems)
  • Initialize the device, if it is being opened for the first time
  • Identify the minor number and update the f_op pointer, if necessary
  • Allocate and fill any data structure to be put in filp->private_data
In scull, most of the preceding tasks depend on the minor number of the device being opened. Therefore, the first thing to do is identify which device is involved. We can do that by looking at inode->i_rdev.
We've already talked about how the kernel doesn't use the minor number of the device, so the driver is free to use it at will. In practice, different minor numbers are used to access different devices or to open the same device in a different way. For example, /dev/st0 (minor number 0) and /dev/st1 (minor 1) refer to different SCSI tape drives, whereas /dev/nst0 (minor 128) is the same physical device as /dev/st0, but it acts differently (it doesn't rewind the tape when it is closed). All of the tape device files have different minor numbers, so that the driver can tell them apart.
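The tape example suggests how a driver can pack more than one piece of information into a minor number. The sketch below assumes, based only on the numbers quoted above (0, 1, and 128), that the top bit of the minor selects "no rewind" and the remaining bits select the drive. The real tape driver may assign further meaning to some bits, and toy_decode_minor and its structure are names invented for this illustration.

```c
/* Assumption drawn from the minors above: 128 marks "no rewind". */
#define TOY_NOREWIND_BIT 0x80

struct toy_tape_id {
    unsigned int unit;      /* which physical drive */
    int rewind_on_close;    /* nonzero: rewind when the file is closed */
};

/* Split a minor number into the drive it names and the way that
 * drive should behave on close. */
struct toy_tape_id toy_decode_minor(unsigned int minor_nr)
{
    struct toy_tape_id id;

    id.unit = minor_nr & 0x7f;
    id.rewind_on_close = (minor_nr & TOY_NOREWIND_BIT) == 0;
    return id;
}
```

Under this scheme minors 0 and 128 decode to the same unit with different close behavior, while minor 1 decodes to a different unit.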
scull's Memory Usage
Before introducing the read and write operations, we'd better look at how and why scull performs memory allocation. "How" is needed to thoroughly understand the code, and "why" demonstrates the kind of choices a driver writer needs to make, although scull is definitely not typical as a device.
This section deals only with the memory allocation policy in scull and doesn't show the hardware management skills you'll need to write real drivers. Those skills are introduced in Chapter 8 and Chapter 9. Therefore, you can skip this section if you're not interested in understanding the inner workings of the memory-oriented scull driver.
The region of memory used by scull, also called a device here, is variable in length. The more you write, the more it grows; trimming is performed by overwriting the device with a shorter file.
The implementation chosen for scull is not a smart one. The source code for a smart implementation would be more difficult to read, and the aim of this section is to show read and write, not memory management. That's why the code just uses kmalloc and kfree without resorting to allocation of whole pages, although that would be more efficient.
On the flip side, we didn't want to limit the size of the "device" area, for both a philosophical reason and a practical one. Philosophically, it's always a bad idea to put arbitrary limits on data items being managed. Practically, scull can be used to temporarily eat up your system's memory in order to run tests under low-memory conditions. Running such tests might help you understand the system's internals. You can use the command cp /dev/zero /dev/scull0 to eat all the real RAM with scull, and you can use the dd utility to choose how much data is copied to the scull device.
In scull, each device is a linked list of pointers, each of which points to a Scull_Dev structure. Each such structure can refer, by default, to at most four million bytes, through an array of intermediate pointers. The released source uses an array of 1000 pointers to areas of 4000 bytes. We call each memory area a
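With the default sizes quoted above (arrays of 1000 pointers to areas of 4000 bytes, hence four million bytes per list item), finding the byte that corresponds to a given offset is a matter of division and remainder. The following user-space sketch shows that arithmetic only; toy_locate, struct toy_pos, and the TOY_ constants are hypothetical names, and no memory is actually allocated.

```c
#define TOY_AREA_SIZE 4000UL   /* bytes per memory area, as in the text */
#define TOY_ARRAY_LEN 1000UL   /* pointers per array, as in the text */

struct toy_pos {
    unsigned long item;     /* index of the list item */
    unsigned long slot;     /* index into that item's pointer array */
    unsigned long byte;     /* offset inside the selected memory area */
};

/* Translate a byte offset within the device into list item,
 * array slot, and offset within the memory area. */
struct toy_pos toy_locate(unsigned long offset)
{
    const unsigned long item_size = TOY_AREA_SIZE * TOY_ARRAY_LEN;
    unsigned long rest = offset % item_size;
    struct toy_pos p;

    p.item = offset / item_size;      /* whole items skipped */
    p.slot = rest / TOY_AREA_SIZE;    /* whole areas skipped in the item */
    p.byte = rest % TOY_AREA_SIZE;    /* what is left over */
    return p;
}
```

For instance, offset 4,012,345 lies one full item (4,000,000 bytes) plus three full areas (12,000 bytes) plus 345 bytes into the device, so it decodes to item 1, slot 3, byte 345.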
A Brief Introduction to Race Conditions