Linux Device Drivers
By Alessandro Rubini
February 1998
Pages: 439

Table of Contents

Chapter 1: An Introduction to the Linux Kernel
People all around the world are delving into the Linux kernel, mostly to write device drivers. While each driver is different, and you have to know your specific device, many principles and basic techniques are the same from one driver to another. In this book, you'll learn to write your own device drivers and to hack around in related parts of the kernel. This book covers device-independent programming techniques, without binding the examples to any specific device.
This chapter doesn't actually get into writing code. However, I'm going to introduce some background concepts about the Linux kernel that you'll be glad you know later, when we do launch into writing code.
As you learn to write drivers, you will find out a lot about the Linux kernel in general; this may help you understand how your machine works and why things aren't always as fast as you expect or don't do quite what you want. We'll introduce new ideas smoothly, starting off with very simple drivers and building upon them; every new concept will be accompanied by sample code that doesn't need special hardware to be tested.
The Role of the Driver Writer
As a programmer, you will be able to make your own choices about your driver, choosing an acceptable tradeoff between the programming time required and the flexibility of the result. Though it may appear strange to say that a driver is ``flexible,'' I like this word because it emphasizes that the role of a device driver is providing mechanisms, not policies.
The distinction between mechanism and policy is one of the best ideas behind the Unix design. Most programming problems can indeed be split into two parts: ``what needs to be done'' (the mechanism) and ``how can the program be used'' (the policy). If the two issues are addressed by different parts of the program, or even by different programs altogether, the software package is much easier to develop and to adapt to particular needs.
For example, Unix management of the graphic display is split between the X server, which knows the hardware and offers a unified interface to user programs, and the window manager, which implements a particular policy without knowing anything about the hardware. People can use the same window manager on different hardware, and different users can run different configurations on the same workstation. Another example is the layered structure of TCP/IP networking: the operating system offers the socket abstraction, which is policy-free, while different servers are in charge of the services. Moreover, a server like ftpd provides the file transfer mechanism, while users can use whatever client they prefer; both command-line and graphic clients exist, and anyone can write a new user interface to transfer files.
Where drivers are concerned, the same role-splitting applies. The floppy driver is policy-free—its role is only to show the diskette as a continuous byte array. How to use the device is the role of the application: tar writes it sequentially, while mkfs prepares the device to be mounted, and mcopy relies on the existence of a specific data structure on the device.
Splitting the Kernel
In a Unix system, several concurrent processes attend to different tasks. Each process asks for system resources, be it computing power, memory, network connectivity, or some other resource. The kernel is the big chunk of executable code in charge of handling all such requests. Though the distinction between the different kernel tasks isn't always clearly marked, the kernel's role can be split, as shown in Figure 1.1, into the following parts:
Figure 1.1: A split view of the kernel
Process management
The kernel is in charge of creating and destroying processes, and handling their connection to the outside world (input and output). Communication among different processes (through signals, pipes, or interprocess communication primitives) is basic to the overall system functionality, and is also handled by the kernel. In addition, the scheduler, probably the most critical routine in the whole operating system, is part of process management. More generally, the kernel's process management activity implements the abstraction of several processes on top of a single CPU.
Memory management
The computer's memory is a major resource, and the policy used to deal with it is a critical one for system performance. The kernel builds up a virtual addressing space for any and all processes on top of the limited available resources. The different parts of the kernel interact with the memory-management subsystem through a set of function calls, ranging from simple malloc/free equivalents to much more exotic functionalities.
Filesystems
Classes of Devices and Modules
The Unix way of looking at devices distinguishes between three device types, each devoted to a different task. Linux can load each device type in the form of a module, thus allowing users to experiment with new hardware while still being able to run up-to-date kernel versions and to follow development.
As far as modules are concerned, each module usually implements only one driver, and thus is classifiable, for example, as a char module, or a block module. This division of modules into different types, or classes, is not a rigid one; the programmer can choose to build huge modules implementing different drivers in a single chunk of code. Good programmers, nonetheless, usually create a different module for each new functionality they implement.
Going back to devices, the three flavors are the following:
Character devices
A character (char) device is one that can be accessed like a file, and a char driver is in charge of implementing this behavior. Such a driver usually implements the open, close, read, and write system calls. The console and the parallel ports are examples of char devices, as they are well represented by the stream abstraction. Char devices are accessed by means of filesystem nodes, such as /dev/tty1 and /dev/lp1. The only relevant difference between a char device and a regular file is that you can always step back and forth in the regular file, while most char devices are just a data channel, which you can only access sequentially. There exist, nonetheless, char devices that look like a data area, and you can step back and forth in them.
Block devices
A block device is something that can host a filesystem, such as a disk. In most Unix systems, a block device can only be accessed as multiples of a block, where a block is usually one kilobyte of data. Linux allows you to read and write a block device like a char device--it permits the transfer of any number of bytes at a time. As a result, block and char devices differ only in the way data is managed internally by the kernel, and thus in the kernel/driver software interface. Like a char device, each block device is accessed through a filesystem node and the difference between them is transparent to the user. A block driver interfaces with the kernel through the same interface as a char driver, as well as through an additional block-oriented interface that is invisible to the user or application.
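Since both kinds of device appear as filesystem nodes, a user program can tell them apart only by asking the filesystem. The following is a minimal user-space sketch, not kernel code; it assumes a POSIX system, and the names classify_node and enum node_kind are made up for this illustration.

```c
#include <sys/stat.h>

/* Possible results of classify_node(); the names are illustrative only. */
enum node_kind { NODE_MISSING, NODE_CHAR, NODE_BLOCK, NODE_OTHER };

/*
 * Report whether a filesystem node is a char device, a block device,
 * or something else. Uses only the standard stat() call.
 */
enum node_kind classify_node(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return NODE_MISSING;        /* no such node, or not accessible */
    if (S_ISCHR(st.st_mode))
        return NODE_CHAR;
    if (S_ISBLK(st.st_mode))
        return NODE_BLOCK;
    return NODE_OTHER;              /* regular file, directory, and so on */
}
```

On a typical Linux installation, passing /dev/null should report a char device, while a disk node should report a block device; which nodes exist depends on the system.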
Security Issues
Talking about security issues is fashionable these days, and most programmers are concerned about their systems' security, so I'll address the problem at the beginning to avoid later misunderstandings.
Security has two faces. One problem is what a user can achieve through the misuse of existing programs, or by exploiting bugs; a different issue is what kind of (mis)functionality a programmer can implement. The programmer has, obviously, much more power than a plain user. In other words, it's more dangerous to run as root a program you got from a friend than to give him or her a root shell once in a while. Although having access to a compiler is not a security hole per se, the hole can appear when compiled code is actually executed; be careful with modules, because a kernel module can do anything. A module is much more powerful than a superuser shell, in that its privileged status is acknowledged by the CPU.
Any security check in the system is enforced by kernel code. If the kernel has security holes, then the system has holes. In the official kernel distribution, only root can load modules; the system call create_module checks the user ID of the invoking process. Thus with the official kernel, only the superuser, or an intruder who has succeeded in becoming root, can exploit the power of privileged code.
Fortunately, when writing a device driver or other module, there's little need to be concerned about security because processes accessing the device are already constrained by more general blocking techniques. With block devices, for example, security is handled by the permissions on the filesystem node and the mount command, so usually nothing has to be checked in the actual block driver.
Be careful, however, when receiving software from third parties, especially when the kernel is concerned: since everybody has access to the source code, everybody can break and recompile things. While you can trust precompiled kernels found in your distribution, you should avoid running kernels compiled by an untrusted friend—if you wouldn't run a precompiled binary as root, then you'd better not run a precompiled kernel. For example, a maliciously modified kernel could allow anyone to load a module, thus opening an unexpected back door via
Version Numbering
As the last point before digging in to programming, I'd like to comment on the unusual version numbering scheme used in Linux and what version this book refers to.
First of all, note that every software package used in a Linux system has its own release number, and there are often interdependencies across them: you need a particular version of one package to run a particular version of another package. The creators of Linux distributions usually handle the messy problem of matching packages, and the user who installs from a prepackaged distribution doesn't need to deal with version numbers. Those who replace and upgrade system software, on the other hand, are on their own. Fortunately, some modern distributions allow the upgrade of single packages by checking interpackage dependencies, and this greatly simplifies things for the user who needs to keep system software up to date.
In this book, I'll assume you have version 2.6.3 or newer of the gcc compiler, version 1.3.57 or newer of the module utilities, and a recent-enough version of the GNU tools (the most important being gmake) for program development. Those requirements aren't particularly strict, as nearly every Linux installation is equipped with GNU tools, and these versions are relatively old (besides, kernel versions 2.0 and later refuse to compile with a gcc older than 2.6). Note that recent kernels include a file called Documentation/Changes, which lists the software needed to proficiently compile and run that kernel version. This file is missing from the 1.2 sources.
As far as the kernel is concerned, I'll concentrate on the 2.0.x and 1.2.13 versions, trying to write code that can work with both of them.
The even-numbered kernel versions (i.e., 1.2.x and 2.0.x) are the stable ones and are intended for general distribution. The odd versions, on the contrary, are development snapshots and are quite ephemeral; the latest of them represents the current status of development, but becomes obsolete in a few days.
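The even/odd rule is easy to state in code. This is a small user-space sketch under the numbering scheme just described; the function name kernel_is_stable is hypothetical, and it only looks at the second number of the version string.

```c
#include <stdio.h>

/*
 * Returns 1 if the version string names a stable series (even second
 * number), 0 if it names a development series (odd second number),
 * and -1 if the string does not start with "major.minor".
 */
int kernel_is_stable(const char *version)
{
    int major, minor;

    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;
    return (minor % 2 == 0) ? 1 : 0;
}
```

With this helper, "1.2.13" and "2.0.30" are classified as stable, while a 2.1 kernel is classified as a development snapshot.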
License Terms
Linux is licensed with the GNU ``General Public License'' (GPL), a document devised for the GNU project by the Free Software Foundation. The GPL allows anybody to redistribute, and also sell, a GPL'd product, as long as the recipient is allowed to rebuild an exact copy of the binary files from source. Additionally, any software product derived from a GPL'd product must be released under the GPL.
The main goal of such a license is to allow the growth of knowledge by permitting everybody to modify programs at will; at the same time, people selling software to the public can still do their job. Despite this simple objective, there's an ongoing discussion about the GPL and its use. If you want to read the license, you can find it in several places in your system, including the directory /usr/src/linux, as a file called COPYING.
As far as third-party and custom modules are concerned, they're not part of the Linux kernel, and thus you're not forced to license them under the GPL. A module uses the kernel through a well-defined interface, but is not part of it, similar to the way user programs use the kernel through system calls.
In brief, if your code goes in the kernel, you must use the GPL as soon as you release the code. Although personal use of your changes doesn't force the GPL on you, if you distribute your code you must include the source code in the distribution: people acquiring your package must be allowed to rebuild the binary at will. If you write a module, on the other hand, you are allowed to distribute it in binary form. However, this is not always practical, as modules should in general be recompiled for each kernel version that they will be linked with (as explained in Section 2.2.1 of Chapter 2 and Section 11.2 of Chapter 11). The common objection to binary distribution of modules is that a module embeds code defined or declared in the kernel headers; this objection doesn't apply, however, because header files are part of the public interface of the kernel, and thus are not subject to licensing.
Overview of the Book
From here on, we enter the world of kernel programming. Chapter 2 introduces modularization, explaining the secrets of the art and showing the code for running modules. Chapter 3 talks about char drivers and shows the complete code for a memory-based device driver that can be read and written for fun. Using memory as the hardware base for the device allows anyone to run the sample code without the need to acquire special hardware.
Debugging techniques are vital tools for the programmer and are introduced in Chapter 4. Then, with our new debugging skills, we'll move to advanced features of char drivers, such as blocking operations, the use of select and the ever-popular ioctl call; these topics are the subject of Chapter 5.
Before dealing with hardware management, we'll dissect a few more of the kernel's software interfaces: Chapter 6 shows how time is managed in the kernel, and Chapter 7 explains memory allocation.
Next we focus on hardware: Chapter 8 describes the management of I/O ports and memory buffers that live on the device; after that comes interrupt handling, in Chapter 9. Unfortunately, not everyone will be able to run the sample code for these chapters, because some hardware support is actually needed to test the software interface to interrupts. I've tried my best to keep required hardware support to a minimum, but you still need to put your hands on the soldering iron to build your hardware ``device.'' The device is a single jumper wire that plugs into the parallel port, so I hope this is not a problem.
Chapter 10 offers some additional suggestions about writing kernel software and about portability issues.
In the second part of this book, we get more ambitious; thus Chapter 11 starts over with modularization issues, going deeper into the topic.
Chapter 12 then describes how block drivers are implemented, outlining the aspects that differentiate them from char drivers. Following that, Chapter 13 explains what we left out from the previous treatment of memory management:
Chapter 2: Building and Running Modules
It's high time now to begin programming. This chapter is going to introduce all the essential concepts about modules and kernel programming. In these few pages, we'll build and run a complete module. Building such expertise is an essential foundation for any kind of modularized driver. To avoid throwing in too many concepts, this chapter only talks about modules, without referring to any device class.
All the kernel items (functions, variables, header files, and macros) that are introduced here are described in a reference section at the end of the chapter.
For the impatient reader, the following code is a complete ``Hello, World'' module (which does nothing in particular). This code will compile and run under Linux 2.0 and later versions, but not under 1.2, as explained later in this chapter.

#define MODULE
#include <linux/module.h>

int init_module(void)      { printk("<1>Hello, world\n"); return 0; }
void cleanup_module(void)  { printk("<1>Goodbye cruel world\n"); }

The printk function is defined in the Linux kernel and behaves similarly to printf; the module can call printk, because after insmod has loaded it, the module is linked to the kernel and can access its symbols. The string <1> is the priority of the message. I've specified a high priority in this module because a message with the default priority might not show on the console if you use version 2.0.x of the kernel and an old klogd daemon (you can ignore this issue for now; we'll explain it in Section 4.1.1 of Chapter 4).
You can test the module by calling insmod and rmmod, as shown in the screen dump below. Note that only the superuser can load and unload a module.
root# gcc -c hello.c
root# insmod hello.o
Hello, world
root# rmmod hello
Goodbye cruel world
root#
As you see, writing a module is easy. We'll go deeper into the topic throughout this chapter.
Modules Versus Applications
Before we go further, it's worth underlining the various differences between a kernel module and an application.
While an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its ``main'' function terminates immediately. In other words, the task of init_module() (the module's entry point) is to prepare for later invocation of the module's functions; it's as though the module is saying, ``Here I am, and this is what I can do.'' The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, ``I'm not there any more, don't ask me to do anything else.'' The ability to unload a module is one of the features of modularization that you'll most appreciate, because it helps cut down development time; you can test successive versions of your new driver without going through the lengthy shutdown/reboot cycle each time.
As a programmer, you know that an application can call functions it doesn't define: the linking stage resolves external references using the appropriate library of functions. printf is one of those callable functions and is defined in libc. A module, on the other hand, is linked only to the kernel, and the only functions it can call are the ones exported by the kernel. The printk function used in hello.c above, for example, is the version of printf defined within the kernel and exported to modules; it behaves exactly like the original function, except that it has no floating-point support.
Figure 2.1 shows how function calls and function pointers are used in a module to add new functionality to a running kernel.
Figure 2.1: Linking a module to the kernel
Since no library is linked to modules, source files should never include the usual header files. Anything related to the kernel is declared in headers found in
Compiling and Loading
The rest of this chapter is devoted to writing a complete, though typeless, module. That is, the module will not belong to any of the classes listed in Section 1.3, in Chapter 1. The sample driver shown in this chapter is called skull, short for "Simple Kernel Utility for Loading Localities." You can reuse the skull source to load your own local code to the kernel, after removing the sample functionality it offers.
Before we deal with the roles of init_module and cleanup_module, however, we'll write a Makefile that builds object code that the kernel can load.
First, we need to define the __KERNEL__ symbol in the preprocessor before we include any headers. This symbol is used to select which parts of the headers are actually used. Applications end up including kernel headers because libc includes them, but the applications don't need all the kernel prototypes. Therefore, __KERNEL__ is used to mask the extra ones out via #ifdef. Exporting kernel symbols and macros to user-space programs would greatly contribute to program namespace pollution. If you are compiling for an SMP (Symmetric Multi-Processor) machine, you also need to define __SMP__ before including the kernel headers. This requirement may seem unfriendly, but is going to disappear as soon as the developers find the right way to be SMP-transparent.
Another important symbol is MODULE, which must be defined before including <linux/module.h>. This symbol is always defined, except when compiling drivers that are directly linked to the kernel image. Since none of the drivers covered in this book are directly linked to the kernel, they all define the symbol.
A module writer must also specify the -O flag to the compiler, because many functions are declared as inline in the header files. gcc doesn't expand inlines unless optimization is enabled, but it can accept both the -g and -O options, allowing you to debug code that uses inline functions.
The Kernel Symbol Table
We've seen how insmod resolves undefined symbols against the table of public kernel symbols. The table contains global kernel items—functions and variables—that are needed to implement modularized drivers. The public symbol table can be read in text form from the file /proc/ksyms.
When your module is loaded, any global symbol you declare becomes part of the kernel symbol table, and you can see it appear in /proc/ksyms or in the output of the ksyms command.
New modules can use symbols exported by your module, and you can stack new modules on top of other modules. Module stacking is implemented in the mainstream kernel sources as well: the msdos filesystem relies on symbols exported by the fat module, and the ppp driver stacks on the header compression module.
Module stacking is useful in complex projects. If a new abstraction is implemented in the form of a device driver, it might offer a plug for hardware-specific implementations. For example, a frame buffer video driver can export symbols to be used by a lower-level VGA driver. Each user loads the frame buffer module and the specific VGA module for his or her installed hardware.
Layered modularization can help reduce development time by simplifying each layer. This is similar to the separation between mechanism and policy that we discussed in Chapter 1.
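To make the idea of a public symbol table concrete, here is a toy user-space model of it. It is only a sketch: the real table lives inside the kernel, and every name below (demo_symbol, demo_export, demo_resolve) is invented for the illustration. Exporting a symbol makes it visible to later lookups, which is the property that lets one module stack on another.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_SYMBOLS 16

/* One entry in a toy public symbol table: a name and its address. */
struct demo_symbol {
    const char *name;
    void *addr;
};

static struct demo_symbol demo_table[DEMO_MAX_SYMBOLS];
static int demo_count;

/* Look a name up; returns its address, or NULL if it is not exported. */
void *demo_resolve(const char *name)
{
    int i;

    for (i = 0; i < demo_count; i++)
        if (strcmp(demo_table[i].name, name) == 0)
            return demo_table[i].addr;
    return NULL;
}

/* Add a symbol; returns 0, or -1 if the table is full or the name exists. */
int demo_export(const char *name, void *addr)
{
    if (demo_count == DEMO_MAX_SYMBOLS || demo_resolve(name) != NULL)
        return -1;
    demo_table[demo_count].name = name;
    demo_table[demo_count].addr = addr;
    demo_count++;
    return 0;
}
```

In this model, a "lower" module calls demo_export at load time, and a module loaded afterward calls demo_resolve to bind its undefined references; a lookup that returns NULL corresponds to an unresolved symbol.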
An alternative to exporting all the global symbols of your module is to use the function register_symtab, which is the official kernel interface to symbol-table management. The programming interface described here applies to the 1.2.13 and 2.0 kernels. See Section 17.1 in Chapter 17 for details about changes introduced in the 2.1 development kernels.
The function register_symtab, as its name suggests, is used to register a whole symbol table in the kernel's main table. This technique is somewhat cleaner than relying on static and global symbols, in that the programmer centralizes the information about what is made available to other modules and what isn't. This is a better approach than scattering
Initialization and Shutdown
As already suggested, init_module registers any facility offered by the module. By ``facility,'' I mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.
Registration of a new facility is performed by calling a kernel function. The arguments passed are usually a pointer to a data structure describing the new facility and the name of the facility being registered. The data structure usually embeds pointers to module functions, which is how functions in the module body get called.
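The registration pattern can be sketched in plain user-space C. This is not the kernel's interface: demo_register, demo_unregister, demo_dispatch, and the "doubler" module are all hypothetical names chosen for the example. What it shows is the shape of the call, a name plus a structure of function pointers, and how the registry later reaches the module's code through those pointers.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FACILITIES 8

/* The data structure describing a facility: pointers into the module. */
struct demo_ops {
    int (*handle)(int request);
};

struct demo_slot {
    const char *name;               /* NULL marks a free slot */
    const struct demo_ops *ops;
};

static struct demo_slot demo_slots[DEMO_MAX_FACILITIES];

static struct demo_slot *demo_find(const char *name)
{
    int i;

    for (i = 0; i < DEMO_MAX_FACILITIES; i++)
        if (demo_slots[i].name && strcmp(demo_slots[i].name, name) == 0)
            return &demo_slots[i];
    return NULL;
}

/* Register a facility; returns 0, or -1 if the name is taken or no slot is free. */
int demo_register(const char *name, const struct demo_ops *ops)
{
    int i;

    if (demo_find(name))
        return -1;
    for (i = 0; i < DEMO_MAX_FACILITIES; i++) {
        if (demo_slots[i].name == NULL) {
            demo_slots[i].name = name;
            demo_slots[i].ops = ops;
            return 0;
        }
    }
    return -1;
}

/* Remove a facility; returns 0, or -1 if it was never registered. */
int demo_unregister(const char *name)
{
    struct demo_slot *slot = demo_find(name);

    if (slot == NULL)
        return -1;
    slot->name = NULL;
    slot->ops = NULL;
    return 0;
}

/* Call into the module through its registered pointer; -1 if unknown. */
int demo_dispatch(const char *name, int request)
{
    struct demo_slot *slot = demo_find(name);

    return slot ? slot->ops->handle(request) : -1;
}

/* A toy "module": it registers at init time and unregisters at cleanup. */
static int doubler_handle(int request) { return request * 2; }
static const struct demo_ops doubler_ops = { doubler_handle };

int doubler_init(void)     { return demo_register("doubler", &doubler_ops); }
void doubler_cleanup(void) { demo_unregister("doubler"); }
```

Note that doubler_init does no work beyond registering: requests are served later, through demo_dispatch, for as long as the registration stays in place.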
In addition to the ``main'' facilities that are used to identify the class of each module (such as char and block drivers), a module can register the following items:
Miscellaneous devices
These were once called mice because this kind of facility was only used by bus-mice. They are spare devices, generally simpler than full-featured ones.
Serial ports
A serial driver can be added to the system at run time; this is how PCMCIA modems are supported.
Line disciplines
The line discipline is a software layer that handles the tty data streams. A module can register a new line discipline to handle tty transactions in a non-standard way. The kmouse module, for example, uses a discipline to steal incoming data from serial mice.
tty drivers
A tty driver is the set of functions that implement low-level data handling over a tty. Both the console and the serial driver register their drivers in order to create terminal devices. Multiport serial ports have their own driver as well.
Using Resources
A module can't accomplish its task without using system resources, such as memory, I/O ports, and interrupt lines, as well as DMA channels if you use the mainboard's DMA controller.
As a programmer, you are already accustomed to managing memory allocation, and writing kernel code is no different in this regard. Your program obtains a memory area using kmalloc and releases it using kfree. These functions behave like malloc and free, except that kmalloc takes an additional argument, the priority. Most of the time, a priority of GFP_KERNEL will do. The GFP acronym stands for ``Get Free Page.''
Requesting I/O ports and interrupt lines, on the other hand, looks strange at first, because normally a programmer accesses them with explicit instructions in the code, without telling the operating system about it. ``Allocating'' ports and interrupts is different from memory allocation in that memory is allocated from a pool, and every address behaves the same; I/O ports have individual roles, and a driver needs to work with specific ports, not just some ports.
The job of a typical driver is, for the most part, writing and reading ports. This is true both at initialization time and during normal work. A device driver must be guaranteed exclusive access to its ports in order to prevent interference from other drivers—if a module probing for its hardware should happen to write to ports owned by another device, weird things would undoubtedly happen.
The developers of Linux chose to implement a request/free mechanism for ports, mainly as a way to prevent collisions between different devices. However, unauthorized port access doesn't produce any error condition equivalent to ``segmentation fault''—the hardware can't enforce port registration.
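A request/free registry of this kind is simple bookkeeping. The sketch below is a user-space model, with hypothetical function names (demo_check_region, demo_request_region, demo_release_region) and an arbitrary table size; it is not the kernel's implementation. Like the mechanism described above, it is purely cooperative: it detects collisions only among callers that bother to ask.

```c
#include <stddef.h>

#define DEMO_MAX_REGIONS 16

/* A claimed range of ports: [from, from + extent), plus the owner's name. */
struct demo_region {
    unsigned int from;
    unsigned int extent;
    const char *name;
};

static struct demo_region demo_regions[DEMO_MAX_REGIONS];
static int demo_nregions;

/* Returns 0 if the range is free, -1 if it overlaps a claimed region. */
int demo_check_region(unsigned int from, unsigned int extent)
{
    int i;

    for (i = 0; i < demo_nregions; i++) {
        const struct demo_region *r = &demo_regions[i];

        if (from < r->from + r->extent && r->from < from + extent)
            return -1;
    }
    return 0;
}

/* Claim a range; returns 0, or -1 if it is empty, taken, or the table is full. */
int demo_request_region(unsigned int from, unsigned int extent, const char *name)
{
    if (extent == 0 || demo_nregions == DEMO_MAX_REGIONS)
        return -1;
    if (demo_check_region(from, extent) != 0)
        return -1;
    demo_regions[demo_nregions].from = from;
    demo_regions[demo_nregions].extent = extent;
    demo_regions[demo_nregions].name = name;
    demo_nregions++;
    return 0;
}

/* Give a range back; it must match a previous request exactly. */
void demo_release_region(unsigned int from, unsigned int extent)
{
    int i;

    for (i = 0; i < demo_nregions; i++) {
        if (demo_regions[i].from == from && demo_regions[i].extent == extent) {
            /* Fill the hole with the last entry; order does not matter. */
            demo_regions[i] = demo_regions[demo_nregions - 1];
            demo_nregions--;
            return;
        }
    }
}
```

A driver following this discipline would check the range before probing, claim it once the hardware is found, and release it at cleanup time.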
Automatic and Manual Configuration
Several parameters that a driver needs to know can change from system to system. For instance, the driver must know the hardware's actual I/O addresses, or memory range.
Note that most of the problems discussed in this section don't apply to PCI devices (described in Chapter 15).
Depending on the device, there may be other parameters in addition to the I/O address that affect the driver's behavior, such as device brand and release number. It's essential for the driver to know the value of these parameters in order to work correctly. Setting up the driver with the correct values (i.e., configuring it) is one of the tricky tasks that need to be performed during driver initialization.
Basically, there are two ways to obtain the correct values: either the user specifies them explicitly or the driver autodetects them. While autodetection is undoubtedly the best approach to driver configuration, user configuration is much easier to implement; a suitable tradeoff for a driver writer is to implement automatic configuration whenever possible, while allowing user configuration as an option to override autodetection. An additional advantage of this approach to configuration is that the initial development can be done without autodetection, by specifying the parameters at load time, and autodetection can be implemented later.
Parameter values can be assigned at load time by insmod, which accepts specification of integer and string values on the command line. The command can modify all the global variables defined in the module. For example, if your source contains the variables:
int skull_ival=0;
char *skull_sval;
then the following command can be used to load the module:
insmod skull skull_ival=666 skull_sval="the beast"
A sample run using printk will show that the assignments are already in effect when init_module gets invoked. Note that insmod can assign a value to any integer or char-pointer variable in the module. It does this for both static and global variables, whether they are part of the public symbol table or not. Strings declared as arrays, on the other hand, can't be assigned at load time because the pointer is resolved at compile time and can't be changed after that.
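The difference between a char pointer and a char array can be seen in a small user-space model of the assignment step. This is a sketch only: sk_assign is a hypothetical stand-in for what the loader does with one name=value argument, and it knows about just the two variables from the example above. An integer is overwritten in place, and a char pointer is made to point at new text. An array has no pointer to overwrite, because its name is a fixed address.

```c
#include <stdlib.h>
#include <string.h>

int   skull_ival = 0;  /* integer parameter, as in the text */
char *skull_sval;      /* char-pointer parameter, as in the text */

/* Hypothetical model of one "name=value" assignment.
 * Returns 0 on success, -1 for an unknown or malformed argument. */
int sk_assign(const char *arg)
{
    const char *eq = strchr(arg, '=');
    size_t len;

    if (eq == NULL)
        return -1;
    len = (size_t)(eq - arg);

    if (len == strlen("skull_ival") && strncmp(arg, "skull_ival", len) == 0) {
        /* Integers are simply overwritten with the parsed value. */
        skull_ival = (int)strtol(eq + 1, NULL, 0);
        return 0;
    }
    if (len == strlen("skull_sval") && strncmp(arg, "skull_sval", len) == 0) {
        /* Only the pointer changes: it is aimed at a fresh copy of the text.
         * (This model never frees an earlier copy.) */
        char *copy = malloc(strlen(eq + 1) + 1);
        if (copy == NULL)
            return -1;
        strcpy(copy, eq + 1);
        skull_sval = copy;
        return 0;
    }
    return -1;
}
```

Calling sk_assign("skull_ival=666") and sk_assign("skull_sval=the beast") before the model's equivalent of init_module leaves both variables set, which mirrors the behavior described above.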
Additional content appearing in this section has been removed.
Doing It in User Space
At this point, a Unix programmer who's addressing kernel issues for the first time might well be nervous about writing a module. Writing a user program that reads and writes directly to the device ports is much easier.
Indeed, there are some arguments in favor of user-space programming, and sometimes writing a so-called ``user-space device driver'' is a wise alternative to kernel hacking.
The advantages of user-space drivers can be summarized as follows:
  • The full C library can be linked in. The driver can perform many exotic tasks without resorting to external programs (the utility programs implementing usage policies that are usually distributed along with the driver itself).
  • A conventional debugger can be run on the driver code, without having to go through contortions to debug a running kernel.
  • If a user-space driver hangs, you can simply kill it. Problems with the driver are unlikely to hang the entire system, unless the hardware being controlled is really misbehaving.
  • User memory is swappable, unlike kernel memory. An infrequently used device with a huge driver won't occupy RAM that other programs could be using, except when it is actually in use.
  • A well-designed driver program can still allow concurrent access to a device.
An example of a user-space driver is the X server: it knows exactly what the hardware can do and what it can't, and it offers the graphic resources to all X clients. The library libsvga is another similar beastie.
Usually, the writer of a user-space driver implements a ``server'' process, taking over from the kernel the task of being ``the single agent in charge of hardware control.'' Client applications can then connect to the server to perform actual communication with the device; a smart driver process can thus allow concurrent access to the device. This is exactly how the X server works.
Additional content appearing in this section has been removed.
Quick Reference
This section summarizes the kernel functions, variables, macros, and /proc files that we've touched on in this chapter. It is meant to act as a reference. Each item is listed after the relevant header file, if any. A similar section appears at the end of every chapter from here on, summarizing the new symbols introduced in the chapter.
__KERNEL__
MODULE
Preprocessor symbols, which must both be defined to compile modularized kernel code.
int init_module(void);
void cleanup_module(void);
Module entry points, which must be defined in the module object file.
#include <linux/module.h>
Required header. It must be included by a module source.
MOD_INC_USE_COUNT;
MOD_DEC_USE_COUNT;
MOD_IN_USE
Macros that act on the usage count.
/proc/modules
The list of currently loaded modules. Entries contain the module name, the amount of memory they occupy, and the usage count. Extra strings are appended to each line to specify flags that are currently active for the module.
int register_symtab(struct symbol_table *);
Function used to specify the set of public symbols in the module. This function doesn't exist any more in Linux 2.1.18 and newer kernels. See Section 17.1 in Chapter 17 for details.
Additional content appearing in this section has been removed.
Chapter 3: Char Drivers
The goal of this chapter is to write a complete char device driver. We'll develop a character driver because this class is suitable for most simple hardware devices. Char drivers are also easier to understand than, for example, block drivers. Our ultimate aim is to write a modularized char driver, but I won't talk about modularization issues in this chapter.
Throughout the chapter, I'll present code fragments extracted from a real device driver: scull, short for "Simple Character Utility for Loading Localities." scull is a char driver that acts on a memory area as though it were a device. A side effect of this behavior is that as far as scull is concerned, the word ``device'' can be used interchangeably with ``the memory area used by scull.''
The advantage of scull is that it isn't hardware dependent, since every computer has memory. scull just acts on some memory, which is allocated using kmalloc. Anyone can compile and run scull, and scull is portable across the computer architectures on which Linux runs. On the other hand, the device doesn't do anything ``useful'' other than demonstrating the interface between the kernel and char drivers and allowing the user to run some tests.
The Design of scull
The first step of driver writing is defining the capabilities (the ``mechanism'') the driver will offer to user programs. Since our ``device'' is part of the computer's memory, we're free to do what we want with it. It can be a sequential or random-access device, one device or many, and so on.
In order for scull to be useful as a template for writing real drivers for real devices, I'll show you how to implement several device abstractions on top of the computer memory, each with a different personality.
The scull source implements the following devices. Each kind of device implemented by the module is referred to as a ``type'':
scull0-3
Four devices consisting of four memory areas that are both global and persistent. ``Global'' means that if the device is opened multiple times, the data is shared by all the file descriptors that opened it. ``Persistent'' means that if the device is closed and reopened, data isn't lost. This device can be fun to work with, because it can be accessed and tested using conventional commands, like cp, cat, and the shell I/O redirection; we'll examine its internals in this chapter.
scullpipe0-3
Four ``fifo'' devices, which act like pipes. One process reads what another process is writing. If more processes read the same device, they contend for data. The internals of scullpipe will show how blocking and nonblocking read and write can be implemented; this happens without having to resort to interrupts. Although real drivers synchronize with their devices using hardware interrupts, the topic of blocking and nonblocking operations is an important one and is conceptually detached from interrupt handling (covered in Chapter 9).
Additional content appearing in this section has been removed.
Major and Minor Numbers
Char devices are accessed through names (or ``nodes'') in the filesystem, usually located in the /dev directory. Device files are special files and are identified by a ``c'' in the first column of the output of ls -l, indicating that they are char nodes. Block devices appear in /dev as well, but they are identified by a ``b''; even if some of the following information applies also to block devices, I am now focusing on char drivers.
If you issue the ls -l command, you'll see two numbers (separated by a comma) on the device file entries before the date of last modification, where the file length normally appears. These numbers are the "major" and "minor" numbers for the particular device. The following listing shows a few devices as they appear on my system. Their major numbers are 10, 1, and 4, while the minors are 0, 3, 5, 64-65, and 128-129.
crw-rw-rw-   1 root     root      10,   3 Nov 30  1993 bmouseatixl
crw-rw-rw-   1 root     sys        1,   3 Nov 30  1993 null
crw-rw-rw-   1 root     root       4, 128 Apr 30 13:02 ptyp0
crw-rw-rw-   1 root     root       4, 129 Apr 30 13:02 ptyp1
crw-rw-rw-   1 rubini   staff      4,   0 Jan 30  1995 tty0
crw-rw-rw-   1 root     tty        4,  64 Jan 25  1995 ttyS0
crw-rw-rw-   1 root     root       4,  65 May  1 00:04 ttyS1
crw-rw-rw-   1 root     sys        1,   5 Nov 30  1993 zero
The major number identifies the driver associated with the device. For example, /dev/null and /dev/zero are both managed by driver 1, while all the tty's and pty's are managed by driver 4. The kernel uses the major number to associate the appropriate driver with its device.
The minor number is only used by the device driver; other parts of the kernel don't use it, and merely pass it along to the driver. It isn't unusual for a driver to control several devices (as in the example above)--the minor number provides a way to differentiate among them.
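The split between the two numbers can be sketched in plain C. The helpers below are hypothetical and assume the 16-bit device number of the 2.0 kernels this book targets, with the major number in the upper eight bits and the minor number in the lower eight. They are a user-space model, not the kernel's own macros.

```c
/* Hypothetical helpers modelling a 16-bit device number:
 * major number in the high byte, minor number in the low byte. */
#define SK_MINORBITS 8
#define SK_MINORMASK ((1u << SK_MINORBITS) - 1)

typedef unsigned short sk_dev_t;

/* Pack a major/minor pair into one device number. */
sk_dev_t sk_mkdev(unsigned int major, unsigned int minor)
{
    return (sk_dev_t)((major << SK_MINORBITS) | (minor & SK_MINORMASK));
}

/* The major number selects the driver. */
unsigned int sk_major(sk_dev_t dev)
{
    return (unsigned int)dev >> SK_MINORBITS;
}

/* The minor number is interpreted by the driver alone. */
unsigned int sk_minor(sk_dev_t dev)
{
    return dev & SK_MINORMASK;
}
```

With the listing above, ttyS0 would be sk_mkdev(4, 64): sk_major returns 4, which picks the driver, and sk_minor returns 64, which the driver uses to pick the port.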
Adding a new driver to the system means assigning a major number to it. The assignment should be made at driver (module) initialization by calling the following function, defined in
Additional content appearing in this section has been removed.
File Operations
In the next few sections, we'll look at the various operations a driver can perform on the devices it manages. The device is identified internally by a file structure, and the kernel uses the file_operations structure to access the driver's functions. This design is the first evidence we've seen of the object-oriented design of the Linux kernel. We'll see more evidence of object-oriented design later. The structure file_operations is a table of function pointers, defined in <linux/fs.h>. The file structure itself is described later in this chapter.
The fops pointer, which we've already seen as an argument to the register_chrdev call, points to a table of operations (open, read, and so on). Each entry in the table points to the function defined by the driver to handle the requested operation. The table can contain NULL pointers for the operations you don't support. The exact behavior of the kernel when a NULL pointer is specified is different for each function, as the list in the next section shows.
The file_operations structure has been slowly getting bigger as new functionality is added to the kernel (although no new fields were added between 1.2.0 and 2.0.x). There should be no side effects from this increase, because the C compiler takes care of any size mismatch by zero-filling uninitialized fields in global or static struct variables. New items are added at the end of the structure, so a NULL value inserted at compile time will select the default behavior (remember that the module needs to be recompiled in any case for each new kernel version it will be loaded into).
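The zero-filling rule is ordinary C and can be checked outside the kernel. The sketch below uses a cut-down, hypothetical table (struct sk_fops, with invented prototypes) rather than the real file_operations. Only the first two slots are initialized, so the compiler sets the remaining ones to NULL, and a small dispatcher falls back to a default when a method is missing.

```c
#include <stddef.h>

struct sk_file;

/* Hypothetical, cut-down operations table. The real file_operations
 * has more fields and different prototypes. */
struct sk_fops {
    long (*lseek)(struct sk_file *, long, int);
    long (*read)(struct sk_file *, char *, long);
    long (*write)(struct sk_file *, const char *, long);
    int  (*ioctl)(struct sk_file *, unsigned int, unsigned long);
};

struct sk_file {
    long pos;                    /* current position */
    const struct sk_fops *f_op;  /* methods for this open file */
};

/* Toy read method: pretends to transfer count bytes. */
static long sk_read(struct sk_file *filp, char *buf, long count)
{
    (void)buf;
    filp->pos += count;
    return count;
}

/* Only two slots are listed; write and ioctl are zero-filled to NULL. */
struct sk_fops sk_ops = {
    NULL,     /* lseek: not implemented, use the default */
    sk_read,  /* read */
};

/* Dispatcher: a NULL method selects a default behavior. In this model
 * the default accepts only absolute, non-negative offsets (whence 0). */
long sk_do_lseek(struct sk_file *filp, long off, int whence)
{
    if (filp->f_op->lseek)
        return filp->f_op->lseek(filp, off, whence);
    if (whence != 0 || off < 0)
        return -1;
    filp->pos = off;
    return filp->pos;
}
```

If a field were appended to struct sk_fops later, sk_ops would compile unchanged and the new slot would also be NULL, which is the property the paragraph above relies on.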
A few of the function prototypes associated with fops fields actually changed slightly during 2.1 development. These differences are covered in Section 17.2, in Chapter 17.
The following list introduces all the operations that an application can invoke on a device. These operations are often called ``methods,'' using object-oriented programming terminology to denote actions declared by an object to act on itself.
Additional content appearing in this section has been removed.
The file Structure
struct file, defined in <linux/fs.h>, is the second most important data structure used in device drivers. Note that a file has nothing to do with the FILEs of user-space programs. A FILE is defined in the C library and never appears in kernel code. A struct file, on the other hand, is a kernel structure that never appears in user programs.
The file structure represents an ``open file.'' It is created by the kernel on open and is passed to any function that operates on the file, until close. After the file is closed, the kernel releases the data structure. An ``open file'' is different from a ``disk file,'' which is represented by struct inode.
In the kernel sources, a pointer to struct file is usually called either file or filp (``file pointer''). I'll consistently call the pointer filp to prevent ambiguities with the structure itself--filp is a pointer (as such, it is one of the arguments to device methods), while file is the structure itself.
The most important fields of struct file are shown below. As in the previous section, the list can be skipped on a first reading. In the next section though, when we face some real C code, I'll discuss some of the fields, so they are here for you to refer back to.
mode_t f_mode;
The file mode is identified by the bits FMODE_READ and FMODE_WRITE. You might want to check this field for read/write permission in your ioctl function, but you don't need to check permissions for read and write because the kernel checks before invoking your driver. An attempt to write without permission, for example, is rejected without the driver even knowing about it.
loff_t f_pos;
The current reading or writing position.
Additional content appearing in this section has been removed.
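The f_mode check suggested for ioctl can be sketched as a user-space model. Everything here is hypothetical: the SK_ constants stand in for FMODE_READ and FMODE_WRITE, the two commands are invented for illustration, and the choice of -EPERM is an assumption rather than something the text prescribes.

```c
#include <errno.h>

/* Hypothetical stand-ins for FMODE_READ and FMODE_WRITE. */
#define SK_FMODE_READ  1
#define SK_FMODE_WRITE 2

/* Invented commands: one changes device state, one only queries it. */
#define SK_IOC_RESET   1
#define SK_IOC_QUERY   2

struct sk_file {
    unsigned int f_mode;  /* how the file was opened */
    long         f_pos;   /* current position */
};

/* Model of an ioctl method that checks the open mode itself. */
int sk_ioctl(struct sk_file *filp, unsigned int cmd)
{
    switch (cmd) {
    case SK_IOC_RESET:
        /* Changing state requires the file to be open for writing. */
        if (!(filp->f_mode & SK_FMODE_WRITE))
            return -EPERM;
        filp->f_pos = 0;
        return 0;
    case SK_IOC_QUERY:
        /* Querying state requires the file to be open for reading. */
        if (!(filp->f_mode & SK_FMODE_READ))
            return -EPERM;
        return 0;
    default:
        return -EINVAL;
    }
}
```

A file opened read-only is refused SK_IOC_RESET in this model and keeps its position, while a file opened for writing is reset.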
Open and Release
Now that we've taken a quick look at the fields, we'll start using them in real scull functions.
The open method is provided for a driver to do any initialization in preparation for later operations. In addition, open usually increments the usage count for the device, so the module won't be unloaded before the file is closed.
In most drivers, open does the following:
  • Checks for device-specific errors (like device-not-ready or similar hardware problems).
  • Initializes the device, if it is being opened for the first time.
  • Identifies the minor number and updates the f_op pointer, if necessary.
  • Allocates and fills any data structure to be put in filp->private_data.
  • Increments the usage count.
In scull, most of the preceding tasks depend on the minor number of the device being opened. Therefore, the first thing to do is identify which device is involved. We can do that by looking at inode->i_rdev.
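The steps listed above can be sketched as a user-space model. This is not the scull source: the sk_ structures are hypothetical, reduced to the fields the steps mention, and the minor number is assumed to sit in the low byte of i_rdev. A plain counter stands in for the module usage count.

```c
#include <errno.h>
#include <stddef.h>

#define SK_NR_DEVS 4  /* hypothetical: four devices, minors 0-3 */

struct sk_dev   { int initialized; };
struct sk_inode { unsigned short i_rdev; };  /* device number */
struct sk_file  { void *private_data; };

struct sk_dev sk_devices[SK_NR_DEVS];
int sk_usage_count;  /* stands in for the module usage count */

/* Model of an open method following the steps in the list. */
int sk_open(struct sk_inode *inode, struct sk_file *filp)
{
    /* Identify the device (assumed layout: minor in the low byte). */
    unsigned int minor = inode->i_rdev & 0xff;
    struct sk_dev *dev;

    /* Check for device-specific errors: here, an unknown minor. */
    if (minor >= SK_NR_DEVS)
        return -ENODEV;
    dev = &sk_devices[minor];

    /* Initialize the device if this is the first open. */
    if (!dev->initialized)
        dev->initialized = 1;

    /* Remember the device for later methods, then count the user. */
    filp->private_data = dev;
    sk_usage_count++;
    return 0;
}

/* Model of the matching release method. */
int sk_release(struct sk_inode *inode, struct sk_file *filp)
{
    (void)inode;
    filp->private_data = NULL;
    sk_usage_count--;
    return 0;
}
```

The count is incremented only after every check has passed, so a failed open leaves it untouched.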
We've already talked about how the kernel doesn't use the minor number of the device, so the driver is free to use it at will. In practice, different minor numbers are used to access different devices, or to open the same device in a different way. For example, /dev/ttyS0 and /dev/ttyS1 refer to different serial ports, whereas /dev/cua0 is the same physical device as /dev/ttyS0, but it acts differently. cuas are ``callout'' devices; they aren't ttys, and they don't get all the software support that is needed for terminals (i.e., they don't have a line discipline attached). All the serial devices feature different device numbers, so the driver can tell them apart: a ttyS is different from a cua.
A driver never actually knows the name of the device being opened, just the device number—and users can play on this indifference to names by aliasing new names to a single device for their own convenience. If you look in your
Additional content appearing in this section has been removed.
Scull's Memory Usage
Before introducing the read and write operations, we'd better look at how and why scull performs memory allocation. ``How'' is needed to thoroughly understand the code, and ``why'' demonstrates the kind of choices a driver writer needs to make, although scull is definitely not typical as a device.
This section deals only with the memory allocation policy in scull and doesn't show the hardware management skills you'll need to write real drivers. Those skills are introduced in Chapter 8 and Chapter 9. Therefore, you can skip this section if you're not interested in understanding the inner workings of the memory-oriented scull driver.
The region of memory used by scull, which is also called a ``device'' here, is variable in length. The more you write, the more it grows; trimming is performed by overwriting the device with a shorter file.
The implementation chosen for scull is not a smart one. The source code for a smart implementation would be more difficult to read, and the aim of this section is to show read and write, not memory management. That's why the code only uses kmalloc and kfree, without resorting to allocation of whole pages, although that would be more efficient.
On the flip side, I didn't want to limit the size of the ``device'' area, for both a philosophical reason and a practical one. Philosophically, it's always a bad idea to put arbitrary limits on data items being managed. Practically, scull can be used to temporarily eat up your system's memory in order to run tests under low-memory conditions. Running such tests might help you understand the system's internals. You can use the command cp /dev/zero /dev/scull0 to eat all the real RAM with scull, and you can use the dd utility to choose how much data is copied to the scull device.
In scull, each device is a linked list of pointers, each of which points to a Scull_Dev structure. Each such structure can refer to at most four million bytes, through an array of intermediate pointers. The released source uses an array of 1000 pointers to areas of 4000 bytes. I call each memory area a ``quantum'' and the array (or its length) a ``quantum set.'' A
Additional content appearing in this section has been removed.
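The arithmetic implied by this layout can be sketched as follows. The figures of 4000 bytes per quantum and 1000 quanta per list item come from the text; the function and structure names are hypothetical. Given a byte position in the device, the helper works out which list item holds it, which quantum within that item, and the offset inside the quantum.

```c
#define SK_QUANTUM 4000L  /* bytes per quantum, as in the released source */
#define SK_QSET    1000L  /* quanta per list item, as in the released source */

/* Where a byte position falls within the layout (hypothetical type). */
struct sk_pos {
    long item;   /* index of the list item */
    long s_pos;  /* index of the quantum within that item's array */
    long q_pos;  /* byte offset within that quantum */
};

/* Split a non-negative byte position into its three coordinates. */
struct sk_pos sk_locate(long f_pos)
{
    long itemsize = SK_QUANTUM * SK_QSET;  /* 4,000,000 bytes per item */
    long rest = f_pos % itemsize;          /* offset within the item */
    struct sk_pos p;

    p.item  = f_pos / itemsize;
    p.s_pos = rest / SK_QUANTUM;
    p.q_pos = rest % SK_QUANTUM;
    return p;
}
```

For instance, position 4001 lands in item 0, quantum 1, offset 1, and position 4,000,000 is the first byte of the second list item.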
Read and Write
Reading and writing a scull device means transferring data between the kernel address space and the user address space. The operation cannot be carried out through pointers in the usual way, or through memcpy, because pointers operate in the current address space, and the driver's code is executing in kernel space, while the data buffers are in user space.
If the target device is an expansion board instead of RAM, the same problem arises, because the driver must nonetheless copy data between user buffers and kernel space. In fact, the role of a device driver is mainly managing data transfers between devices (kernel space) and applications (user space).
Cross-space copy is performed in Linux by special functions, which are defined in <asm/segment.h>. The functions devoted to performing such a copy are optimized for different data sizes (char, short, int, long); most of them will be introduced in Section 5.1.4 in Chapter 5.
Driver code for read and write in scull needs to copy a whole segment of data to or from the user address space. This capability is offered by the following functions, which copy an arbitrary array of bytes:
void memcpy_fromfs(void *to, const void *from, unsigned long count);
void memcpy_tofs(void *to, const void *from, unsigned long count);
The names of the functions date back to the first Linux versions, when the only supported architecture was the i386 and there was a lot of assembler code peeking through the C. On Intel platforms, Linux addresses user space through the FS segment register, and the two functions have kept the old name through Linux 2.0. Things did change with Linux 2.1, but 2.0 is the main target of this book. See Section 17.3 in Chapter 17 for details.
Although the functions introduced above look like normal memcpy functions, a little extra care must be used when accessing user space from kernel code; the user pages being addressed might not be currently present in memory, and the page-fault handler can put the process to sleep while the page is being transferred into place. This happens, for example, when the page must be retrieved from swap space. The net result for the driver writer is that any function that accesses user space must be reentrant and must be able to execute concurrently with other driver functions. That's why the
Additional content appearing in this section has been removed.
Playing with the New Devices
Once you are equipped with the four methods just described, the driver can be compiled and tested; it retains any data you write to it until you overwrite it with new data. The device acts like a data buffer whose length is limited only by the amount of real RAM available. You can try using cp, dd, and input/output redirection to test out the driver.
The free command can be used to see how the amount of free memory shrinks and expands according to how much data is written into scull.
To get more confident with reading and writing one quantum at a time, you can add a printk at an appropriate point in the driver and watch what happens while an application reads or writes large chunks of data. Alternatively, use the strace utility to monitor the system calls issued by a program, together with their return values. Tracing a cp or an ls -l > /dev/scull0 will show quantized reads and writes. Monitoring (and debugging) techniques are presented in some detail in the next chapter.
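What a quantized read looks like from the caller's side can be modelled in user space. The sketch below is an assumption-laden illustration, not the scull code: the data sits in one flat buffer, the quantum is shrunk to 8 bytes so the effect is easy to see, and sk_copy_to_user is a plain memcpy standing in for the cross-space copy. Each call returns at most the bytes remaining in the current quantum, so a large request comes back in pieces.

```c
#include <string.h>

#define SK_QUANTUM 8  /* tiny on purpose; the released source uses 4000 */

/* Model device contents: 26 bytes in one flat buffer. */
static char sk_data[32] = "abcdefghijklmnopqrstuvwxyz";
long sk_size = 26;

/* User-space stand-in for the cross-space copy function. */
static void sk_copy_to_user(char *to, const char *from, long count)
{
    memcpy(to, from, (size_t)count);
}

/* Model read: transfers at most up to the end of the current quantum.
 * Returns the byte count, 0 at end of data, -1 on bad arguments. */
long sk_read(long *f_pos, char *buf, long count)
{
    long q_pos;

    if (*f_pos < 0 || count < 0)
        return -1;
    if (*f_pos >= sk_size)
        return 0;                         /* nothing left to read */

    q_pos = *f_pos % SK_QUANTUM;          /* offset inside the quantum */
    if (count > sk_size - *f_pos)
        count = sk_size - *f_pos;         /* don't read past the data */
    if (count > SK_QUANTUM - q_pos)
        count = SK_QUANTUM - q_pos;       /* stop at the quantum boundary */

    sk_copy_to_user(buf, sk_data + *f_pos, count);
    *f_pos += count;
    return count;
}
```

A caller that asks for 20 bytes starting at position 5 gets 3 bytes back, then 8 on the next call, and must loop until it has everything; this is the pattern strace makes visible.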
Additional content appearing in this section has been removed.
Quick Reference
This chapter introduced the following symbols and header files. The list of the fields in struct file_operations and struct file is not repeated here.
#include <linux/fs.h>
The ``File System'' header is the header required for writing device drivers. All the important functions are declared in her