Version Control in Modules

One of the main problems with modules is their version dependency, which was introduced in Chapter 2. The need to recompile the module against the headers of each kernel version being used can become a real pain when you run several custom modules, and recompiling is not even possible if you run a commercial module distributed in binary form.

Fortunately, the kernel developers found a flexible way to deal with version problems. The idea is that a module is incompatible with a different kernel version only if the software interface offered by the kernel has changed. The software interface, then, can be represented by a function prototype and the exact definition of all the data structures involved in the function call. Finally, a CRC algorithm[45] can be used to map all the information about the software interface to a single 32-bit number.
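To make the idea concrete, here is a minimal user-space sketch of reducing an interface description to a single 32-bit number. It uses the common reflected CRC-32 (polynomial 0xEDB88320). The text does not say which CRC variant the kernel tools use, so the algorithm, the names crc32_bytes and interface_checksum, and the idea of checksumming a plain prototype string are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
 * Assumption: the real kernel tools may use a different CRC variant.
 */
uint32_t crc32_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    assert(data != NULL || len == 0);

    for (i = 0; i < len; i++) {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Checksum a textual description of an interface (hypothetical format). */
uint32_t interface_checksum(const char *description)
{
    return crc32_bytes(description, strlen(description));
}
```

With this sketch, two descriptions of the same length that differ only in an argument type, such as "int export_function(int a, int b)" and "int export_function(int a, u64 b)", yield different checksums, which is the property the versioning scheme relies on.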

The issue of version dependencies is thus handled by mangling the name of each symbol exported by the kernel to include the checksum of all the information related to that symbol. The checksum is computed from information extracted by parsing the kernel header files. This facility is optional and can be enabled at compilation time. Modular kernels shipped by Linux distributors usually have versioning support enabled.

For example, the symbol printk is exported to modules as something like printk_R12345678 when version support is enabled, where 12345678 is the hexadecimal representation of the checksum of the software interface used by the function. When a module is loaded into the kernel, insmod (or modprobe) can accomplish its task only if the checksum added to each symbol in the kernel matches the one added to the same symbol in the module.

There are some limitations to this scheme. A common source of surprises has been loading a module compiled for SMP systems into a uniprocessor kernel, or vice versa. Because numerous inline functions (e.g., spinlock operations) and symbols are defined differently for SMP kernels, it is important that modules and the kernel agree on whether they are built for SMP. Version 2.4 and recent 2.2 kernels throw an extra smp_ string onto each symbol when compiling for SMP to catch this particular case. There are still potential traps, however. Modules and the kernel can differ in which version of the compiler was used to build them, which view of memory they take, which version of the processor they were built for, and more. The version support scheme can catch the most common problems, but it still pays to be careful.
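The naming convention just described can be sketched in a few lines of user-space C. The helper names mangle_symbol and symbols_match are ours, not part of the kernel or modutils, and the comparison is only a model of the check the loader performs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<name>_R<prefix><8 hex digits>" into buf.
 * prefix is "" for a uniprocessor build or "smp_" for an SMP build.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int mangle_symbol(char *buf, size_t size, const char *name,
                  const char *prefix, unsigned long crc)
{
    int n;

    assert(buf != NULL && name != NULL && prefix != NULL);

    n = snprintf(buf, size, "%s_R%s%08lx", name, prefix,
                 crc & 0xffffffffUL);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* A reference is satisfied only if the full mangled names are identical. */
int symbols_match(const char *wanted, const char *exported)
{
    return strcmp(wanted, exported) == 0;
}
```

Calling mangle_symbol with "printk", an empty prefix, and 0x12345678 produces printk_R12345678; passing "smp_" and 0x1b7d4074 produces printk_Rsmp_1b7d4074. The two names do not match, which is how an SMP/uniprocessor mismatch gets caught.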

But let’s see what happens in both the kernel and the module when version support is enabled:

  • In the kernel itself, the symbol is not modified. The linking process happens in the usual way, and the symbol table of the vmlinux file looks the same as before.

  • The public symbol table is built using the versioned names, and this is what appears in /proc/ksyms.

  • The module must be compiled using the mangled names, which appear in the object files as undefined symbols.

  • The loading program (insmod) matches the undefined symbols in the module with the public symbols in the kernel, thus using the version information.

Note that the kernel and the module must both agree on whether versioning is in use. If one is built for versioned symbols and the other isn’t, insmod will refuse to load the module.
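The steps above amount to resolving names by exact string comparison. The following user-space sketch models that lookup; it is a simplified illustration under our own assumptions, not code from insmod, and struct ksym, lookup_symbol, and count_unresolved are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the public symbol table: an address and a (mangled) name. */
struct ksym {
    unsigned long addr;
    const char *name;
};

/* Return the address of the symbol with exactly this name, or 0 if absent. */
unsigned long lookup_symbol(const struct ksym *table, size_t n,
                            const char *name)
{
    size_t i;

    assert(table != NULL && name != NULL);

    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    return 0;
}

/* Count the undefined names of a module that the table cannot satisfy. */
size_t count_unresolved(const struct ksym *table, size_t n,
                        const char *const *undefined, size_t n_undef)
{
    size_t i, missing = 0;

    for (i = 0; i < n_undef; i++)
        if (lookup_symbol(table, n, undefined[i]) == 0)
            missing++;
    return missing;
}
```

In this model, a module that asks for plain printk while the table holds only printk_Rsmp_1b7d4074 ends up with one unresolved symbol, which corresponds to the case where the kernel is versioned and the module is not.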

Using Version Support in Modules

Driver writers must add some explicit support if their modules are to work with versioning. Version control can be inserted in one of two places: in the makefile or in the source itself. Since the documentation of the modutils package describes how to do it in the makefile, we’ll show you how to do it in the C source. The master module used to demonstrate how kmod works is able to support versioned symbols. The capability is automatically enabled if the kernel used to compile the module exploits version support.

The main facility used to mangle symbol names is the header <linux/modversions.h>, which includes preprocessor definitions for all the public kernel symbols. This file is generated as part of the kernel compilation (actually, “make depend”) process; if your kernel has never been built, or is built without version support, there will be little of interest inside. <linux/modversions.h> must be included before any other header file, so place it first if you put it directly in your driver source. The usual technique, however, is to tell gcc to prepend the file with a compilation command like:

gcc -DMODVERSIONS -include /usr/src/linux/include/linux/modversions.h...

After the header is included, whenever the module uses a kernel symbol, the compiler sees the mangled version.
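The renaming is ordinary preprocessor work, which a small user-space sketch can demonstrate. The macro below is a stand-in for one line of a generated version header; the function greet, its made-up checksum 1234abcd, and the helper name_seen_by_compiler are assumptions for illustration and do not reproduce the actual contents of <linux/modversions.h>.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x)  STR_(x)   /* expand x first, then turn it into a string */

/* Stand-in for one line of a generated version header (made-up checksum). */
#define greet greet_R1234abcd

/* Because of the macro, this actually defines greet_R1234abcd(). */
const char *greet(void)
{
    return "hello";
}

/* Report the identifier the compiler sees wherever the source says "greet". */
const char *name_seen_by_compiler(void)
{
    const char *name = STR(greet);

    assert(strcmp(name, "greet") != 0);   /* the renaming took effect */
    return name;
}
```

Every later use of greet in the same file is rewritten the same way, so both the definition and the callers end up referring to greet_R1234abcd without any change to the source text.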

To enable versioning in the module if it has been enabled in the kernel, we must make sure that CONFIG_MODVERSIONS has been defined in <linux/config.h>. That header controls what features are enabled (compiled) in the current kernel. Each CONFIG_ macro defined states that the corresponding option is active.[46]

The initial part of master.c, therefore, consists of the following lines:

#include <linux/config.h> /* retrieve the CONFIG_* macros */
#if defined(CONFIG_MODVERSIONS) && !defined(MODVERSIONS)
#  define MODVERSIONS /* force it on */
#endif

#ifdef MODVERSIONS
#  include <linux/modversions.h>
#endif

When compiling the file against a versioned kernel, the symbol table in the object file refers to versioned symbols, which match the ones exported by the kernel itself. The following screendump shows the symbol names stored in master.o, as reported by nm, followed by the matching entry for printk in /proc/ksyms. In the output of nm, T means “text,” D means “data,” and U means “undefined.” The “undefined” tag denotes symbols that the object file references but doesn’t define.

00000034 T cleanup_module
00000000 t gcc2_compiled.
00000000 T init_module
00000034 T master_cleanup_module
00000000 T master_init_module
         U printk_Rsmp_1b7d4074
         U request_module_Rsmp_27e4dc04
morgana% fgrep 'printk' /proc/ksyms
c011b8b0 printk_Rsmp_1b7d4074

Because the checksums added to the symbol names in master.o are derived from the entire prototypes of printk and request_module, the module is compatible with a wide range of kernel versions. If, however, the data structures related to either function get modified, insmod will refuse to load the module because of its incompatibility with the kernel.

Exporting Versioned Symbols

The one thing not covered by the previous discussion is what happens when a module exports symbols to be used by other modules. If we rely on version information to achieve module portability, we’d like to be able to add a CRC code to our own symbols. This subject is slightly trickier than just linking to the kernel, because we need to export the mangled symbol name to other modules; we need a way to build the checksums.

The task of parsing the header files and building the checksums is performed by genksyms, a tool released with the modutils package. This program receives the output of the C preprocessor on its own standard input and prints a new header file on standard output. The output file defines the checksummed version of each symbol exported by the original source file. The output of genksyms is usually saved with a .ver suffix; it is a good idea to stay consistent with this practice.

To show you how symbols are exported, we have created two dummy modules called export.c and import.c. export.c exports a simple function called export_function, which is used by the second module, import.c. This function receives two integer arguments and returns their sum; we are not interested in the function itself, but rather in the linking process.
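For orientation, here is a user-space stand-in for what the two modules do. It is a sketch, not the module source: snprintf replaces the kernel's printing, and format_import_message is a hypothetical helper that takes the place of the import module's initialization code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The exported function: receives two integers and returns their sum. */
int export_function(int a, int b)
{
    return a + b;
}

/*
 * What the importing side does: call export_function(2, 2) and report
 * the result. Returns the value returned by snprintf.
 */
int format_import_message(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    return snprintf(buf, size, "import: my mate tells that 2+2 = %i",
                    export_function(2, 2));
}
```

In the real modules the call crosses a module boundary, so the name export_function must resolve to the same versioned symbol on both sides.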

The makefile in the misc-modules directory has a rule to build an export.ver file from export.c, so that the checksummed symbol for export_function can be used by the import module:

ifdef CONFIG_MODVERSIONS
export.o import.o: export.ver
endif

export.ver: export.c
	$(CC) -I$(INCLUDEDIR) $(CFLAGS) -E -D__GENKSYMS__ $^ | \
		$(GENKSYMS) -k 2.4.0 > $@

These lines demonstrate how to build export.ver and add it to the dependencies of both object files, but only if CONFIG_MODVERSIONS is defined. A few lines added to Makefile take care of defining that variable if version support is enabled in the kernel, but they are not worth showing here. The -k option must be used to tell genksyms which version of the kernel you are working with. Its purpose is to determine the format of the output file; it need not match the kernel you are using exactly.

One thing that is worth showing, however, is how the GENKSYMS variable is adjusted for SMP kernels. As mentioned above, a prefix (smp_) is added to every checksum if the kernel is built for SMP systems. The genksyms utility does not add this prefix itself; it must be told explicitly to do so with the -p option. The following makefile code will cause the prefix to be set appropriately:

ifdef CONFIG_SMP
  GENKSYMS += -p smp_
endif

The source file, then, must declare the right preprocessor symbols for every conceivable preprocessor pass: the input to genksyms and the actual compilation, both with version support enabled and with it disabled. Moreover, export.c should be able to autodetect version support in the kernel, as master.c does. The following lines show you how to do this successfully:

#include <linux/config.h> /* retrieve the CONFIG_* macros */
#if defined(CONFIG_MODVERSIONS) && !defined(MODVERSIONS)
#   define MODVERSIONS
#endif

/*
 * Include the versioned definitions for both kernel symbols and our
 * symbol, *unless* we are generating checksums (__GENKSYMS__
 * defined) */
#if defined(MODVERSIONS) && !defined(__GENKSYMS__)
#    include <linux/modversions.h>
#    include "export.ver" /* redefine "export_function" to include CRC */
#endif

The code, though hairy, has the advantage of leaving the makefile in a clean state. Passing the correct flags from make, on the other hand, involves writing long command lines for the various cases, which we won’t do here.
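The cases that the conditionals in export.c must cover can be written out as a small truth table. The following user-space sketch mirrors the two #if tests as an ordinary function; the function name and its integer flags are ours, chosen only for illustration.

```c
#include <assert.h>

/*
 * Model of the preprocessor logic in export.c.
 * Each argument is 1 if the corresponding macro is defined, 0 otherwise.
 * Returns 1 when <linux/modversions.h> and "export.ver" would be included.
 */
int includes_version_headers(int config_modversions, int modversions,
                             int genksyms_pass)
{
    assert((config_modversions | modversions | genksyms_pass) >> 1 == 0);

    /* First #if: CONFIG_MODVERSIONS forces MODVERSIONS on. */
    if (config_modversions && !modversions)
        modversions = 1;

    /* Second #if: skip the versioned headers while feeding genksyms. */
    return modversions && !genksyms_pass;
}
```

A normal compilation against a versioned kernel corresponds to (1, 0, 0) and includes the headers; the genksyms pass, (1, 0, 1), does not, because export.ver is what that pass is generating; and a kernel without version support, (0, 0, 0), leaves the names untouched.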

The simple import module calls export_function by passing the numbers 2 and 2 as arguments; the expected result is therefore 4. The following example shows that import actually links to the versioned symbol of export and calls the function. The versioned symbol appears in /proc/ksyms.

morgana.root# insmod ./export.o
morgana.root# grep export /proc/ksyms
c883605c export_function_Rsmp_888cb211  [export]
morgana.root# insmod ./import.o
import: my mate tells that 2+2 = 4
morgana.root# cat /proc/modules
import                   312   0  (unused)
export                   620   0  [import]


[45] CRC means “cyclic redundancy check,” a way of generating a short number from an arbitrary amount of data such that different inputs are very unlikely to produce the same result.

[46] The CONFIG_ macros are defined in <linux/autoconf.h>. You should, however, include <linux/config.h> instead, because the latter is protected from double inclusion and includes <linux/autoconf.h> internally.
