Using a Debugger

The last resort in debugging modules is using a debugger to step through the code, watching the value of variables and machine registers. This approach is time-consuming and should be avoided whenever possible. Nonetheless, the fine-grained perspective on the code that is achieved through a debugger is sometimes invaluable. In our context, the code being debugged runs in the kernel address space, which makes things harder: it’s impossible to step through the kernel unless you remote-control it. I’ll describe remote control last because it’s rarely needed when writing modules. Fortunately, it is possible to look at variables in the current kernel and to modify them, even without remote control.

Proficient use of the debugger at this level requires some confidence with gdb commands, a minimal understanding of assembly code, and the ability to match source code and optimized assembly.

Unfortunately, gdb is more useful for dealing with the kernel proper than for debugging modules, and something more is needed to apply the same capabilities to modularized code. This something is the kdebug package, which uses the “remote debugging” interface of gdb to control the local kernel. I’ll introduce kdebug after talking about what you can do with the plain debugger.

Using gdb

gdb can be quite useful for looking at the system internals. The debugger must be invoked as though the kernel were an application. In addition to specifying the kernel’s filename, you should provide the name of a core file on the command line. A typical invocation of gdb looks like the following:

gdb /usr/src/linux/vmlinux /proc/kcore

The first argument is the name of the uncompressed kernel executable (produced when you compile the kernel in /usr/src/linux). The zImage file (sometimes called vmlinuz) exists only for the x86 architecture, and is a trick to work around the 640 KB limit of real-mode Intel processors; vmlinux, by contrast, is the uncompressed kernel on whichever platform you compile your kernel.

The second argument on the gdb command line is the name of the core file. Like any file in /proc, /proc/kcore is generated when it is read. When the read system call executes in the /proc filesystem, it maps to a data-generation function rather than a data-retrieval one; we’ve already exploited this feature in Section 4.2.1. kcore is used to represent the kernel “executable” in the format of a core file; it is a huge file, because it represents the whole kernel address space, which corresponds to all physical memory. From within gdb, you can look at kernel variables by issuing the standard gdb commands. For example, p jiffies prints the number of clock ticks from system boot to the current time.

When you print data from gdb, the kernel is still running, and the various data items have different values at different times; gdb, however, optimizes access to the core file by caching data that has already been read. If you try to look at the jiffies variable once again, you’ll get the same answer as before. Caching values to avoid extra disk access is a correct behavior for conventional core files, but is inconvenient when a “dynamic” core image is used. The solution is to issue the command core-file /proc/kcore whenever you want to flush the gdb cache; the debugger prepares to use a new core file and discards any old information. You won’t, however, always need to issue core-file when reading a new datum; gdb reads the core in chunks of one kilobyte and caches only chunks it has already referenced.
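The effect of this caching can be pictured with a toy model. The sketch below is written in Python and is not how gdb is implemented; ChunkCache, FakeKernel, and the address 0x1F40 are names and values invented for illustration, and only the one-kilobyte chunk size comes from the description above.

```python
# Toy model only: this is NOT gdb's implementation, just the caching idea.
CHUNK = 1024  # chunk size in bytes, taken from the text above


class ChunkCache:
    """Reads from a changing byte source, caching whole chunks once read."""

    def __init__(self, read_chunk):
        # read_chunk(index) returns the current contents of chunk `index`.
        self.read_chunk = read_chunk
        self.chunks = {}

    def read(self, addr, size):
        # For simplicity, assume the read does not cross a chunk boundary.
        index, offset = divmod(addr, CHUNK)
        if index not in self.chunks:
            self.chunks[index] = self.read_chunk(index)
        return self.chunks[index][offset:offset + size]

    def flush(self):
        # Plays the role of re-issuing "core-file /proc/kcore".
        self.chunks.clear()


class FakeKernel:
    """Hypothetical stand-in for live memory: one counter that keeps changing."""

    COUNTER_ADDR = 0x1F40  # arbitrary illustrative address

    def __init__(self):
        self.ticks = 0

    def read_chunk(self, index):
        data = bytearray(CHUNK)
        if index == self.COUNTER_ADDR // CHUNK:
            offset = self.COUNTER_ADDR % CHUNK
            data[offset:offset + 4] = self.ticks.to_bytes(4, "little")
        return bytes(data)


def read_counter(cache):
    """Read the 4-byte counter through the cache."""
    raw = cache.read(FakeKernel.COUNTER_ADDR, 4)
    return int.from_bytes(raw, "little")
```

In this model, a second call to read_counter returns the old value even after the counter has changed, because the chunk holding it is already cached; after flush, the next read fetches the chunk again. A read from a chunk that has never been referenced is always fresh, which is why core-file is not needed before every new datum.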

What you cannot do with plain gdb is modify kernel data; the debugger won’t try to modify the core file, because it wants to run the program being debugged before accessing its memory image. When debugging a kernel image, issuing the run command results in a segmentation fault after a few instructions have executed. For this reason, /proc/kcore doesn’t even implement a write method.

If you compile the kernel with debugging support (-g), the resulting vmlinux file turns out to be a better candidate for use with gdb than the same file compiled without -g. Note, however, that a huge amount of disk space is needed to compile the kernel with the -g option: a version 2.0 kernel image with networking and a minimum set of devices and filesystems occupies more than 11 MB on the PC. Anyway, you can still make the zImage file and use it for booting: the debugging information added by -g is stripped out when the bootable image is built. If I had enough disk space, I’d always compile with -g turned on.

On non-PC computers, the game is different. On the Alpha, make boot strips the kernel before creating the bootable image, so you end up with both the vmlinux and the vmlinux.gz files. The former is usable by gdb, and you can boot from the latter. On the Sparc, the kernel (at least the 2.0 kernel) is not stripped by default, so you need to strip it yourself before passing it to silo (the Sparc loader) for booting. Neither milo (the Alpha loader) nor silo can boot an unstripped kernel, due to its size.

When you compile the kernel with -g, and you run the debugger using vmlinux together with /proc/kcore, gdb can return a lot of information about the kernel internals. You can, for example, use commands like p *module_list, p *module_list->next, and p *chrdevs[4]->fops to dump structures. This sniffing operation is most interesting if you keep a kernel map and the source code handy.

Another useful task that gdb performs on the current kernel is disassembling functions, via the disassemble command (which can be abbreviated) or the “examine instructions” (x/i) command. The disassemble command can take as its argument either a function name or a memory range, while x/i takes a single memory address, also in the form of a symbol name. You can invoke, for example, x/20i to disassemble 20 instructions. Note that you can’t disassemble a module function, because the debugger is acting on vmlinux, which doesn’t know about your module. If you try to disassemble a module by address, gdb is most likely to reply “Cannot access memory at xxxx.” For the same reason, you can’t look at data items belonging to a module. They can be read from /dev/mem if you know the address of your variables, but it’s hard to make sense out of raw data extracted from system RAM.

If you want to disassemble a module function, you’re better off running the objdump utility on the module object file. Unfortunately, the tool runs on the disk copy of the file, not the running one; therefore, the addresses as shown by objdump will be the addresses before relocation, unrelated to the module’s execution environment.
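If you know the run-time address of at least one symbol in the module, the pre-relocation addresses printed by objdump can be translated by hand. The following is a minimal sketch in Python, assuming that the known symbol and the one you are looking for live in the same section of the object file, so that both are shifted by the same amount. The function names are made up for this example.

```python
def relocation_delta(runtime_addr, object_offset):
    """Distance between where a symbol runs and where objdump shows it.

    runtime_addr:  address of a known symbol in the running kernel
    object_offset: address of the same symbol as printed by objdump
    """
    return runtime_addr - object_offset


def to_runtime(object_offset, delta):
    """Translate an objdump address into a run-time address."""
    return object_offset + delta
```

As a purely illustrative case, suppose objdump shows one function at 0x280 and another at 0x28c, and the first is known to run at 0x20152d0. The delta is then 0x2015050, and the second function runs at 0x20152dc. The offsets 0x280 and 0x28c are invented; only the arithmetic matters here.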

As you see, gdb is a useful tool when your aim is to peek into the running kernel, but it lacks some features, the most important being the ability to modify kernel items and to access modules. This hole is filled by the kdebug package.

Using kdebug

kdebug can be retrieved from the usual FTP sites under pcmcia/extras, but if you want to be sure to retrieve the latest version, you should look at ftp://hyper.stanford.edu/pub/pcmcia/extras/. The tool is not actually related to pcmcia, but the two packages are written by the same author.

kdebug is a small tool that uses the “remote debugging” interface of gdb to talk to the running kernel. A module is loaded into the system, and the debugger is fired up using /dev/kdebug to access kernel data. gdb thinks the device is a serial port that communicates with the “application” being debugged, but it is really only a communication channel for accessing kernel space. Because the module itself is running in kernel space, you can look at kernel-space addresses that you can’t access with the plain debugger. As you may have guessed, the module is a char driver, and it uses dynamic assignment of the major number.

The benefit of kdebug is that it doesn’t force you to patch and recompile anything: neither the kernel nor the debugger. All you need to do is compile and install the package, and then invoke kgdb, a script that performs some set-up and calls gdb using the new interface to the kernel internals.

Even kdebug, however, doesn’t provide the ability to step through kernel code or to set breakpoints. This is almost unavoidable, because the kernel must run to keep the system alive, and the only way to step through kernel code is to control the system via a serial line from another computer, as described later. The implementation of kgdb nonetheless allows the user to modify data items in the application being debugged (i.e., the current kernel), to call functions by passing them arbitrary parameters, and to access, in a read-write fashion, the address ranges occupied by modules.

That last feature is achieved by adding the module’s symbol table to the debugger’s internal table using gdb commands. This task is performed by the kgdb script. gdb then knows what address to ask for whenever the user requests access to a particular symbol. The actual access is performed by kernel code in the module. Note, however, that the current version of kdebug (1.6) has some problems in mapping symbols to addresses for modularized code. You are better off making some checks with the version you are using by printing the address of a few symbols and comparing them to /proc/ksyms. If the addresses are mismatched, you can still use numeric values and cast them to the correct type. The following is an example of such a cast:

(gdb) p (struct file_operations)(*0x02015cf0)
$16 = {lseek = 0x20152d0 <kmouse_seek>, 
read = 0x20154fc <kmouse_read_data>,
write = 0x2015738 <kmouse_write_data>, 
readdir = 0, select = 0x201585c <kmouse_select>, 
ioctl = 0x20158ec <kmouse_ioctl>,
mmap = 0, open = 0x20152dc <kmouse_open>,
release = 0x2015448 <kmouse_release>, fsync = 0,
fasync = 0x2015a8c <kmouse_fasync>, check_media_change = 0, 
revalidate = 0}
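The check against /proc/ksyms suggested above is easy to automate once you have copied down the addresses gdb prints. The following Python sketch assumes each line of /proc/ksyms starts with a hexadecimal address followed by the symbol name, with an optional module name after it; parse_ksyms and mismatches are hypothetical helpers, not part of kdebug.

```python
def parse_ksyms(text):
    """Map symbol name -> address from /proc/ksyms-style text.

    Assumed line format: "<hex address> <symbol> [optional module field]".
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            table[fields[1]] = int(fields[0], 16)
        except ValueError:
            continue  # first field was not a hexadecimal address
    return table


def mismatches(gdb_addrs, ksyms):
    """Return {name: (gdb_addr, ksyms_addr)} for symbols that disagree.

    Symbols that do not appear in ksyms are ignored.
    """
    return {
        name: (addr, ksyms[name])
        for name, addr in gdb_addrs.items()
        if name in ksyms and ksyms[name] != addr
    }
```

You would feed parse_ksyms the contents of /proc/ksyms and pass mismatches a dictionary of the addresses reported by gdb for the same symbols. An empty result means the debugger’s symbol mapping agrees with the kernel for the symbols you checked; otherwise, fall back on numeric addresses and casts as shown above.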

Another advantage of kdebug over plain gdb is that it permits you to read the data structures as they change, without the need to flush the debugger’s cache; the gdb command set remotecache 0 can be used to disable data caching.

I won’t show any more examples of interaction with the tool, because it works like plain gdb. Such examples would be trivial for those who know how to use the debugger and obscure for those who don’t. Becoming skilled in using a debugger takes time and experience, and I won’t undertake the role of teacher.

All in all, kdebug is a really good program to have available. Being able to easily modify data structures on the fly is a real win for a developer (and a good way to hang your computer with a single typo). There are times when the tool makes your life easier; for example, during the development of scull, I used kdebug to reset the usage count for the module to 0, after it got screwed up.[15] This saved me from the annoyance of having to reboot, log in, and start all my applications running again.

Remote Debugging

The final option for debugging a kernel image is to use the remote-debugging capabilities of gdb.

When performing remote debugging, you need two computers: one runs gdb, and the other runs the kernel you want to debug. The two computers are linked by a conventional serial line. As you might expect, the controlling gdb must be able to understand the binary format of the kernel it controls. If the computers have different architectures, the debugger must be compiled to support its target platform.

As of 2.0, the Intel port of the Linux kernel doesn’t support remote debugging, but the Alpha and the Sparc versions do. On the Alpha, you must include support for remote debugging at compile time and enable it at boot time by passing the kernel the command-line argument kgdb=1, or just kgdb. On the Sparc, support for remote debugging is always included. The boot option kgdb=ttyx selects which serial line is used to control the kernel, where x is a or b (that is, kgdb=ttya or kgdb=ttyb). If no kgdb= option is used, the kernel boots in the normal way.

If remote debugging is enabled on the kernel, a special initialization function is called at boot time that sets up the controlled kernel to handle its own breakpoints and then jumps to a breakpoint purposely compiled into the program. This stops normal execution of the kernel and transfers control to the breakpoint-service routine. Such a handler waits to receive commands from gdb via the serial line and, when it gets one, executes it. With this setup, the programmer can single-step through the code, set breakpoints, and do all the other nifty things gdb usually allows.

On the controlling side, a copy of the target image is needed (let’s assume it’s called linux.img) as well as a copy of any module you want to debug. The following commands must be passed to gdb:

file linux.img

The file command tells gdb which binary file is being debugged. Alternatively, the image filename can be passed on the command line. The file itself must be identical to the kernel running on the other side of the link.

target remote /dev/ttyS1

This command instructs gdb to use the remote computer as the target of the debugging session. /dev/ttyS1 is the local serial port used to communicate, and you can specify any device. The kgdb script, part of the kdebug package introduced above, uses target remote /dev/kdebug, for example.

add-symbol-file module.o address

If you want to debug a module that has been loaded on the controlled kernel, you need a copy of the module object on the controlling system. add-symbol-file prepares gdb to deal with the module, assuming its code has been relocated to the address given as the address argument.
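If you repeat the same session often, the three commands can be collected into a command file and handed to gdb with its -x option. The sketch below, in Python, only builds the command lines; remote_session_commands is a hypothetical helper, and the load address used in the usage note is an invented value that you would replace with the real address of your module.

```python
def remote_session_commands(image, serial_device, modules=()):
    """Return the gdb commands for a remote session, one per list entry.

    image:         kernel image identical to the one on the target
    serial_device: local serial port linked to the target machine
    modules:       iterable of (object_file, load_address) pairs
    """
    commands = [
        "file %s" % image,
        "target remote %s" % serial_device,
    ]
    for object_file, address in modules:
        # The address is where the module's code was relocated on the target.
        commands.append("add-symbol-file %s 0x%x" % (object_file, address))
    return commands
```

Calling remote_session_commands("linux.img", "/dev/ttyS1", [("module.o", 0x2015000)]) returns the three commands described above, in order. Writing them to a file, one per line, gives you something gdb can execute at startup.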

Even though remote debugging can be used with modules, it’s quite tricky to do so, since you have to load the module and hit another breakpoint before you can insert a new breakpoint in the module itself. I personally wouldn’t use remote debugging to trace a module unless there are major problems with parts of the code that run asynchronously, like interrupt handlers.



[15] The usage count is the very first word of a module’s address space, though this fact is undocumented and could change in the future.
