Kernel Timers

The ultimate resources for time keeping in the kernel are the timers. Timers are used to schedule execution of a function (a timer handler) at a particular time in the future. They thus work differently from task queues and tasklets in that you can specify when in the future your function will be called, whereas you can’t tell exactly when a queued task will be executed. On the other hand, kernel timers are similar to task queues in that a function registered in a kernel timer is executed only once—timers aren’t cyclic.

There are times when you need to execute operations detached from any process’s context, like turning off the floppy motor or finishing another lengthy shutdown operation. In that case, delaying the return from close wouldn’t be fair to the application program. Using a task queue would be wasteful, because a queued task must continually reregister itself until the requisite time has passed.

A timer is much easier to use. You register your function once, and the kernel calls it once when the timer expires. This functionality is used often within the kernel proper, but it is sometimes needed by drivers as well, as in the example of the floppy motor.

The kernel timers are organized in a doubly linked list. This means that you can create as many timers as you want. A timer is characterized by its timeout value (in jiffies) and the function to be called when the timer expires. The timer handler receives an argument, which is stored in the data structure, together with a pointer to the handler itself.

The data structure of a timer looks like the following, which is extracted from <linux/timer.h>:

 struct timer_list {
     struct timer_list *next;          /* never touch this */
     struct timer_list *prev;          /* never touch this */
     unsigned long expires;            /* the timeout, in jiffies */
     unsigned long data;               /* argument to the handler */
     void (*function)(unsigned long);  /* handler of the timeout */
     volatile int running;             /* added in 2.4; don't touch */
 };

The timeout of a timer is a value in jiffies. Thus, timer->function will run when jiffies is equal to or greater than timer->expires. The timeout is an absolute value; it is usually generated by taking the current value of jiffies and adding the amount of the desired delay.

Once a timer_list structure is initialized, add_timer inserts it into a sorted list, which is then polled more or less 100 times per second. Even systems (such as the Alpha) that run with a higher clock interrupt frequency do not check the timer list more often than that; the added timer resolution would not justify the cost of the extra passes through the list.
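The idea of a sorted list that is polled on each tick can be sketched in userspace as follows. This is only a toy model of the behavior just described, not the kernel's implementation; every toy_ name (including the recording handler) is hypothetical and exists only for this example.

```c
#include <stddef.h>

struct toy_timer {
    struct toy_timer *next, *prev;     /* list links, managed by the helpers */
    unsigned long expires;             /* absolute timeout, in ticks */
    unsigned long data;                /* argument to the handler */
    void (*function)(unsigned long);   /* handler of the timeout */
};

/* Circular list head: an empty list points back at itself. */
static struct toy_timer toy_head = { &toy_head, &toy_head, 0, 0, NULL };

/* Insert `t` so the list stays sorted by increasing expiry time. */
static void toy_add_timer(struct toy_timer *t)
{
    struct toy_timer *pos = toy_head.next;

    while (pos != &toy_head && (long)(t->expires - pos->expires) >= 0)
        pos = pos->next;               /* skip timers that expire no later */
    t->next = pos;
    t->prev = pos->prev;
    pos->prev->next = t;
    pos->prev = t;
}

/* One polling pass: unlink and run every timer whose time has come. */
static int toy_run_timers(unsigned long now)
{
    int fired = 0;

    while (toy_head.next != &toy_head &&
           (long)(now - toy_head.next->expires) >= 0) {
        struct toy_timer *t = toy_head.next;

        t->prev->next = t->next;       /* remove before calling the handler */
        t->next->prev = t->prev;
        t->next = t->prev = NULL;
        t->function(t->data);
        fired++;
    }
    return fired;
}

/* Sample handler for experiments: records the argument it was given. */
static unsigned long toy_log[8];
static int toy_log_len;

static void toy_record(unsigned long data)
{
    if (toy_log_len < 8)
        toy_log[toy_log_len++] = data;
}
```

Because the list is kept sorted, a polling pass can stop at the first timer that has not yet expired, so the cost of each pass depends on how many timers are due rather than on how many are registered.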

These are the functions used to act on timers:

void init_timer(struct timer_list *timer);

This inline function is used to initialize the timer structure. Currently, it zeros the prev and next pointers (and the running flag on SMP systems). Programmers are strongly urged to use this function to initialize a timer and to never explicitly touch the pointers in the structure, in order to be forward compatible.

void add_timer(struct timer_list *timer);

This function inserts a timer into the global list of active timers.

int mod_timer(struct timer_list *timer, unsigned long expires);

Should you need to change the time at which a timer expires, mod_timer can be used. After the call, the new expires value will be used.

int del_timer(struct timer_list *timer);

If a timer needs to be removed from the list before it expires, del_timer should be called. When a timer expires, on the other hand, it is automatically removed from the list.

int del_timer_sync(struct timer_list *timer);

This function works like del_timer, but it also guarantees that, when it returns, the timer function is not running on any CPU. del_timer_sync is used to avoid race conditions when a timer function is running at unexpected times; it should be used in most situations. The caller of del_timer_sync must ensure that the timer function will not use add_timer to add itself again.
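The following userspace sketch models how the calls above relate to one another. The mini_ names are hypothetical stand-ins, list ordering is deliberately not modeled, and the return values are an assumption: the text above does not spell them out, and the sketch takes the int results of del_timer and mod_timer to mean "the timer was still pending", which you should check against your kernel sources.

```c
#include <stddef.h>

struct mini_timer {
    struct mini_timer *next, *prev;   /* both NULL while the timer is inactive */
    unsigned long expires;            /* absolute timeout, in ticks */
};

/* Circular list head; an empty list points back at itself. */
static struct mini_timer mini_active = { &mini_active, &mini_active, 0 };

static void mini_init_timer(struct mini_timer *t)
{
    t->next = t->prev = NULL;         /* mark as "not on any list" */
}

static int mini_timer_pending(const struct mini_timer *t)
{
    return t->next != NULL;
}

static void mini_add_timer(struct mini_timer *t)
{
    /* Appended at the tail for brevity; sorting is not modeled here. */
    t->next = &mini_active;
    t->prev = mini_active.prev;
    mini_active.prev->next = t;
    mini_active.prev = t;
}

/* Assumed semantics: 1 if the timer was pending and got removed, else 0. */
static int mini_del_timer(struct mini_timer *t)
{
    if (!mini_timer_pending(t))
        return 0;
    t->prev->next = t->next;
    t->next->prev = t->prev;
    mini_init_timer(t);
    return 1;
}

/* Change the expiry in one step; reports whether the timer was pending. */
static int mini_mod_timer(struct mini_timer *t, unsigned long expires)
{
    int was_pending = mini_del_timer(t);

    t->expires = expires;
    mini_add_timer(t);
    return was_pending;
}
```

In this model, deleting an inactive timer is harmless and simply returns 0, and mini_mod_timer leaves the timer active whether or not it was pending beforehand.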

An example of timer usage can be seen in the jiq module. The file /proc/jitimer uses a timer to generate two data lines; it uses the same printing function as the task queue examples do. The first data line is generated from the read call (invoked by the user process looking at /proc/jitimer), while the second line is printed by the timer function after one second has elapsed.

The code for /proc/jitimer is as follows:

struct timer_list jiq_timer;

void jiq_timedout(unsigned long ptr)
{
    jiq_print((void *)ptr);            /* print a line */
    wake_up_interruptible(&jiq_wait);  /* awaken the process */
}


int jiq_read_run_timer(char *buf, char **start, off_t offset,
                   int len, int *eof, void *data)
{

    jiq_data.len = 0;      /* prepare the argument for jiq_print() */
    jiq_data.buf = buf;
    jiq_data.jiffies = jiffies;
    jiq_data.queue = NULL;      /* don't requeue */

    init_timer(&jiq_timer);              /* init the timer structure */
    jiq_timer.function = jiq_timedout;
    jiq_timer.data = (unsigned long)&jiq_data;
    jiq_timer.expires = jiffies + HZ; /* one second */

    jiq_print(&jiq_data);   /* print and go to sleep */
    add_timer(&jiq_timer);
    interruptible_sleep_on(&jiq_wait);
    del_timer_sync(&jiq_timer);  /* in case a signal woke us up */
    
    *eof = 1;
    return jiq_data.len;
}

Running head /proc/jitimer gives the following output:

    time  delta interrupt  pid cpu command
 45584582   0        0    8920   0 head
 45584682 100        1       0   1 swapper

From the output you can see that the timer function, which printed the last line here, was running in interrupt mode.

What can appear strange when using timers is that the timer expires at just the right time, even if the processor is executing in a system call. We suggested earlier that when a process is running in kernel space, it won’t be scheduled away; the clock tick, however, is special, and it does all of its tasks independent of the current process. You can try to look at what happens when you read /proc/jitbusy in the background and /proc/jitimer in the foreground. Although the system appears to be locked solid by the busy-waiting system call, both the timer queue and the kernel timers continue running.

Thus, timers can be another source of race conditions, even on uniprocessor systems. Any data structures accessed by the timer function should be protected from concurrent access, either by being atomic types (discussed in Chapter 10) or by using spinlocks.

One must also be very careful to avoid race conditions with timer deletion. Consider a situation in which a module’s timer function is run on one processor while a related event (a file is closed or the module is removed) happens on another. The result could be the timer function expecting a situation that is no longer valid, resulting in a system crash. To avoid this kind of race, your module should use del_timer_sync instead of del_timer. If the timer function can restart the timer itself (a common pattern), you should also have a “stop timer” flag that you set before calling del_timer_sync. The timer function should then check that flag and not reschedule itself with add_timer if the flag has been set.
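The "stop timer" flag can be sketched as follows. This is a single-threaded userspace model, so it shows only the ordering of the steps and not real concurrency; toy_device, its stopping field, toy_fire, and the other toy_ helpers are hypothetical, and toy_del_timer_sync merely stands in for del_timer_sync.

```c
#define TOY_HZ 100UL                   /* assumed tick rate */

struct toy_timer {
    unsigned long expires;             /* absolute timeout, in ticks */
    unsigned long data;                /* argument to the handler */
    void (*function)(unsigned long);   /* handler of the timeout */
    int pending;                       /* 1 while queued, 0 otherwise */
};

struct toy_device {
    struct toy_timer timer;
    volatile int stopping;             /* the "stop timer" flag */
    unsigned long now;                 /* stand-in for jiffies */
    int polls;                         /* how many times the handler ran */
};

static void toy_add_timer(struct toy_timer *t) { t->pending = 1; }

/* Stand-in for del_timer_sync(): here it only cancels the timer. */
static int toy_del_timer_sync(struct toy_timer *t)
{
    int was_pending = t->pending;

    t->pending = 0;
    return was_pending;
}

/* Self-rearming handler: reschedules itself unless shutdown has begun. */
static void toy_poll(unsigned long ptr)
{
    /* pointer passed as unsigned long, mirroring the kernel idiom */
    struct toy_device *dev = (struct toy_device *)ptr;

    dev->polls++;
    if (dev->stopping)
        return;                        /* flag set: do not re-add the timer */
    dev->timer.expires = dev->now + TOY_HZ;
    toy_add_timer(&dev->timer);
}

static void toy_start(struct toy_device *dev, unsigned long now)
{
    dev->stopping = 0;
    dev->polls = 0;
    dev->now = now;
    dev->timer.function = toy_poll;
    dev->timer.data = (unsigned long)dev;
    dev->timer.expires = now + TOY_HZ;
    toy_add_timer(&dev->timer);
}

/* Simulate expiry: the timer leaves the list, then the handler runs. */
static void toy_fire(struct toy_device *dev)
{
    dev->timer.pending = 0;
    dev->timer.function(dev->timer.data);
}

static void toy_shutdown(struct toy_device *dev)
{
    dev->stopping = 1;                 /* set the flag first... */
    toy_del_timer_sync(&dev->timer);   /* ...then remove the timer */
}
```

The order inside toy_shutdown is the point of the sketch: because the flag is set before the timer is deleted, a handler that is already running when shutdown begins sees the flag and does not put the timer back on the list.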

Another pattern that can cause race conditions is modifying timers by deleting them with del_timer, then creating a new one with add_timer. It is better, in this situation, to simply use mod_timer to make the necessary change.
