The most important function in a block driver is the request function, which performs the low-level operations related to reading and writing data. This section discusses the basic design of the request procedure.
When the kernel schedules a data transfer, it queues the request in a list, ordered in such a way that it maximizes system performance. The queue of requests is then passed to the driver’s request function, which has the following prototype:
void request_fn(request_queue_t *queue);
The request function should perform the following tasks for each request in the queue:
- Check the validity of the request. This test is performed by the macro INIT_REQUEST, defined in blk.h; the test consists of looking for problems that could indicate a bug in the system’s request queue handling.
- Perform the actual data transfer. The CURRENT variable (a macro, actually) can be used to retrieve the details of the current request. CURRENT is a pointer to struct request, whose fields are described in the next section.
- Clean up the request just processed. This operation is performed by end_request, a static function whose code resides in blk.h. end_request handles the management of the request queue and wakes up processes waiting on the I/O operation. It also manages the CURRENT variable, ensuring that it points to the next unsatisfied request. The driver passes the function a single argument, which is 1 in case of success and 0 in case of failure. When end_request is called with an argument of 0, an “I/O error” message is delivered to the system logs (via printk).
- Loop back to the beginning, to consume the next request.
Based on the previous description, a minimal request function, which does not actually transfer any data, would look like this:
    void sbull_request(request_queue_t *q)
    {
        while(1) {
            INIT_REQUEST;
            printk("<1>request %p: cmd %i sec %li (nr. %li)\n", CURRENT,
                   CURRENT->cmd,
                   CURRENT->sector,
                   CURRENT->current_nr_sectors);
            end_request(1); /* success */
        }
    }
Although this code does nothing but print messages, running this function provides good insight into the basic design of data transfer. It also demonstrates a couple of features of the macros defined in <linux/blk.h>. The first is that, although the while loop looks like it will never terminate, the fact is that the INIT_REQUEST macro performs a return when the request queue is empty. The loop thus iterates over the queue of outstanding requests and then returns from the request function. Second, the CURRENT macro always describes the request to be processed. We get into the details of CURRENT in the next section.
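The control flow of that loop can be tried outside the kernel. The following is a minimal user-space sketch, not the code from <linux/blk.h>: every name prefixed with mock_ or MOCK_ is hypothetical, and the stand-in macro models only the empty-queue check, not the other sanity tests the real INIT_REQUEST performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a queued request. */
struct mock_request {
    int cmd;
    unsigned long sector;
    struct mock_request *next;
};

static struct mock_request *mock_queue_head; /* plays the role of the request queue */
static int mock_completed;                   /* requests retired so far */

/* Like CURRENT: always names the request at the head of the queue. */
#define MOCK_CURRENT (mock_queue_head)

/* Like INIT_REQUEST: returns from the *enclosing* function when the
 * queue is empty. Only this one behavior is modeled here. */
#define MOCK_INIT_REQUEST do { if (MOCK_CURRENT == NULL) return; } while (0)

/* Like end_request: retires the head request so that MOCK_CURRENT
 * points to the next unsatisfied one. */
void mock_end_request(int uptodate)
{
    (void)uptodate;                 /* success/failure reporting is omitted */
    assert(mock_queue_head != NULL);
    mock_queue_head = mock_queue_head->next;
    mock_completed++;
}

void mock_request_fn(void)
{
    while (1) {
        MOCK_INIT_REQUEST;          /* the only exit from the loop */
        mock_end_request(1);        /* "success" */
    }
}
```

Calling mock_request_fn with a few requests linked from mock_queue_head drains the list and then returns, even though the function body contains no explicit return statement of its own.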
A block driver using the request function just shown will actually work—for a short while. It is possible to make a filesystem on the device and access it for as long as the data remains in the system’s buffer cache.
This empty (but verbose) function can still be run in sbull by defining the symbol SBULL_EMPTY_REQUEST at compile time. If you want to understand how the kernel handles different block sizes, you can experiment with blksize= on the insmod command line. The empty request function shows the internal workings of the kernel by printing the details of each request.
The request function has one very important constraint: it must be atomic. request is not usually called in direct response to user requests, and it is not running in the context of any particular process. It can be called at interrupt time, from tasklets, or from any number of other places. Thus, it must not sleep while carrying out its tasks.
To understand how to build a working request function for sbull, let’s look at how the kernel describes a request within a struct request. The structure is defined in <linux/blkdev.h>. By accessing the fields in the request structure, usually by way of CURRENT, the driver can retrieve all the information needed to transfer data between the buffer cache and the physical block device.[48] CURRENT is just a pointer into blk_dev[MAJOR_NR].request_queue. The following fields of a request hold information that is useful to the request function:
-
kdev_t rq_dev;
The device accessed by the request. By default, the same request function is used for every device managed by the driver. A single request function deals with all the minor numbers; rq_dev can be used to extract the minor device being acted upon. The CURRENT_DEV macro is simply defined as DEVICE_NR(CURRENT->rq_dev).
-
int cmd;
This field describes the operation to be performed; it is either READ (from the device) or WRITE (to the device).
-
unsigned long sector;
The number of the first sector to be transferred in this request.
-
unsigned long current_nr_sectors;
unsigned long nr_sectors;
The number of sectors to transfer for the current request. The driver should refer to current_nr_sectors and ignore nr_sectors (which is listed here just for completeness). See Section 12.4.2 later in this chapter for more detail on nr_sectors.
-
char *buffer;
The area in the buffer cache to which data should be written (cmd==READ) or from which data should be read (cmd==WRITE).
-
struct buffer_head *bh;
The structure describing the first buffer in the list for this request. Buffer heads are used in the management of the buffer cache; we’ll look at them in detail shortly in Section 12.4.1.1.
There are other fields in the structure, but they are primarily meant for internal use in the kernel; the driver is not expected to use them.
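To make the roles of these fields concrete, here is a small user-space sketch that mirrors a few of them in a hypothetical struct mock_request. It is not the kernel's struct request: the 8-bit minor-number split, the MOCK_READ and MOCK_WRITE values, and all helper names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_READ      0   /* hypothetical values; the kernel supplies READ and WRITE */
#define MOCK_WRITE     1
#define MOCK_MINORBITS 8   /* assumed split of a 16-bit device number */

/* Hypothetical mirror of the fields described above. */
struct mock_request {
    unsigned short rq_dev;             /* device number (major and minor) */
    int cmd;                           /* MOCK_READ or MOCK_WRITE */
    unsigned long sector;              /* first sector of the transfer */
    unsigned long current_nr_sectors;  /* sectors to move in this pass */
    char *buffer;                      /* memory side of the transfer */
};

/* Extract the minor number, which selects one device among those
 * served by the same request function. */
unsigned int mock_minor(unsigned short dev)
{
    return dev & ((1U << MOCK_MINORBITS) - 1);
}

/* Number of bytes covered by the current pass of the request. */
size_t mock_request_bytes(const struct mock_request *req, size_t hardsect)
{
    assert(req != NULL);
    return (size_t)req->current_nr_sectors * hardsect;
}

/* Direction of the copy implied by cmd, as described for buffer. */
const char *mock_direction(const struct mock_request *req)
{
    assert(req != NULL);
    return req->cmd == MOCK_READ ? "device -> buffer" : "buffer -> device";
}
```

For example, a write of two 512-byte sectors to minor 3 would yield 1024 from mock_request_bytes and the "buffer -> device" direction.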
The implementation for the working request function in the sbull device is shown here. In the following code, the Sbull_Dev structure serves the same function as Scull_Dev, introduced in Section 3.6 in Chapter 3.
    void sbull_request(request_queue_t *q)
    {
        Sbull_Dev *device;
        int status;

        while(1) {
            INIT_REQUEST;  /* returns when queue is empty */

            /* Which "device" are we using? */
            device = sbull_locate_device (CURRENT);
            if (device == NULL) {
                end_request(0);
                continue;
            }

            /* Perform the transfer and clean up. */
            spin_lock(&device->lock);
            status = sbull_transfer(device, CURRENT);
            spin_unlock(&device->lock);
            end_request(status);
        }
    }
This code looks little different from the empty version shown earlier; it concerns itself with request queue management and pushes off the real work to other functions. The first, sbull_locate_device, looks at the device number in the request and finds the right Sbull_Dev structure:
    static Sbull_Dev *sbull_locate_device(const struct request *req)
    {
        int devno;
        Sbull_Dev *device;

        /* Check if the minor number is in range */
        devno = DEVICE_NR(req->rq_dev);
        if (devno >= sbull_devs) {
            static int count = 0;
            if (count++ < 5) /* print the message at most five times */
                printk(KERN_WARNING "sbull: request for unknown device\n");
            return NULL;
        }
        device = sbull_devices + devno;  /* Pick it out of device array */
        return device;
    }
The only “strange” feature of the function is the conditional statement that limits it to reporting five errors. This is intended to avoid clobbering the system logs with too many messages, since end_request(0) already prints an “I/O error” message when the request fails. The static counter is a standard way to limit message reporting and is used several times in the kernel.
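The static-counter idiom is easy to isolate. The sketch below reproduces it in user space, with fprintf standing in for printk; the function name mock_warn_limited and the MOCK_MAX_WARNINGS constant are hypothetical, and the return value exists only so the behavior can be observed.

```c
#include <assert.h>
#include <stdio.h>

#define MOCK_MAX_WARNINGS 5   /* same limit as in sbull_locate_device */

/* Print msg at most MOCK_MAX_WARNINGS times over the life of the
 * program. Returns 1 if the message was printed, 0 if suppressed. */
int mock_warn_limited(const char *msg)
{
    static int count = 0;     /* keeps its value across calls */

    assert(msg != NULL);
    if (count++ < MOCK_MAX_WARNINGS) {
        fprintf(stderr, "%s\n", msg);
        return 1;
    }
    return 0;
}
```

Because the counter is never reset, the sixth and later calls stay silent for as long as the program runs. That is the trade-off of the idiom: it costs one integer and no locking, but it cannot report that a problem has reappeared later.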
The actual I/O of the request is handled by sbull_transfer:
    static int sbull_transfer(Sbull_Dev *device, const struct request *req)
    {
        int size;
        u8 *ptr;

        ptr = device->data + req->sector * sbull_hardsect;
        size = req->current_nr_sectors * sbull_hardsect;

        /* Make sure that the transfer fits within the device. */
        if (ptr + size > device->data + sbull_blksize*sbull_size) {
            static int count = 0;
            if (count++ < 5)
                printk(KERN_WARNING "sbull: request past end of device\n");
            return 0;
        }

        /* Looks good, do the transfer. */
        switch(req->cmd) {
            case READ:
                memcpy(req->buffer, ptr, size); /* from sbull to buffer */
                return 1;
            case WRITE:
                memcpy(ptr, req->buffer, size); /* from buffer to sbull */
                return 1;
            default:
                /* can't happen */
                return 0;
        }
    }
Since sbull is just a RAM disk, its “data transfer” reduces to a memcpy call.
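The same bounds check and copy can be exercised in user space. The following sketch assumes a hypothetical struct mock_disk and mock_transfer function that are not part of sbull. Unlike sbull_transfer, it compares byte offsets rather than pointers, a deliberate variation so that an out-of-range request never forms a pointer past the end of the array. It does not guard against overflow in the sector-to-byte multiplication.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_READ  0   /* hypothetical values; the kernel supplies READ and WRITE */
#define MOCK_WRITE 1

/* Hypothetical RAM disk: a flat byte array plus its geometry. */
struct mock_disk {
    unsigned char *data;
    size_t size;       /* total capacity in bytes */
    size_t hardsect;   /* sector size in bytes */
};

/* Returns 1 on success and 0 on failure, the same convention as the
 * argument passed to end_request. */
int mock_transfer(struct mock_disk *disk, int cmd, unsigned long sector,
                  unsigned long nr_sectors, char *buffer)
{
    size_t offset, len;

    assert(disk != NULL && buffer != NULL);
    offset = (size_t)sector * disk->hardsect;
    len = (size_t)nr_sectors * disk->hardsect;

    /* Make sure that the transfer fits within the device. */
    if (offset > disk->size || len > disk->size - offset)
        return 0;

    switch (cmd) {
    case MOCK_READ:
        memcpy(buffer, disk->data + offset, len);  /* from disk to buffer */
        return 1;
    case MOCK_WRITE:
        memcpy(disk->data + offset, buffer, len);  /* from buffer to disk */
        return 1;
    default:
        return 0;
    }
}
```

With a four-sector disk of 512-byte sectors, a one-sector write at sector 1 followed by a read of the same sector returns the written bytes, while a request starting at sector 4, or a two-sector request starting at sector 3, is rejected.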
[48] Actually, not all blocks passed to a block driver need be in the buffer cache, but that’s a topic beyond the scope of this chapter.