Linux Device Drivers, 2nd Edition by Alessandro Rubini & Jonathan Corbet Unconfirmed error reports are from readers. They have not yet been approved or disproved by the author or editor and represent solely the opinion of the reader. This page was updated June 9, 2004. Here's a key to the markup: [page-number]: serious technical mistake {page-number}: minor technical mistake : important language/formatting problem (page-number): language change or minor formatting problem ?page-number?: reader question or request for clarification UNCONFIRMED errors and comments from readers: (xvii) PDF web (URL) address; The links on the given PDF web address www.oreilly.com/catalog/linuxdrive2/chapter/bookindexpdf.html do not work. However, the links on the web address www.oreilly.com/catalog/linuxdrive2/chapter/book/bookindexpdf.html work fine {15} in the code section; for printk to be recognized, you also need to do: #include [15] 5th paragraph; The very first code example "hello world" does not compile on my 2.4.19 Linux system following the sample instructions in the book. If you do a Google search you will see many others running into the same problem, often commenting that they got the code from this book. http://www.google.com/search?q=parse+error+before+atomic_t I have not yet figured out what the problem is. I cut-n-pasted the code directly from the book and typed in the simple "gcc -c hello.c" exactly as shown and this is the result. I'll submit another errata if I find a solution. gcc -c hello.c In file included from hello.c:4: /usr/include/linux/module.h:60: parse error before `atomic_t' /usr/include/linux/module.h:60: warning: no semicolon at end of struct or union /usr/include/linux/module.h:60: warning: no semicolon at end of struct or union /usr/include/linux/module.h:62: parse error before `}' /usr/include/linux/module.h:62: warning: data definition has no type or storage class /usr/include/linux/module.h:91: parse error before `}' make: *** [hello] Error 1 [16] Example; First my solution: compile with: gcc -c -I /usr/src/linux-2.4.18-3/include/ hello.c When the directory is not included my gcc uses files from /usr/include/ that don't work. I am running redhat 7.3 that has not been updated. It has kernel 2.4.18-3. The wrong module.h does not include so hello.c and hello1.c and hello2.c from the misc samples fail to compile. Adding gets them to compile but they don't load apparently because the wrong version.h sets the version to 2.4.9-9. I had recompiled my kernel a number of times and tried to produce a non- versioned kernel. It failed after booting because it couldn't load fs modules because it was apparently trying to load the original version. I couldn't find anything at www.linux.it/kerneldocs/kconf. I compiled a new kernel with versioning thus ending up with a working kernel and matching header files. hello.c works fine just as printed in the Book. (without any extra versioning stuff) [21] 4rt paragraph; The problem i found it's not an error in my opinion, but something that is missing to implement the examples correctly. To use the example: printk("The process is \"%s\" (pid %i) \n" , current->comm , current->pid); I had to include the header file and to the example work. Without it, the module would generate a Oops error. The kernel which i am using is the 2.2.16 (Slackware 7.1). {32} in the first if() of the init_module function; if(!item2 || !item2) should read: if(!item1 || !item2) (55) 4th paragraph; There is a reference to the device "skull3" which should read "scull3". [62] 2nd paragraph; "Nowadays, more than 256 minor numbers are needed at times," Should it here be not "minor", but "major"? {64} 2nd paragraph; Description of llseek() says that if no llseek() is specified for a device, the default will fail seeks relative to the end of the file. My reading of default_llseek() in read_write.c indicates that while such a seek *may* fail, it may not. Instead, a failure is dependant on the resulting offset, not to what the seek is relative. (In fact, in llseek_default(), this is the same for all types of seek.) (65) 6th paragraph; The type for the fsync function pointer has incorrect arguments. The book shows the first argument being of type "struct inode *" instead of "struct file *". linux/fs.h shows this type: int (*fsync) (struct file *, struct dentry *, int datasync); [70] Book: Pg 70-71 + Source Code: ldd2-samples-1.0.1/scull/main.c, lines 765-777; /* ldd2-samples-1.0.1/scull/main.c, lines 765-777 */ for(i=0; i < scull_nr_devs; i++ ) { scull_devices[i].quantum = scull_quantum; scull_devices[i].qset = scull_qset; sema_init( &scull_devices[i].sem, 1 ); #ifdef CONFIG_DEVFS_FS sprintf( devname, "%i", i ); devfs_register( scull_devfs_dir, devname, DEVFS_FL_AUTO_DEVNUM, 0, 0, S_IFCHR | S_IRUGO | S_IWUGO, &scull_fops, // <<<**** should this be scull_fop_array[i] ? scull_devices+i); #endif } QUESTION : Should the '&scull_fops' parameter above be 'scull_fop_array[i]' instead ? This is gathered from reading the scull_open() func in pgs 70-71 of the book where scull_open() does not reassign filp->f_op if devfs is used (filp->private_data is valid). [83] 6th paragraph; the scull_write code uses memset to initialize dptr->data, then, at page 88, 6th paragraph, you use sprintf for creating devnames. Can I use standard calls into my kernel modules?? You write at page 17, 4th patagraph "because no library is linked to modules, source files should never include the usual headern files" (84) Inside "struct iovec"; _ _kernel_size_t iov_len; Should read as __kernel_size_t iov_len; {90} Middle of page, 6th line of first code fragment; In demonstrating how to use both devfs registration and non-devfs registration, the line result = devfs_register_chrdev(...) should probably read result = register_chrdev(...) since it's trying to register the device _without_ using devfs. (98) End of 1st line; "__LINE_&_" should be "__LINE__" (113) 2nd paragraph under "Using klogd"; "note the decoded symbols on the EIP line and in the stack trace" should be "note the decoded symbols on the EIP line and in the call trace" (129) 1st sentence, 2nd paragraph; Sentence that reads "The prototype stands out in the list of Unix system calls because of the dots, which usually represent not a variable number of arguments." The "not" should be dropped. {136} 2nd par. description of (__)put_user; The 8 byte version of the get/put_user macros was introduced somewhere in the 2.4.x line. I just noticed, that in 2.4.1 it is missing and in 2.4.13 it is present. The definitions in sysdep.h and the text in the book do not reflect this. {145} Figure 5-1; In the upper left-hand box, wait_queue_head_t, there is no space between "struct" and "list_head". (163) 4th paragraph; Maybe the case 2 should be "newpos = dev->size - off;" instead of "newpos = dev->size + off;" [178] 3rd block up from bottom; The description of the function interruptible_sleep_on_timeout lists the return type as void. It is, in fact, long. See linux/sched.h {178} bottom of page; Old prototypes (the new versions are on p. 143) for the following functions: wake_up wake_up_interruptible wake_up_sync wake_up_interruptible_sync These are all listed as taking a (struct wait_queue **), but the new versions take a (wait_queue_head_t *). {183} #include cycles_t get_cycles(void) That function prototype is now in the file asm/timex.h. So, it should say #include static inline cycles_t get_cycles (void) [197] End of 1st parag. under "The immediate queue"; "otherwise the kernel may run the task queue before your task has been added." -doesn4t make sense! This cannot be a correct statement if what is said about The Immediate Queue is correct here and on pages 272-273. (203) 3rd paragraph; The paragraph says "... either by being atomic types (discussed in Chapter 10) or by ..." Wheras atomic data types and spinlocks are explained in Chapter 9! {212} 3rd paragraph; The last sentence in the paragraph states that : " The maximum length for the name is 20 characters, including the trailing terminator." Although the "name" field is defined in "slap.c" to be of size CACHE_NAMELEN(20). The following check is done in the kmem_cache_create() code (the code is extracted from the 2.4.18 slap.c) 632 if ((!name) || 633 ((strlen(name) >= CACHE_NAMELEN - 1)) || /* other checks */ 639 BUG(); The correction is that the maximum length for the name is 19 characters including the trailing terminator. (224) Last sentence.; "and" should be "or". (227) footnote; 1] Not all computer platform use a read and a write signal; some have... should read: 1] Not all computer platforms use a read and a write signal; some have... [230] associated example source code; The AXP version of GCC has no perm.h, consequently two files in the misc-progs subdirectory must be modified for them to compile on an Alpha-based platform. Both inp.c and outp.c require the same modification: Rearrange the ifdef's so that: #ifdef __GLIBC__ ... #endif #ifdef __i386__ becomes: #ifdef __i386__ #ifdef __GLIBC__ ... #endif (237) 4th paragraph; original: ..., you'll generate your own input to be read from the input ports. I think this should be: ..., you'll generate your own input to be read from the output ports. [238] 2nd paragraph; echo -n "any string" > /dev/short0 issued at the shell prompt only creates a file 'short0' in the '/dev' catalog with the meaningless content "any string". The qustion remains - How doe one write to the parallel port? Should a file with the mentioned while-loop be written? In what way? {250} near the top of the page; request_region() and request_mem_region() are listed as returning "void", but on page 52 they are listed as returning "struct resource *". {250} Bottom of page "readb", "readw", etc. functions are found in NOT . (257) intr line from /proc/stat; Seems that in addition to truncating and breaking the line, the counts have been truncated to the most significant 6 digits. [271] 2nd paragraph; The shared variable short_bh_count is not correctly maintained, even with uniprocessor kernels. For example, if short_do_tasklet has just cached the previous short_bh_count in savecount and short_tl_interrupt runs, the corresponding increment of short_bh_count can be lost when short_do_tasklet resets it, even though the fact of (at least one) interrupt occurring will be detected when the rescheduled bottom half runs again later. I believe this can be fixed by using a spinlock in the manner described on page 282. Although in this case short_bh_count isn't terribly important, some interrupt handlers need to know *exactly* how many times they've been invoked, so correcting the example would help avoid subtle driver problems when people copy it. I have not tried to see if there's some reason why tv_head in this particular example should also be protected. (If in fact this is *not* a problem, a footnote explaining why it can't happen would be nice.) spinlock_t short_bh_count_lock = SPIN_LOCK_UNLOCKED; void short_tl_interrupt(int irq, void *dev_id, struct pt_regs *regs) { do_gettimeofday((struct timeval *) tv_head); /* cast to stop 'volatile' warning */ short_incr_tv(&tv_head); tasklet_schedule(&short_tasklet); /* Protect interrupt count integrity if bottom-half is active */ spin_lock(&short_bh_count_lock); short_bh_count++; /* record that an interrupt arrived */ spin_unlock(&short_bh_count_lock); } /* Same for short_bh_interrupt */ void short_do_tasklet (unsigned long unused) { int savecount, written; unsigned long flags; /* Block interrupts and hold lock while we read and reset the * interrupt counter. */ spin_lock_irqsave(&short_bh_count_lock, flags); savecount = short_bh_count; short_bh_count = 0; /* we have already been removed from the queue */ spin_unlock_irqrestore(&short_bh_count_lock, flags); ... {276} 4th paragraph; I think it's not "release_irq" but "free_irq". (282) middle of page; spin_unlock_irqsave should be spin_unlock_irqrestore [286] moddle of page; A function "atomic_add_and_test" is described. This function does not exist. Or, at least, it is not present in of 2.4.18-rc1. Also, that block of atomic functions are described as returning "the previous value of the atomic data type". This could arguably be what they *should* return, however what they *actually* return is: "true if the result is 0, or false for all other cases." [286] middle; Functions atomic_add_and_test() and atomic_dec_and_test() do not exist. (as of linux-2.4.20). The description of functions atomic_inc_and_test() and atomic_dec_and_test() is wrong. As written, it states that these functions: "These functions behave like their counterparts listed earlier, but they also return the previous value of the atomic data type." This is wrong. In addition, its not immediately obvious what functions "counterparts" refers to. Since it requires more text to point to the previous functions, better would be to just repeat it. I recommend the quoted text above be changed to: "Increment or decrement an atomic variable. Return 1 if the new value is 0, otherwise return 0." {299} 2nd; In Chap. 10 Judicious Data Types --> Other Portability Issues --> Byte Order: In page 298 you intorduce a comfortable simplification: u32 __cpu_to_le32 (u32); u32 __le32_to_cpu (u32); That is not the problem. In p.299 2nd pragraph, however you introduce: __le16_to_cpus (mind the s) you imply that it as the prototype of: s16 __le16_to_cpus (s16); However include/lunux/byteorder/little_endian.h #define __le16_to_cpus(x) do {} while (0) i.e. somthing that RETURNS NO VALUE. I can't figure it out. I googled for it. I followed the code as far as I could. I palyed with it like so: ------------------------------------- #include #include #include int main( int argc, char** argv) { __s32 i= 0x12345678 ; printf( "befire: i= %x\n", i) ; i= __cpu_to_be32s( i) ; printf( "after: i= %x\n", i) ; return 0 ; } ------------------------------------ But got a only confusing results. (302) Last line of todo_add_entry(); the line list_add_tail(&new->list, &todo_struct) should be: list_add_tail(&new->list, &todo_list) {302} code of todo_add_entry; (see below for the entire code sample) The last line of the example does not make sense to me: list_add_tail(&new->list, &todo_struct) As far as I can see, todo_struct is a struct, not a variable. The address of a struct? I do not see how this code could ever compile. void todo_add_entry(struct todo_struct *new) { struct list_head *ptr; struct todo_struct *entry; for (ptr = todo_list.next; ptr != &todo_list; ptr = ptr->next) { entry = list_entry(ptr, struct todo_struct, list); if (entry->priority < new->priority) { list_add_tail(&new->list, ptr); return; } } list_add_tail(&new->list, &todo_struct) } (315) Last paragraph; "The main facility to mangle..." probably should read: "The main facility to manage..." {355,361} p355, 1st para, p361 1st para; Refererences to "close" method are inappropriate in the context of 2.4 block drivers. They should be replaced by "release". {366} 1st paragraph; The structure `struct genhd' does not exist. It is probably `struct gendisk' in . {367} last paragraph; The item "spinlock_t io_request_lock;" and its associated description are repeated on p.368 {369} 4th line; The header file does not exist. It is probably , in which `struct gendisk' and `struct gendisk *gendisk_head' is declared. [384] 3rd paragraph; At least with kernel versions 2.4.4 and 2.4.18, the mmap(2) manual page gives the prototype for the mmap system call as void * mmap (void * start, size_t length, int prot, int flags, int fd, off_t offset) not the mmap (caddr_t addr, size_t length, int prot, int flags, int fd, off_t offset) given in the book. The distinction (and the supporting detail in the man page, which is not provided in the book) is essential to understanding Chapter 13. {385} code for simple_mmap() function; if (offset >= _ _pa(high_memory) || ...) should be: if (offset >= __pa(high_memory) || ... ) {387} 2nd paragraph on "Mapping Memory with nopage"; The nopage method prototype here is struct page (*nopage)(struct vm_area_struct *vma, ...); But nopage method should return the pointer to struct page. On page 382, there is the correct prototype: struct page *(*nopage)(struct vm_area_struct *vma, ...); {396} 4th paragraph; In at least some versions (e.g. 2.4.7) kiobuf_init() is declared static in fs/iobuf.c and is unfortunately not available for general use. (396) bottom of page; For the prototype of free_kiovec, void free_kiovec(int nr, struct kiobuf **); its first parameter has name `nr', but second parameter has no name. I feel it strange a little. {410} End of first paragraph; The include file scatterlist.h is found in and not . [422] Top of page; The functions are enumerated in order kiobuf_init, alloc_kiovec, free_kiovec, but its description are: These functions handle the allocation, initialization, and freeing ... I think it is better to be in order initialization, allocation, and freeing. [445] somewhere in "packet transmission" subsection; It is worth to clarify the following rule for 'skb' ownership: - if hard_start_xmit returns 0 (success), driver is responsible for future 'skb' deallocation; - if hard_start_xmit returns non 0 (error), core network code will deallocate 'skb'. Driver in this case shall not deallocate 'skb' (450) 2nd paragraph; Chapter is misspelled as "chpater". {478} 4th paragraph; pci_find_slot is available only on 2.4.x kernels, though it is presented here at part of a "backwards compatible interface" which supports 2.2.x. {479} Continuation of code from Pg 478; code has: return(index == 0) ? -ENODEV : 0 It should be return(found == 0) ? -ENODEV : 0 {484} pci_resource_* function descriptions; An omission: unsigned long pci_resource_len(struct pci_dev *dev, int bar); This function returns the length of the I/O region number bar. {515} 7th paragraph; The procedure `oom_killer' does not exist. It should be `oom_kill', which is defined in mm/oom_kill.c.