Linux Device Drivers, Second Edition by Alessandro Rubini, Jonathan Corbet

Other Portability Issues

In addition to data typing, there are a few other software issues to keep in mind when writing a driver if you want it to be portable across Linux platforms.

A general rule is to be suspicious of explicit constant values. Usually the code has been parameterized using preprocessor macros. This section lists the most important portability problems. Whenever you encounter other values that have been parameterized, you’ll be able to find hints in the header files and in the device drivers distributed with the official kernel.

Time Intervals

When dealing with time intervals, don’t assume that there are 100 jiffies per second. Although this is currently true for Linux-x86, not every Linux platform runs at 100 Hz (as of 2.4 you find values ranging from 20 to 1200, although 20 is only used in the IA-64 simulator). The assumption can be false even for the x86 if you play with the HZ value (as some people do), and nobody knows what will happen in future kernels. Whenever you calculate time intervals using jiffies, scale your times using HZ (the number of timer interrupts per second). For example, to check against a timeout of half a second, compare the elapsed time against HZ/2. More generally, the number of jiffies corresponding to msec milliseconds is always msec*HZ/1000. This detail had to be fixed in many network drivers when porting them to the Alpha; some of them didn’t work on that platform because they assumed HZ to be 100.
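The scaling rule can be sketched in plain C. This is a user-space illustration rather than kernel code: msec_to_ticks and its hz parameter are hypothetical names, and in a driver the second argument would simply be the kernel’s HZ macro.

```c
/*
 * Hypothetical helper: number of timer ticks that make up msec
 * milliseconds when the timer interrupts hz times per second.
 * The result is rounded down, as in msec*HZ/1000.
 */
static unsigned long msec_to_ticks(unsigned long msec, unsigned long hz)
{
    /* Widen before multiplying so msec * hz cannot overflow a 32-bit long. */
    return (unsigned long)(((unsigned long long)msec * hz) / 1000ULL);
}
```

With this helper, half a second is 50 ticks at 100 Hz but 512 ticks at 1024 Hz, which is why a hard-coded 50 misbehaves on the faster clock. Because the division rounds down, an interval shorter than one tick comes out as zero; code that must wait at least one tick has to account for that.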

Page Size

When playing games with memory, remember that a memory page is PAGE_SIZE bytes, not 4 KB. Assuming that the page size is 4 KB and hard-coding the value is a common error among PC programmers—instead, supported platforms show page sizes from 4 KB to 64 KB, and sometimes they differ between different implementations of the same platform. The relevant macros are PAGE_SIZE and PAGE_SHIFT. The latter contains the number of bits to shift an address to get its page number. The number currently is 12 or greater, for 4 KB and bigger pages. The macros are defined in <asm/page.h>; user-space programs can use getpagesize if they ever need the information.

Let’s look at a nontrivial situation. If a driver needs 16 KB for temporary data, it shouldn’t specify an order of 2 to __get_free_pages, because an order of 2 yields 16 KB only where pages are 4 KB. You need a portable solution. Using a series of #ifdef conditionals may work, but it only accounts for platforms you care to list and would break on other architectures, such as one that might be supported in the future. We suggest that you use this code instead:

/* 16 KB is 1<<14 bytes; use order 0 when a single page is at least that big */
int order = (14 - PAGE_SHIFT > 0) ? 14 - PAGE_SHIFT : 0;
buf = __get_free_pages(GFP_KERNEL, order);

The solution depends on the knowledge that 16 KB is 1<<14. The logarithm of a quotient is the difference of the logarithms, and both 14 and PAGE_SHIFT are base-2 logarithms (orders), so 14 - PAGE_SHIFT is the order of the number of pages that make up 16 KB. The value of order is calculated at compile time, and the implementation shown is a safe way to allocate memory for any power of two, independent of PAGE_SIZE.
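For sizes that are not known to be a power of two, the same idea can be written as a small loop. The following is a user-space sketch under stated assumptions: order_for_bytes is a hypothetical helper, and its page_shift argument stands in for the kernel’s PAGE_SHIFT so that the arithmetic can be tried with different page sizes.

```c
/*
 * Hypothetical helper: smallest order such that (1 << order) pages of
 * (1 << page_shift) bytes each hold at least `bytes` bytes.
 * Assumes page_shift is at least 1 and smaller than the width of long.
 */
static int order_for_bytes(unsigned long bytes, int page_shift)
{
    unsigned long pages;
    int order = 0;

    if (bytes == 0)
        return 0;

    /* Round the byte count up to a whole number of pages. */
    pages = ((bytes - 1) >> page_shift) + 1;

    /* Find the first power of two that covers that many pages. */
    while ((1UL << order) < pages)
        order++;

    return order;
}
```

For 16 KB this gives order 2 with 4-KB pages (page_shift of 12), order 1 with 8-KB pages, and order 0 once a page is 16 KB or larger, matching the expression shown earlier.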

Byte Order

Be careful not to make assumptions about byte ordering. Whereas the PC stores multibyte values low-byte first (little end first, thus little-endian), most high-level platforms work the other way (big-endian). Modern processors can operate in either mode, but most of them prefer to work in big-endian mode; support for little-endian memory access has been added to interoperate with PC data, and Linux usually prefers to run in the native processor mode. Whenever possible, your code should be written such that it does not care about byte ordering in the data it manipulates. However, sometimes a driver needs to build an integer number out of single bytes or do the opposite.

You’ll need to deal with endianness when you fill in network packet headers, for example, or when you are dealing with a peripheral that operates in a specific byte ordering mode. In that case, the code should include <asm/byteorder.h> and should check whether __BIG_ENDIAN or __LITTLE_ENDIAN is defined by the header.

You could code a bunch of #ifdef __LITTLE_ENDIAN conditionals, but there is a better way. The Linux kernel defines a set of macros that handle conversions between the processor’s byte ordering and that of the data you need to store or load in a specific byte order. For example:

u32 __cpu_to_le32 (u32);
u32 __le32_to_cpu (u32);

These two macros convert a value from whatever the CPU uses to an unsigned, little-endian, 32-bit quantity and back. They work whether your CPU is big-endian or little-endian, and, for that matter, whether it is a 32-bit processor or not. They return their argument unchanged in cases where there is no work to be done. Use of these macros makes it easy to write portable code without having to use a lot of conditional compilation constructs.

There are dozens of similar routines; you can see the full list in <linux/byteorder/big_endian.h> and <linux/byteorder/little_endian.h>. After a while, the pattern is not hard to follow. __be64_to_cpu converts an unsigned, big-endian, 64-bit value to the internal CPU representation. __le16_to_cpus, instead, converts a little-endian, 16-bit quantity in place: the trailing s stands for “in situ,” and the macro takes a pointer to the value and overwrites it. When dealing with pointers, you can also use functions like __cpu_to_le32p, which take a pointer to the value to be converted rather than the value itself and return the converted value. See the include file for the rest.
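To make the effect of these conversions concrete, here is a user-space sketch that builds a little-endian 32-bit value out of single bytes and reads it back. It does not show how the kernel implements its macros; store_le32 and load_le32 are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: store v at buf as four bytes, least significant first. */
static void store_le32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xff);
    buf[1] = (unsigned char)((v >> 8) & 0xff);
    buf[2] = (unsigned char)((v >> 16) & 0xff);
    buf[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Hypothetical helper: rebuild the value from four little-endian bytes. */
static uint32_t load_le32(const unsigned char *buf)
{
    return (uint32_t)buf[0] |
           ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}
```

Because the code works with shifts on values instead of looking at how the host lays out an integer in memory, it gives the same byte sequence on little-endian and big-endian processors. That is the property the kernel macros provide, at lower cost, when the data already sits in a properly typed variable.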

Not all Linux versions defined all the macros that deal with byte ordering. In particular, the linux/byteorder directory appeared in version 2.1.72 to bring order to the various <asm/byteorder.h> files and remove duplicate definitions. If you use our sysdep.h, you’ll be able to use all of the macros available in Linux 2.4 when compiling code for 2.0 or 2.2.

Data Alignment

The last problem worth considering when writing portable code is how to access unaligned data—for example, how to read a four-byte value stored at an address that isn’t a multiple of four bytes. PC users often access unaligned data items, but few architectures permit it. Most modern architectures generate an exception every time the program tries unaligned data transfers; data transfer is handled by the exception handler, with a great performance penalty. If you need to access unaligned data, you should use the following macros:

#include <asm/unaligned.h>
get_unaligned(ptr);
put_unaligned(val, ptr);

These macros are typeless and work for every data item, whether it’s one, two, four, or eight bytes long. They are defined in every kernel version.
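Outside the kernel, a common way to get the same effect is to copy the bytes with memcpy, which makes no alignment assumptions. The sketch below is a user-space analogue, not the kernel’s implementation of get_unaligned and put_unaligned; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address of any alignment. */
static uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t val;

    memcpy(&val, ptr, sizeof(val));   /* byte copy, valid at any address */
    return val;
}

/* Hypothetical helper: write a 32-bit value to an address of any alignment. */
static void store_u32_unaligned(uint32_t val, void *ptr)
{
    memcpy(ptr, &val, sizeof(val));
}
```

A caller can, for example, store a value at offset 1 of a byte buffer and read it back without ever dereferencing a misaligned uint32_t pointer. The value keeps the host byte order, so these helpers address alignment only, not endianness.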

Another issue related to alignment is portability of data structures across platforms. The same data structure (as defined in the C-language source file) can be compiled differently on different platforms. The compiler arranges structure fields to be aligned according to conventions that differ from platform to platform. At least in theory, the compiler can even reorder structure fields in order to optimize memory usage.[41]

In order to write data structures for data items that can be moved across architectures, you should always enforce natural alignment of the data items in addition to standardizing on a specific endianness. Natural alignment means storing data items at an address that is a multiple of their size (for instance, 8-byte items go in an address multiple of 8). To enforce natural alignment while preventing the compiler from moving fields around, you should use filler fields that avoid leaving holes in the data structure.

To show how alignment is enforced by the compiler, the dataalign program is distributed in the misc-progs directory of the sample code, and an equivalent kdataalign module is part of misc-modules. This is the output of the program on several platforms and the output of the module on the SPARC64:

arch  Align:  char  short  int  long   ptr long-long  u8 u16 u32 u64
i386            1     2     4     4     4     4        1   2   4   4
i686            1     2     4     4     4     4        1   2   4   4
alpha           1     2     4     8     8     8        1   2   4   8
armv4l          1     2     4     4     4     4        1   2   4   4
ia64            1     2     4     8     8     8        1   2   4   8
mips            1     2     4     4     4     8        1   2   4   8
ppc             1     2     4     4     4     8        1   2   4   8
sparc           1     2     4     4     4     8        1   2   4   8
sparc64         1     2     4     4     4     8        1   2   4   8

kernel: arch  Align: char short int long  ptr long-long u8 u16 u32 u64
kernel: sparc64        1    2    4    8    8     8       1   2   4   8

It’s interesting to note that not all platforms align 64-bit values on 64-bit boundaries, so you’ll need filler fields to enforce alignment and ensure portability.
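As an illustration of filler fields, the following sketch lays out a record so that every member sits at a multiple of its own size and no room is left for the compiler to insert padding. The structure and its field names are hypothetical, and the user-space types from <stdint.h> stand in for the kernel’s u8, u32, and u64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; each field is naturally aligned by construction. */
struct sample_record {
    uint8_t  type;        /* offset 0 */
    uint8_t  filler[3];   /* offsets 1-3: explicit padding, no hidden hole */
    uint32_t length;      /* offset 4, a multiple of 4 */
    uint64_t timestamp;   /* offset 8, a multiple of 8 */
};
```

With this layout, offsetof(struct sample_record, timestamp) is 8 and the structure is 16 bytes long both on platforms that align 64-bit values to 8 bytes and on those, like the i386 in the table, that align them to 4. Checking such offsets with offsetof in a test program is a cheap way to confirm the layout on each platform you care about.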



[41] Field reordering doesn’t happen in currently supported architectures because it could break interoperability with existing code, but a new architecture may define field reordering rules for structures with holes due to alignment restrictions.
