In addition to data typing, there are a few other software issues to keep in mind when writing a driver if you want it to be portable across Linux platforms.
A general rule is to be suspicious of explicit constant values. Usually the code has been parameterized using preprocessor macros. This section lists the most important portability problems. Whenever you encounter other values that have been parameterized, you’ll be able to find hints in the header files and in the device drivers distributed with the official kernel.
When dealing with time intervals, don’t assume that there are 100 jiffies per second. Although this is currently true for Linux-x86, not every Linux platform runs at 100 Hz (as of 2.4 you find values ranging from 20 to 1200, although 20 is only used in the IA-64 simulator). The assumption can be false even for the x86 if you play with the HZ value (as some people do), and nobody knows what will happen in future kernels. Whenever you calculate time intervals using jiffies, scale your times using HZ (the number of timer interrupts per second). For example, to check against a timeout of half a second, compare the elapsed time against HZ/2. More generally, the number of jiffies corresponding to msec milliseconds is always msec*HZ/1000. This detail had to be fixed in many network drivers when porting them to the Alpha; some of them didn’t work on that platform because they assumed HZ to be 100.
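The scaling rule can be tried outside the kernel with a small user-space sketch. The helper names below are hypothetical (they are not kernel functions), and the tick rate is passed as a parameter so that several values of HZ can be compared; in a driver you would use the HZ macro and the jiffies counter directly.

```c
#include <assert.h>

/* Hypothetical helper, not a kernel API: convert milliseconds to timer
 * ticks for a given tick rate. In a driver, hz would be the HZ macro. */
static unsigned long sketch_msec_to_ticks(unsigned long msec, unsigned long hz)
{
    assert(hz > 0);
    /* Same formula as in the text: msec*HZ/1000. Integer division
     * truncates, so very short intervals can round down to 0 ticks. */
    return msec * hz / 1000;
}

/* Hypothetical helper: have at least timeout_ticks elapsed since start?
 * Unsigned subtraction keeps the answer correct when the counter wraps. */
static int sketch_timed_out(unsigned long now, unsigned long start,
                            unsigned long timeout_ticks)
{
    return (now - start) >= timeout_ticks;
}
```

With a tick rate of 100, half a second is 50 ticks; with 1024 it is 512. A driver that hard-codes 50 would wait less than a tenth of the intended time on the second platform.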
When playing games with memory, remember that a memory page is PAGE_SIZE bytes, not 4 KB. Assuming that the page size is 4 KB and hard-coding the value is a common error among PC programmers; in fact, supported platforms show page sizes from 4 KB to 64 KB, and sometimes they differ between different implementations of the same platform. The relevant macros are PAGE_SIZE and PAGE_SHIFT. The latter contains the number of bits to shift an address to get its page number. The number currently is 12 or greater, for 4 KB and bigger pages. The macros are defined in <asm/page.h>; user-space programs can use getpagesize if they ever need the information.
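The relationship between the two macros can be illustrated with a user-space sketch. The function names are hypothetical, and the page size is a parameter rather than the PAGE_SIZE macro so that the same code can be checked against 4 KB and 64 KB pages.

```c
#include <assert.h>

/* Hypothetical helper: base-2 logarithm of a power-of-two page size,
 * which is the value PAGE_SHIFT holds for that page size. */
static unsigned int sketch_page_shift(unsigned long page_size)
{
    unsigned int shift = 0;

    /* A page size must be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    while ((1UL << shift) < page_size)
        shift++;
    return shift;
}

/* Page number: shift the address right by the page shift. */
static unsigned long sketch_page_number(unsigned long addr, unsigned int shift)
{
    return addr >> shift;
}

/* Offset within the page: mask off the page-number bits. */
static unsigned long sketch_page_offset(unsigned long addr,
                                        unsigned long page_size)
{
    return addr & (page_size - 1);
}
```

The same address falls in a different page, at a different offset, depending on the page size, which is why neither value should be computed from a hard-coded 4096 or 12.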
Let’s look at a nontrivial situation. If a driver needs 16 KB for temporary data, it shouldn’t specify an order of 2 to get_free_pages. You need a portable solution. Using an array of #ifdef conditionals may work, but it only accounts for platforms you care to list and would break on other architectures, such as one that might be supported in the future. We suggest that you use this code instead:
    int order = (14 - PAGE_SHIFT > 0) ? 14 - PAGE_SHIFT : 0;
    buf = get_free_pages(GFP_KERNEL, order);
The solution depends on the knowledge that 16 KB is 1<<14. The quotient of two numbers is the difference of their logarithms (orders), and both 14 and PAGE_SHIFT are orders. The value of order is calculated at compile time, and the implementation shown is a safe way to allocate memory for any power of two, independent of PAGE_SIZE.
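To see how the computed order changes with the page size, the expression can be wrapped in a function and evaluated for several page shifts. This is a user-space sketch: the helper name is hypothetical, and the page shift is passed as an argument instead of coming from PAGE_SHIFT.

```c
#include <assert.h>

/* Hypothetical helper, not a kernel API: allocation order needed to hold
 * 2^size_log2 bytes when pages are 2^page_shift bytes long. */
static int sketch_order_for(int size_log2, int page_shift)
{
    assert(size_log2 >= 0 && page_shift >= 0);
    /* Difference of the two logarithms, clamped at 0 so that a request
     * smaller than one page still gets a single page. */
    return (size_log2 > page_shift) ? size_log2 - page_shift : 0;
}
```

For a 16 KB buffer (size_log2 of 14), 4 KB pages give order 2, 8 KB pages give order 1, and pages of 16 KB or more give order 0.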
Be careful not to make assumptions about byte ordering. Whereas the PC stores multibyte values low-byte first (little end first, thus little-endian), most high-level platforms work the other way (big-endian). Modern processors can operate in either mode, but most of them prefer to work in big-endian mode; support for little-endian memory access has been added to interoperate with PC data and Linux usually prefers to run in the native processor mode. Whenever possible, your code should be written such that it does not care about byte ordering in the data it manipulates. However, sometimes a driver needs to build an integer number out of single bytes or do the opposite.
You’ll need to deal with endianness when you fill in network packet headers, for example, or when you are dealing with a peripheral that operates in a specific byte ordering mode. In that case, the code should include <asm/byteorder.h> and should check whether __BIG_ENDIAN or __LITTLE_ENDIAN is defined by the header.
You could code a bunch of #ifdef __LITTLE_ENDIAN conditionals, but there is a better way. The Linux kernel defines a set of macros that handle conversions between the processor’s byte ordering and that of the data you need to store or load in a specific byte order. For example:
    u32 __cpu_to_le32 (u32);
    u32 __le32_to_cpu (u32);
These two macros convert a value from whatever the CPU uses to an unsigned, little-endian, 32-bit quantity and back. They work whether your CPU is big-endian or little-endian, and, for that matter, whether it is a 32-bit processor or not. They return their argument unchanged in cases where there is no work to be done. Use of these macros makes it easy to write portable code without having to use a lot of conditional compilation constructs.
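The effect of the two conversions can be sketched in user space without any conditional compilation. The functions below are hypothetical stand-ins, not the kernel macros, and the technique (assembling the value byte by byte) is only one portable way to get the same result; the kernel's own definitions are per-architecture.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __cpu_to_le32: returns a value whose bytes
 * in memory are in little-endian order, whatever the CPU uses. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
    unsigned char b[4];
    uint32_t out;

    b[0] = (unsigned char)(v & 0xff);          /* low byte first */
    b[1] = (unsigned char)((v >> 8) & 0xff);
    b[2] = (unsigned char)((v >> 16) & 0xff);
    b[3] = (unsigned char)((v >> 24) & 0xff);  /* high byte last */
    memcpy(&out, b, sizeof out);
    return out;
}

/* Hypothetical stand-in for __le32_to_cpu: the inverse operation. */
static uint32_t sketch_le32_to_cpu(uint32_t v)
{
    unsigned char b[4];

    memcpy(b, &v, sizeof b);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Usage: the first byte in memory is the low byte, and the round trip
 * gives back the original value on any host. */
static void sketch_byteorder_check(void)
{
    uint32_t le = sketch_cpu_to_le32(0x12345678u);
    unsigned char first;

    memcpy(&first, &le, 1);
    assert(first == 0x78);
    assert(sketch_le32_to_cpu(le) == 0x12345678u);
}
```

On a little-endian host both functions return their argument unchanged, which matches the behavior the text describes for the kernel macros when there is no work to be done.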
There are dozens of similar routines; you can see the full list in <linux/byteorder/big_endian.h> and <linux/byteorder/little_endian.h>. After a while, the pattern is not hard to follow. __be64_to_cpu converts an unsigned, big-endian, 64-bit value to the internal CPU representation. __le16_to_cpus, instead, converts a little-endian, 16-bit quantity in place (the trailing s stands for "in situ"): it takes a pointer and rewrites the value it points to. When dealing with pointers, you can also use functions like __cpu_to_le32p, which take a pointer to the value to be converted rather than the value itself and return the converted value. See the include file for the rest.
Not all Linux versions defined all the macros that deal with byte ordering. In particular, the linux/byteorder directory appeared in version 2.1.72 to bring order to the various <asm/byteorder.h> files and remove duplicate definitions. If you use our sysdep.h, you’ll be able to use all of the macros available in Linux 2.4 when compiling code for 2.0 or 2.2.
The last problem worth considering when writing portable code is how to access unaligned data—for example, how to read a four-byte value stored at an address that isn’t a multiple of four bytes. PC users often access unaligned data items, but few architectures permit it. Most modern architectures generate an exception every time the program tries unaligned data transfers; data transfer is handled by the exception handler, with a great performance penalty. If you need to access unaligned data, you should use the following macros:
    #include <asm/unaligned.h>
    get_unaligned(ptr);
    put_unaligned(val, ptr);
These macros are typeless and work for every data item, whether it’s one, two, four, or eight bytes long. They are defined in every kernel version.
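Outside the kernel, a common portable way to get the same effect is to copy the bytes with memcpy instead of dereferencing a misaligned pointer. The sketch below uses hypothetical function names and is fixed to 32-bit values, unlike the typeless kernel macros; it is not meant to show how the kernel implements them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may
 * not be a multiple of four. Copying byte-wise never performs a
 * misaligned load. */
static uint32_t sketch_get_unaligned32(const void *ptr)
{
    uint32_t val;

    assert(ptr != NULL);
    memcpy(&val, ptr, sizeof val);
    return val;
}

/* Hypothetical helper: store a 32-bit value at a possibly unaligned
 * address. The argument order follows put_unaligned(val, ptr). */
static void sketch_put_unaligned32(uint32_t val, void *ptr)
{
    assert(ptr != NULL);
    memcpy(ptr, &val, sizeof val);
}
```

A typical use is pulling a field out of a packed buffer, for example reading four bytes starting at offset 1 of a byte array.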
Another issue related to alignment is portability of data structures across platforms. The same data structure (as defined in the C-language source file) can be compiled differently on different platforms. The compiler arranges structure fields to be aligned according to conventions that differ from platform to platform. At least in theory, the compiler can even reorder structure fields in order to optimize memory usage.[41]
In order to write data structures for data items that can be moved across architectures, you should always enforce natural alignment of the data items in addition to standardizing on a specific endianness. Natural alignment means storing data items at an address that is a multiple of their size (for instance, 8-byte items go in an address multiple of 8). To enforce natural alignment while preventing the compiler from moving fields around, you should use filler fields that avoid leaving holes in the data structure.
To show how alignment is enforced by the compiler, the dataalign program is distributed in the misc-progs directory of the sample code, and an equivalent kdataalign module is part of misc-modules. This is the output of the program on several platforms and the output of the module on the SPARC64:
    arch  Align:  char  short  int  long   ptr long-long  u8 u16 u32 u64
    i386            1     2     4     4     4     4        1   2   4   4
    i686            1     2     4     4     4     4        1   2   4   4
    alpha           1     2     4     8     8     8        1   2   4   8
    armv4l          1     2     4     4     4     4        1   2   4   4
    ia64            1     2     4     8     8     8        1   2   4   8
    mips            1     2     4     4     4     8        1   2   4   8
    ppc             1     2     4     4     4     8        1   2   4   8
    sparc           1     2     4     4     4     8        1   2   4   8
    sparc64         1     2     4     4     4     8        1   2   4   8

    kernel: arch  Align:  char  short  int  long   ptr long-long  u8 u16 u32 u64
    kernel: sparc64         1     2     4     8     8     8        1   2   4   8
It’s interesting to note that not all platforms align 64-bit values on 64-bit boundaries, so you’ll need filler fields to enforce alignment and ensure portability.
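As an illustration of filler fields, here is a sketch of a structure laid out so that every item sits at a multiple of its own size. The structure and its field names are hypothetical; the fixed-width types come from <stdint.h> because this is user-space code, where a driver would use u16, u32, and u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record meant to be shared across architectures. Explicit
 * filler fields occupy the space a compiler might otherwise leave (or
 * not leave) as padding, so the layout does not depend on the platform. */
struct sketch_record {
    uint32_t id;          /* offset 0 */
    uint32_t filler0;     /* offset 4: pushes stamp to a multiple of 8 */
    uint64_t stamp;       /* offset 8 on every platform */
    uint16_t flags;       /* offset 16 */
    uint16_t filler1[3];  /* offsets 18..23: rounds the size up to 24 */
};

/* Usage: verify the layout that the filler fields are meant to enforce. */
static void sketch_check_layout(void)
{
    assert(offsetof(struct sketch_record, id) == 0);
    assert(offsetof(struct sketch_record, stamp) == 8);
    assert(offsetof(struct sketch_record, flags) == 16);
    assert(sizeof(struct sketch_record) == 24);
}
```

Without filler0, a platform that aligns 64-bit values on 4-byte boundaries (such as i386 in the table above) would place stamp at offset 4, while a platform with 8-byte alignment would place it at offset 8.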
[41] Field reordering doesn’t happen in currently supported architectures because it could break interoperability with existing code, but a new architecture may define field reordering rules for structures with holes due to alignment restrictions.