Chapter 4. Development Tools

Much like mainstream software developers, embedded system developers need compilers, linkers, interpreters, integrated development environments, and other such development tools. The embedded developer’s tools are different, however, in that they typically run on one platform while building applications for another. This is why these tools are often called cross-platform development tools, or cross-development tools, for short.

This chapter discusses the setup, configuration, and use of cross-platform development tools. First, I will discuss how to use a practical project workspace. I will then discuss the GNU cross-platform development toolchain, the C library alternatives, Java, Perl, Python, Ada, other programming languages, integrated development environments, and terminal emulation programs.

Using a Practical Project Workspace

In the course of developing and customizing software for your target, you will need to organize various software packages and project components in a comprehensive and easy-to-use directory structure. Table 4-1 shows a suggested directory layout you may find useful. Feel free to modify this structure to fit your needs and requirements. When deciding where to place components, always try to find the most intuitive layout. Also, try to keep your own code in a directory completely separate from all the packages you will download from the Net. This will minimize any confusion regarding the source’s ownership and licensing status.

Table 4-1. Suggested project directory layout

Directory     Content
bootldr       The bootloader or bootloaders for your target
build-tools   The packages and directories needed to build the cross-platform
              development toolchain
debug         The debugging tools and all related packages
doc           All the documentation you will need for your project
images        The binary images of the bootloader, the kernel, and the root
              filesystem ready to be used on the target
kernel        The different kernel versions you are evaluating for your target
project       Your own source code for this project
rootfs        The root filesystem as seen by the target’s kernel at runtime
sysapps       The system applications required for your target
tmp           A temporary directory to experiment and store transient files
tools         The complete cross-platform development toolchain and C library

Of course, each of these directories contains many subdirectories. We will populate these directories as we continue through the rest of the book.

The location of your project workspace is up to you, but I strongly encourage you not to use a system-wide entry such as /usr or /usr/local. Instead, use an entry in your home directory or a directory within the /home directory shared by all the members of your group. If you really want to have a system-wide entry, you may want to consider using an entry in the /opt directory. For the example embedded control system, I have the following layout in my home directory:

$ ls -l ~/control-project
total 4
drwxr-xr-x   13 karim    karim        1024 Mar 28 22:38 control-module
drwxr-xr-x   13 karim    karim        1024 Mar 28 22:38 daq-module
drwxr-xr-x   13 karim    karim        1024 Mar 28 22:38 sysmgnt-module
drwxr-xr-x   13 karim    karim        1024 Mar 28 22:38 user-interface

Because the control system components run on different targets, each has a separate entry in the control-project directory in my home directory. Each entry has its own project workspace, as described above. Here, for example, is the daq-module workspace:

$ ls -l ~/control-project/daq-module
total 11
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 bootldr
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 build-tools
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 debug
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 doc
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 images
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 kernel
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 project
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 rootfs
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 sysapps
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 tmp
drwxr-xr-x    2 karim    karim        1024 Mar 28 22:38 tools
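
If you are setting up such a workspace from scratch, the entire skeleton can be created with a few commands. Here is a minimal sketch for the daq-module component; adjust the paths to fit your own project:

$ mkdir -p ~/control-project/daq-module
$ cd ~/control-project/daq-module
$ mkdir bootldr build-tools debug doc images kernel project rootfs sysapps tmp tools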

Because you may need to provide the paths of these directories to some of the utilities you will build and use, you may find it useful to create a short script that sets appropriate environment variables. Here is such a script called develdaq for the DAQ module:

export PROJECT=daq-module
export PRJROOT=/home/karim/control-project/${PROJECT}
cd $PRJROOT

In addition to setting environment variables, this script moves you to the directory containing the project. You can remove the cd command if you would prefer not to be moved to the project directory right away. To execute this script in the current shell so that the environment variables are immediately visible, type:[1]

$ . develdaq

Future explanations will rely on the existence of the PROJECT and PRJROOT environment variables.

Warning

Since the distribution on your workstation has already installed many of the same packages you will be building for your target, it is very important to clearly separate the two types of software. To ensure such separation, I strongly encourage you not to carry out any of the instructions provided in the rest of this book while being logged in as root, unless I provide explicit instructions otherwise. Among other things, this will avoid any possible destruction of the native GNU toolchain installed on your system and, most importantly, the C library most of your applications rely on. Therefore, instead of logging in as root, log in using a normal user account with no particular privileges.

GNU Cross-Platform Development Toolchain

The toolchain we need to put together to cross-develop applications for any target includes the binary utilities, such as ld, gas, and ar, the C compiler, gcc, and the C library, glibc. The rest of the discussion in the later chapters relies on the cross-platform development toolchain we will put together here.

You can download the components of the GNU toolchain from the FSF’s FTP site at ftp://ftp.gnu.org/gnu/ or any of its mirrors. The binutils package is in the binutils directory, the gcc package is in the gcc directory, and the glibc package is in the glibc directory along with glibc-linuxthreads. If you are using a glibc version older than 2.2, you will also need to download the glibc-crypt package, also from the glibc directory. This part of the library used to be distributed separately, because U.S. cryptography export laws made it illegal to download this package to a computer outside the U.S. from the FSF’s site, or any other U.S. site, for that matter. Since Version 2.2, however, glibc-crypt has been integrated as part of the main glibc package and there is no need to download this package separately anymore.[2] Following the project directory layout suggested earlier, download the packages into the ${PRJROOT}/build-tools directory.
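
For instance, assuming your host has the wget utility and the mirror follows the conventional directory layout, the package versions used in the examples below could be fetched as follows:

$ cd ${PRJROOT}/build-tools
$ wget ftp://ftp.gnu.org/gnu/binutils/binutils-2.10.1.tar.gz
$ wget ftp://ftp.gnu.org/gnu/gcc/gcc-2.95.3.tar.gz
$ wget ftp://ftp.gnu.org/gnu/glibc/glibc-2.2.3.tar.gz
$ wget ftp://ftp.gnu.org/gnu/glibc/glibc-linuxthreads-2.2.3.tar.gz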

Note that all the targets discussed in Chapter 3 are supported by the GNU toolchain.

GNU Toolchain Basics

Configuring and building an appropriate GNU toolchain is a complex and delicate operation that requires a good understanding of the dependencies between the different software packages and their respective roles. This knowledge is required, because the GNU toolchain components are developed and released independently from one another.

Component versions

The first step in building the toolchain is selecting the component versions we will use. This involves selecting a binutils version, a gcc version, and a glibc version. Because these packages are maintained and released independently from one another, not all versions of one package will build properly when combined with different versions of the other packages. You can try using the latest versions of each package, but this combination is not guaranteed to work either.

To select the appropriate versions, you have to test a combination tailored to your host and target. Of course, you may find it easier to ask around and see whether someone somewhere tested a certain combination of versions for that setup and reports that her combination works correctly. You may also have to try such combinations for your setup on your own if you do not find a known functional version combination. In that case, start with the most recent stable versions of each package and replace them one by one with older ones if they fail to build.

Tip

In some cases, the version with the highest version number may not have had the time to be tested enough to be considered “stable.” At the time glibc 2.3 was released, for example, it may have been a better choice to keep using glibc 2.2.5 until 2.3.1 became available.

At the time of this writing, for instance, the latest version of binutils is 2.13.2.1, the latest version of gcc is 3.2.1, and the latest version of glibc is 2.3.1. Most often, binutils will build successfully and you will not need to change it. Hence, let us assume that gcc 3.2.1 fails to build although all the appropriate configuration flags have been provided. In that case, I would revert to gcc 3.2. If that failed, I would try 3.1.1 and so on. It is the same thing with glibc. Version 2.3.1 of glibc may fail to build. In that case, I would revert to 2.3 and later to 2.2.5, if necessary.

You must understand, however, that you cannot go back like this indefinitely, because the most recent package versions expect the other packages to provide certain capabilities. You may, therefore, have to go back to older versions of packages that you successfully built if the other packages down the line fail to build. Using the above versions, for example, if I had to go back to glibc 2.1.3, it might be appropriate to change back to gcc 2.95.3 and binutils 2.10 although the most recent gcc and most recent binutils may have compiled perfectly.

You may also need to apply patches to some versions to get them to compile properly for your target. The web sites and mailing lists provided for each processor architecture in Chapter 3 are the best places to find such patches and package version suggestions. Another place to look for patches is in the Debian source packages. Each package contains the patches required for all the architectures supported by that package.

Table 4-2 provides a list of known functional version combinations. For each host/target combination, known compatible versions are provided for binutils, gcc, and glibc. The last column indicates whether the tools require patching.

Table 4-2. Known functional package version combinations

Host             Target   Kernel       binutils   gcc          glibc   Patches
i386             PPC                   2.10.1     2.95.3       2.2.1   No
PPC              i386                  2.10.1     2.95.3       2.2.3   No
PPC              i386                  2.13.2.1   3.2.1        2.3.1   No
i386             ARM      2.4.1-rmk1   2.10.1     2.95.3       2.1.3   Yes[3]
PPC              ARM                   2.10.1     2.95.3       2.2.3   Yes[3]
i386             MIPS                  2.8.1      egcs-1.1.2   2.0.6   Yes[5]
i386             SuperH                2.11.2     3.0.1        2.2.4   Yes[6]
Sparc (Solaris)  PPC      2.4.0        2.10.1     2.95.2       2.1.3   No

[3] See “The -Dinhibit_libc hack” subsection in the “Building the Toolchain” section of “The GNU toolchain” chapter in AlephOne’s “Guide to ARMLinux for Developers” (http://www.aleph1.co.uk/armlinux/book/book1.html) for further information on the modifications to be made to gcc to make it build successfully.

[5] See Ralf Bächle’s MIPS HOWTO (http://howto.linux-mips.org/) for further information on the patches to apply.

[6] See Bill Gatliff’s “Running Linux on the Sega Dreamcast” (http://www.linuxdevices.com/articles/AT7466555948.html) for further information on the patches to apply.

Some of the combinations presented here were found on the Net as part of existing cross-platform development toolchain setups. I have kept the kernel version when the original explanation provided one. The kernel version, however, does not really matter for the build of the toolchain. Any recent kernel—Version 2.2.x or 2.4.x—known to work for your target can be used for the toolchain build. I strongly recommend using the actual kernel you will be using in your target, however, to avoid any future conflicts. I will discuss kernel selection in Chapter 5.

Although it is not specifically mentioned in the table, there is one glibc add-on that we will need for the toolchain: glibc-linuxthreads. The package’s versions closely follow glibc’s numbering scheme. Hence, the linuxthreads version matching glibc 2.2.3 is linuxthreads Version 2.2.3. Although I recommend getting the linuxthreads package, you should be able to build glibc without it. Note, however, that glibc 2.1.x does not build properly without linuxthreads. If you are using glibc 2.1.x, remember that you will also need to download the glibc-crypt add-on if you intend to use DES encryption.

By no means is Table 4-2 complete. There are many other combinations that will work just as well. Feel free to try newer versions than the ones presented. Use the same technique discussed earlier by starting with the latest versions and decrementing versions as needed. At worst, you will have to revert to setups presented above.

Whenever you discover a new version combination that compiles successfully, make sure you test the resulting toolchain to ensure that it is indeed functional. Some version combinations may compile successfully and still fail when used. Version 2.2.3 of glibc, for example, builds successfully for a PPC target on an x86 host using gcc 2.95.3. The resulting library is, nevertheless, broken and will cause a core dump when used on the target. In that particular setup, we can obtain a functional C library by reverting to glibc 2.2.1.

There are also cases where a version combination was found to work properly on certain processors within a processor family while failing to work on other processors of the same family. Versions of glibc earlier than 2.2, for example, worked fine for most PPC processors except those that were part of the MPC8xx series. The problem was that glibc assumed 32-byte cache lines for all PPC processors, while the processors in the MPC8xx series have 16-byte cache lines. Version 2.2 fixed this problem by assuming 16-byte cache lines for all PPC processors.

The following sections describe the building of the GNU toolchain for a PPC host and an i386 target using binutils 2.10.1, gcc 2.95.3, and glibc 2.2.3. This was the second entry in Table 4-2.

Build requirements

To build a cross-platform development toolchain, you will need a functional native toolchain. Most mainstream distributions provide this toolchain as part of their packages. If it was not installed on your workstation or if you chose not to install it to save space, you will need to install it at this point using the procedure appropriate to your distribution. With a Red Hat distribution, for instance, you will need to install the appropriate RPM packages.

You will also need a valid set of kernel headers for your host. These headers must usually be located in the /usr/include/linux, /usr/include/asm, and /usr/include/asm-generic directories, and should be the headers used to compile the native glibc installed on your system. In older distributions, and in some installations still, these directories are actually symbolic links to directories within the /usr/src/linux directory. In turn, this directory is itself a symbolic link to the actual kernel installed by your distribution. If your distribution uses the older setup, and you have updated your kernel or modified the content of the /usr/src/linux directory, you will need to make sure the /usr/src/linux symbolic link is set appropriately so that the symbolic links in /usr/include point to the kernel that was used to build your native glibc, and that was installed by your distribution. In recent distributions, however, the content of /usr/include/linux, /usr/include/asm, and /usr/include/asm-generic is independent of the content of /usr/src/linux, and no kernel update should result in such problems.
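
You can quickly determine which setup your distribution uses by inspecting these directories; if they show up as symbolic links, yours is the older arrangement:

$ ls -ld /usr/include/linux /usr/include/asm /usr/include/asm-generic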

Build overview

With the appropriate tools in place, let us take a look at the procedure used to build the toolchain. These are the five main steps:

  1. Kernel headers setup

  2. Binary utilities setup

  3. Bootstrap compiler setup

  4. C library setup

  5. Full compiler setup

The first thing that you probably noticed, looking at these steps, is that the compiler seems to be built twice. This is normal and required, because some languages supported by gcc, such as C++, require glibc support. Hence, a first compiler is built with support for C only, and a full compiler is built once the C library is available.

Although I placed the kernel headers setup as the first step, the headers will not be used until the C library setup. Hence, you could alter the steps and set up the kernel headers right before the C library setup. Given the workspace directory layout we are using, however, you will find the original ordering of the steps given above to be more appropriate.

Obviously, each step involves many iterations of its own. Nonetheless, the steps remain similar in many ways. Most toolchain build steps involve carrying out the following actions:

  1. Unpack the package.

  2. Configure the package for cross-platform development.

  3. Build the package.

  4. Install the package.

Some toolchain builds differ slightly from this sequence. The kernel headers setup, for instance, does not require that we build the kernel or install it. Actually, we will save much of the discussion about configuring, building, and installing the kernel for Chapter 5. Also, since the compiler will have already been unpacked for the bootstrap compiler’s setup, the full compiler setup does not require unpacking the gcc package again.

Workspace setup

According to the workspace directory layout I suggested earlier, the toolchain will be built in the ${PRJROOT}/build-tools directory, while the components built will be installed in the ${PRJROOT}/tools directory. To this end, we need to define some additional environment variables. They ease the build process and are based on the environment variables already defined. Using the same example project as before, here is the new develdaq script with the new variables:

export PROJECT=daq-module
export PRJROOT=/home/karim/control-project/${PROJECT}
export TARGET=i386-linux
export PREFIX=${PRJROOT}/tools
export TARGET_PREFIX=${PREFIX}/${TARGET}
export PATH=${PREFIX}/bin:${PATH}
cd $PRJROOT

The TARGET variable defines the type of target for which your toolchain will be built. Table 4-3 provides some of the other possible values for TARGET. Notice that the target definition does not depend on the type of host. A target is defined by its own hardware and the operating system used on it, which is Linux in this case. Also, note that only TARGET needs to be modified in case we change targets. Of course, if we had already compiled the complete GNU toolchain for a different target, we would need to rebuild it after changing the value of TARGET. For a more complete list of TARGET values, look at the manual included in the glibc sources.

Table 4-3. Example values for TARGET

Actual target          Value of TARGET
PowerPC                powerpc-linux
ARM                    arm-linux
MIPS (big endian)      mips-linux
MIPS (little endian)   mipsel-linux
SuperH 4               sh4-linux

The PREFIX variable provides the component configuration scripts with a pointer to the directory where we would like the target utilities to be installed. Conversely, TARGET_PREFIX is used for the installation of target-dependent header files and libraries. To have access to the newly installed utilities, we also need to modify the PATH variable to point to the directory where the binaries will be installed.

Some people prefer to set PREFIX to /usr/local. This results in the tools and libraries being installed within the /usr/local directory, where they can be accessed by any user. I find this approach unsuitable for most situations, however, because even projects using the same target architecture may require different toolchain configurations.

If you need to set up a toolchain for an entire development team, instead of sharing tools and libraries via the /usr/local directory, I suggest that a developer build the toolchain within an entry shared by all project members in the /home directory, as I said earlier. In a case in which no entry in the /home directory is shared among group members, a developer may build the toolchain within an entry in her workstation’s /opt directory and then share her resulting ${PRJROOT}/tools directory with her colleagues. This may be done using any of the traditional sharing mechanisms available, such as NFS, or using a tar-gzipped archive available on an FTP server. Each developer using the package will have to place it in a filesystem hierarchy identical to the one used to build the toolchain for the tools to operate adequately. In a case in which the toolchain was built within the /opt directory, this means placing the toolchain in the /opt directory.
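
As a sketch of the archive approach, assuming the toolchain was built under a hypothetical /opt/control-project/daq-module entry, the developer could package the tools directory as follows:

$ cd /opt
$ tar czvf daq-tools.tar.gz control-project/daq-module/tools

Each colleague would then extract the archive from /opt on her own workstation, recreating the identical hierarchy:

$ cd /opt
$ tar xzvf daq-tools.tar.gz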

If you choose to set PREFIX to /usr/local, you will also have to issue the commands shown below while logged in as root, with all the risks this entails. You could also set the permission bits of the /usr/local directory to allow yourself or your user group to issue the commands without requiring root privileges.

Notice that TARGET_PREFIX is set to ${PREFIX}/${TARGET}, which is a target-dependent directory. Hence, even if you set PREFIX to /usr/local, successive installations of development toolchains for different targets will place each target’s libraries and header files in a directory of its own, separate from those of previous toolchain installations.

Regardless of the value you give to PREFIX, setting TARGET_PREFIX to ${PREFIX}/${TARGET} is the configuration the GNU toolchain utilities expect to find during their configuration and installation. Hence, I strongly suggest that you use this value for TARGET_PREFIX. The following explanations may require changes if you modify TARGET_PREFIX’s value.

Again, you can remove the cd command from the script if you would prefer not to move directly to the project directory.

Preparing the build-tools directory

At this point, you should have the different packages required for building the toolchain in the build-tools directory. As with other packages, a new directory will be created when you extract the files from the package archive. This new directory will contain the complete source code necessary to build the packages and all appropriate Makefiles. Although it is possible to build the package within this source directory, I highly recommend that you build each package in a directory separate from its source directory, as is suggested in the FSF’s installation manuals.

Building a package in a directory different from the one holding its source may seem awkward if you are used to simply typing configure; make; make install, but I will shortly explain how this is done. First, though, we need to create the directories that will hold the packages being built. Create one directory for each toolchain component. Four directories are therefore needed: one for the binutils, one for the bootstrap C compiler, one for the C library, and one for the complete compiler. We can use the following commands to create the necessary entries:

$ cd ${PRJROOT}/build-tools
$ mkdir build-binutils build-boot-gcc build-glibc build-gcc

We can now look at the content of the build-tools directory with the packages and the build directories (the last line in this example is truncated to fit the page):

$ ls -l
total 35151
-rw-r--r--    1 karim    karim     7284401 Apr  4 17:33 binutils-2.10.1.tar.gz
drwxrwxr-x    2 karim    karim        1024 Apr  4 17:33 build-binutils
drwxrwxr-x    2 karim    karim        1024 Apr  4 17:33 build-boot-gcc
drwxrwxr-x    2 karim    karim        1024 Apr  4 17:33 build-gcc
drwxrwxr-x    2 karim    karim        1024 Apr  4 17:33 build-glibc
-rw-r--r--    1 karim    karim    12911721 Apr  4 17:33 gcc-2.95.3.tar.gz
-rw-r--r--    1 karim    karim    15431745 Apr  4 17:33 glibc-2.2.3.tar.gz
-rw-r--r--    1 karim    karim      215313 Apr  4 17:33 glibc-linuxthreads-2.2.3.t

Everything is now almost ready for building the actual toolchain.

Resources

Before proceeding to the actual building of the toolchain, let us look at some resources you may find useful in case you run into problems during the build process.

First and foremost, each package comes with its own documentation. Although you will find the binutils package to be the leanest in terms of installation documentation, it is also the least likely to cause any problems. The gcc and glibc packages, however, are amply documented. Within the gcc package, you will find an FAQ file and an install directory containing instructions on how to configure and install gcc. This includes an extensive explanation of the build configuration options. Similarly, the glibc package contains an FAQ and INSTALL files. The INSTALL file covers the build configuration options and the installation process, and provides recommendations for compilation tool versions.

In addition, you may want to try using a general search engine such as Google to look for reports by other developers who may have already encountered and solved problems similar to yours. Often, using a general search engine will be the most effective way to solve a GNU toolchain build problem.

On the matter of cross-compiling, there are two CrossGCC FAQs available: the Scott Howard FAQ and the Bill Gatliff FAQ. The Scott Howard CrossGCC FAQ is available at http://www.sthoward.com/CrossGCC/. This FAQ is rather outdated, however. The Bill Gatliff CrossGCC FAQ is available at http://crossgcc.billgatliff.com/.

Though the Scott Howard FAQ is outdated, and though it isn’t limited to Linux and attempts to provide general explanations for all the platforms the GNU toolchain can be built for, it does provide pieces of information that can be hard to find otherwise. It covers, for instance, what is known as Canadian Crosses,[7] a technique for building cross-platform development tools for development on another platform. An example of this would be building cross-platform development tools for an ARM target and an i386 host on a PPC workstation.

As with the Scott Howard FAQ, the Bill Gatliff FAQ is not limited to Linux. In addition to the FAQ, Bill Gatliff actively maintains a CrossGCC Wiki site, which provides information on a variety of cross-platform development issues, including tutorials, links to relevant articles, and explanations about GNU toolchain internals. Since this is a Wiki site, you can register to modify and contribute to the site yourself. The Wiki site is accessed through the same URL as the Bill Gatliff FAQ.

Both FAQs provide scripts to automate the building of the toolchain. Similar scripts are also available from many other sites. You may be interested in taking a look at these scripts, but I will not rely on any scripts for my future explanations as I would rather you fully understand all the steps involved.

Finally, there is a crossgcc mailing list hosted by Red Hat at http://sources.redhat.com/ml/crossgcc/. You will find this mailing list quite useful if you ever get stuck, because many on this list have a great deal of experience with the process of building cross-platform development toolchains. Often, just searching or browsing the archive will help you locate immediate answers to your questions.

A word on prebuilt cross-platform toolchains

A lot of prebuilt cross-platform toolchains are available either online or commercially. Since I do not know the actual process by which each was built, I cannot offer any advice regarding those packages. You may still choose to use such packages out of convenience instead of carrying out the procedure explained here. In that case, make sure you have documentation as to how these packages were configured and built. Most importantly, make sure you know what package versions were used, what patches were applied, if any, and where to get the patches that were applied in case you need them.

Kernel Headers Setup

As I said earlier, the setup of the kernel headers is the first step in building the toolchain. In this case, we are using kernel Version 2.4.18, but we could have used any other version appropriate for our target. We will discuss kernel selection further in Chapter 5.

Having selected a kernel, the first thing you need to do is download a copy of that kernel into the directory in which you have chosen to store kernels. In the case of the workspace hierarchy I suggested earlier, this would be in ${PRJROOT}/kernel. You can obtain all the Linux kernels from the main kernel repository at http://www.kernel.org/ or any other mirror site, such as the national mirrors.[8] There are other sites that provide kernels more adapted to certain targets, and I will cover these in Chapter 5.

For some time now, each version of the kernel has been available both as a tar-gzipped file (with the .tar.gz extension) and as a tar-bzip2'd file (with the .tar.bz2 extension). Both contain the same kernel, except that tar-bzip2'd files are smaller and require less download time than tar-gzipped files.

With the kernel now in your kernel directory, you can extract it using the appropriate command. In our case, we use one of the following commands, depending on the file we downloaded:

$ tar xvzf linux-2.4.18.tar.gz

or:

$ tar xvjf linux-2.4.18.tar.bz2

Some older versions of tar do not support the j option and you may need to use bzip2 -d or bunzip2 to decompress the archive before using tar.
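
In that case, the following two-step equivalent will work:

$ bunzip2 linux-2.4.18.tar.bz2
$ tar xvf linux-2.4.18.tar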

For all kernels up to 2.4.18, the tar command creates a directory called linux that contains the extracted files from the archive. Starting with 2.4.19, however, the kernel extracts immediately into a directory that has the version number appended to its name. Hence, Linux 2.4.19 extracts directly into the linux-2.4.19 directory. This avoids accidentally overwriting an older kernel with a new one. If you are using a kernel that is older than 2.4.19, I recommend that you rename the directory right away to avoid any accidental overwriting:

$ mv linux linux-2.4.18

Overwriting a kernel version with another because the directory of the previous version wasn’t renamed is a common and often costly mistake, so it is really important that you rename the directory as soon as you extract the kernel from the archive, if need be.

With the kernel now extracted, we proceed to configuring it:

$ cd linux-2.4.18
$ make ARCH=i386 CROSS_COMPILE=i386-linux- menuconfig

This will display a menu in your console where you will be able to select your kernel’s configuration. Instead of menuconfig, you can specify config or xconfig. The former requires that you provide an answer for every possible configuration option one by one at the command line. The latter provides an X Window dialog, which is often considered the most intuitive way to configure the kernel. Beware of xconfig, however, as it may fail to set some configuration options and forget to generate some headers required by the procedure I am describing. The use of config may also result in some headers not being created. You can check whether the kernel configuration has successfully created the appropriate headers by verifying whether the include/linux/version.h file exists in the kernel sources after you finish the configuration process. If it is absent, the instructions outlined below will fail at the first instance where kernel headers are required; usually during the compilation of glibc.
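
A quick way to carry out this verification from the top of the kernel source tree once you have finished configuring:

$ ls -l include/linux/version.h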

As you probably noticed, the values of ARCH and CROSS_COMPILE depend on your target’s architecture type. Had this been a PPC target and an i386 host, we would have used ARCH=ppc and CROSS_COMPILE=powerpc-linux-. (The trailing hyphen in the CROSS_COMPILE=powerpc-linux- variable is not an accident.) Strictly speaking, it is not necessary to set CROSS_COMPILE for all kernel make targets. The various configuration targets I just covered don’t usually need it, for example. In fact, it is only necessary when code is actually being cross-compiled as a result of the kernel Makefile rules. Nevertheless, I will continue to specify it for all kernel make targets throughout this book, even when it isn’t essential, to highlight its importance. You are free to set it only when needed in your actual day-to-day work.
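
For the PPC-target case just described, the configuration command would therefore read:

$ make ARCH=ppc CROSS_COMPILE=powerpc-linux- menuconfig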

I will cover the intricacies of kernel configuration in Chapter 5. If you are not familiar with kernel configuration, you may want to have a peek at that chapter first. The most important configuration options we need to set at this time are the processor and system type. Although it is preferable to fully configure the kernel before proceeding, just setting the processor and system type is usually enough to generate the appropriate headers for the toolchain build.

With the kernel now configured, exit the menu by selecting the Exit item with your right arrow. The configuration utility then asks you if you want to save the configuration and, upon confirmation, proceeds to write the kernel configuration and creates the appropriate files and links.

We can now create the include directory required for the toolchain and copy the kernel headers to it:

$ mkdir -p ${TARGET_PREFIX}/include
$ cp -r include/linux/ ${TARGET_PREFIX}/include
$ cp -r include/asm-i386/ ${TARGET_PREFIX}/include/asm
$ cp -r include/asm-generic/ ${TARGET_PREFIX}/include

Keep in mind that we are using a PPC host and an i386 target. Hence, the asm-i386 directory in the path above is the directory containing the target-specific headers, not the host-specific ones. If this were a PPC target, for example, we would have to replace asm-i386 with asm-ppc.

Note that you will not need to rebuild the toolchain every time you reconfigure the kernel. The toolchain needs one valid set of headers for your target, which is provided by the procedure given earlier. You may later choose to reconfigure your kernel or use another one entirely without impacting your toolchain, unless you change the processor or system type.

Binutils Setup

The binutils package includes the utilities most often used to manipulate binary object files. The two most important utilities within the package are the GNU assembler, as, and the linker, ld. Table 4-4 contains the complete list of utilities found in the binutils package.

Table 4-4. Utilities found in the binutils package

Utility     Use
as          The GNU assembler
ld          The GNU linker
gasp        The GNU assembler pre-processor
ar          Creates and manipulates archive content
nm          Lists the symbols in an object file
objcopy     Copies and translates object files
objdump     Displays information about the content of object files
ranlib      Generates an index to the content of an archive
readelf     Displays information about an ELF format object file
size        Lists the sizes of sections within an object file
strings     Prints the strings of printable characters in object files
strip       Strips symbols from object files
c++filt     Converts low-level mangled assembly labels resulting from
            overloaded C++ functions into their user-level names
addr2line   Converts addresses into line numbers within original source files

Note that although as supports many processor architectures, it does not necessarily recognize the same syntax as the other assemblers available for a given architecture. The syntax recognized by as is actually a machine-independent syntax inspired by BSD 4.2 assembly language.

The first step in setting up the binutils package is to extract its source code from the archive we downloaded earlier:

$ cd ${PRJROOT}/build-tools
$ tar xvzf binutils-2.10.1.tar.gz

This will create a directory called binutils-2.10.1 with the package’s content. We can now move to the build directory for the second part of the build process, the configuration of the package for cross-platform development:

$ cd build-binutils
$ ../binutils-2.10.1/configure --target=$TARGET --prefix=${PREFIX}
Configuring for a powerpc-unknown-linux-gnu host.
Created "Makefile" in /home/karim/control-project/daq-module/build-...
Configuring intl...
creating cache ../config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking how to run the C preprocessor... gcc -E
checking whether make sets ${MAKE}... yes
checking for gcc... gcc
checking whether the C compiler (gcc -g -O2 -W -Wall ) works... yes
checking whether the C compiler (gcc -g -O2 -W -Wall ) is a cross-c...
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking for ranlib... ranlib
checking for POSIXized ISC... no
checking for ANSI C header files... yes
          ... 

What I’ve shown is only part of the output from the configure script. It will actually continue printing similar messages on the console until it has prepared each utility in the package for compilation. This may take a minute or two to complete, but it is a relatively short operation.

During its run, configure checks for the availability of certain resources on the host and creates appropriate Makefiles for each tool in the package. Since the command is not being issued in the directory containing the binutils source code, the result of the configure command will be found in the directory where it was issued, the build-binutils directory.

We control the creation of the Makefiles by passing the appropriate options to configure. The --target option enables us to specify the target for which the binutils are being built. Since we had already specified the name of the target in the TARGET environment variable, we provide this variable as is. The --prefix option enables us to provide the configuration script with the directory within which it should install files and directories. The directory for --prefix is the same as the one we specified earlier in the PREFIX environment variable.

With the Makefiles now ready, we can build the actual utilities:

$ make

The actual build of the binutils may take anywhere between 10 and 30 minutes, depending on your hardware. Using a 400 MHz PowerBook host, it takes at most 15 minutes to build the binutils for the i386 target used here. You may see some warnings during the build but they can be ignored, unless you’re one of the binutils developers.

With the package now built, we can install the binutils:

$ make install

The binutils have now been installed inside the directory pointed to by PREFIX. You can check to see that they have been installed properly by listing the appropriate directory:

$ ls ${PREFIX}/bin
i386-linux-addr2line  i386-linux-ld       i386-linux-readelf
i386-linux-ar         i386-linux-nm       i386-linux-size
i386-linux-as         i386-linux-objcopy  i386-linux-strings
i386-linux-c++filt    i386-linux-objdump  i386-linux-strip
i386-linux-gasp       i386-linux-ranlib

Notice that the name of each utility is prefixed with the value of TARGET we set earlier. Had the target been powerpc-linux, for instance, the names of the utilities would have been prefixed with powerpc-linux-. When building an application for a target, we can therefore invoke the appropriate tools by prefixing the tool names with the target type.

A copy of some of the utilities without the prepended target name will also be installed in the ${PREFIX}/${TARGET}/bin directory. Since this directory will later be used to install target binaries by the C library build process, we will need to move the host binaries to a more appropriate directory. For now, we will leave them as is and address this issue later.

Bootstrap Compiler Setup

In contrast to the binutils package, the gcc package contains only one utility, the GNU compiler, along with support components such as runtime libraries. At this stage, we will build the bootstrap compiler, which will support only the C language. Later, once the C library has been compiled, we will recompile gcc with full C++ support.

Again, we start by extracting the gcc package from the archive we downloaded earlier:

$ cd ${PRJROOT}/build-tools
$ tar xvzf gcc-2.95.3.tar.gz

This will create a directory called gcc-2.95.3 with the package’s content. We can now proceed to the configuration of the build in the directory we had prepared for the bootstrap compiler:

$ cd build-boot-gcc
$ ../gcc-2.95.3/configure --target=$TARGET --prefix=${PREFIX} \
               > --without-headers --with-newlib --enable-languages=c

This will print output similar to that printed by the binutils configuration utility we discussed earlier. Here too, configure checks for the availability of resources and builds appropriate Makefiles.

The --target and --prefix options given to configure have the same purpose as with the binutils, to specify the target and the installation directory, respectively. In addition, we use options that are required for building a bootstrap compiler.

Since this is a cross-compiler and there are no system header files for the target yet—they will be available once glibc is built—we need to use the --without-headers option. We also need to use the --with-newlib option to tell the configuration utility not to use glibc, since it has not yet been compiled for the target. This option, however, does not force us to use newlib as the C library for the target. It is just there to enable gcc to properly compile, and we will be free to choose any C library at a later time.

The --enable-languages option tells the configuration script which programming languages we expect the resulting compiler to support. Since this is a bootstrap compiler, we need only include support for C.

Depending on your particular setup, you may want to use additional options for your target. For a complete list of the options recognized by configure, see the installation documentation provided with the gcc package.

With the Makefiles ready, we can now build the compiler:

$ make all-gcc

The compile time for the bootstrap compiler is comparable to that of the binutils. Here, too, you may see warnings during the compilation, and you can safely ignore them.

With the compilation complete, we can now install gcc:

$ make install-gcc

The bootstrap compiler is now installed alongside the binutils, and you can see it by relisting the content of ${PREFIX}/bin. The name of the compiler, like the utilities, is prepended with the name of the target and is called i386-linux-gcc in our example.
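
As a quick sanity check, the bootstrap compiler should already be able to generate object code for the target. Here is a minimal sketch using a hypothetical test.c that includes no headers, since the target’s own headers and C library are not yet available:

$ echo 'int add(int a, int b) { return a + b; }' > test.c
$ i386-linux-gcc -c test.c
$ i386-linux-objdump -f test.o

The output of objdump should report an i386 object file format rather than your host’s.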

C Library Setup

The glibc package is made up of a number of libraries and is the most delicate and lengthy package build in our cross-platform development toolchain. It is an extremely important software component on which most, if not all, applications available or being developed for your target will rely. Note that although the glibc package is often called the C library—a confusion maintained within GNU’s own documentation—glibc actually generates many libraries, one of which is the actual C library, libc. We will discuss the complete list of libraries generated by glibc in Chapter 6. Until then, I will continue to use “C library” and “glibc” interchangeably.

As with the previous packages, we start by extracting the C library from the archive we downloaded earlier:

$ cd ${PRJROOT}/build-tools
$ tar xvzf glibc-2.2.3.tar.gz

This will create a directory called glibc-2.2.3 with the package’s content. In addition to extracting the C library, we extract the linuxthreads package in the glibc directory for the reasons stated earlier in the chapter:

$ tar -xvzf glibc-linuxthreads-2.2.3.tar.gz --directory=glibc-2.2.3

We can now proceed to preparing the build of the C library in the build-glibc directory:

$ cd build-glibc
$ CC=i386-linux-gcc ../glibc-2.2.3/configure --host=$TARGET \
               > --prefix="/usr" --enable-add-ons \
               > --with-headers=${TARGET_PREFIX}/include

Notice that this configuration command is somewhat different from the previous ones. First, we precede the call to configure with CC=i386-linux-gcc. The effect of this command is to set the CC environment variable to i386-linux-gcc. Therefore, the compiler used to build the C library will be the bootstrap cross-compiler we have just built. Also, we now use the --host option instead of the --target option, since the library runs on our target and not on our build system.[9] In other words, the host from the point of view of the library is our target, contrary to the tools we built earlier, which all run on our build system.

Although we still use the --prefix option, its purpose here is to indicate to the configuration script the location of the library components once on the target’s root filesystem. This location is then hardcoded into the glibc components during their compilation and used at runtime. As is explained in the INSTALL file in the glibc source directory, Linux systems expect to have some glibc components installed in /lib and others in /usr/lib. By setting --prefix to /usr, the configuration script recognizes this setup and the relevant directory paths are properly hardcoded in the glibc components. As a result, the dynamic linker, for example, will expect to find shared libraries in /lib, which is the appropriate location for these libraries in any Linux system, as we shall see in Chapter 6. We will not, however, let the build script install the libraries into the build system’s /usr directory. Rather, as we shall see later in this section, we will override the install directory when issuing the make install command.

We also instruct the configuration script to use the add-on we downloaded with the --enable-add-ons option. Since we are using linuxthreads only, we could have given the exact list of add-ons we want to be configured by using the --enable-add-ons=linuxthreads option. If you are using glibc 2.1.x and had applied the glibc-crypt add-on, you would need to use the --enable-add-ons=linuxthreads,crypt option instead. The full command I provided earlier, which doesn’t include the full list of add-ons, will work fine nonetheless with most glibc versions.

Finally, we tell the configuration script where to find the kernel headers we set up earlier using the --with-headers option. If this option was omitted, the headers found through /usr/include would be used to build glibc and these would be inappropriate, since they are the build system’s headers, not the target’s.

During the actual build of the library, three sets of libraries are built: a shared set, a static set, and a static set with profiling information. If you do not intend to use the profiling version of the library, you may instruct the configuration script not to include it as part of the build process by using the --disable-profile option. The same applies to the shared set, which can be disabled using the --disable-shared option. If you do not intend to have many applications on your target and plan to link all your applications statically, you may want to use this option. Be careful, however, as your target may eventually need the shared library. You can safely leave its build enabled and still link your applications statically. At least then you will be able to change your mind about how to link your application without having to rebuild the C library.
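
For example, if you decided to omit the profiling set, the configuration command shown earlier would simply gain one extra option:

$ CC=i386-linux-gcc ../glibc-2.2.3/configure --host=$TARGET \
               > --prefix="/usr" --enable-add-ons \
               > --with-headers=${TARGET_PREFIX}/include --disable-profile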

Another option that has an effect on static versus dynamic linking is --enable-static-nss. This option generates libraries which enable the static linking of the Name Service Switch (NSS) components. In brief, the NSS part of glibc allows some of the library components to be customizable to the local configuration. This involves the use of the /etc/nsswitch.conf file to specify which /lib/libnss_NSS_SERVICE library is loaded at runtime. Because this service is specifically designed to load libraries dynamically, it doesn’t allow true static linking unless it is forced to. Hence, if you plan to statically link applications that use NSS, add the --enable-static-nss option to the configuration script’s command line. The web servers discussed in Chapter 10, for example, use NSS and will either not function properly on the target or will simply fail to build if you instruct the linker to statically link them against a glibc that doesn’t allow static NSS linking. Look at the glibc manual for a complete discussion of NSS.

If you are compiling glibc for a target that lacks an FPU, you may also want to use the --without-fp option to build FPU emulation into the C library. In some cases, you may also need to add the -msoft-float option to the C flags used to build the library. In the case of the PPC, at least, the C flags are appropriately set (since glibc 2.3) whenever --without-fp is used to configure glibc.

If you have chosen not to download the linuxthreads package, or the crypt package if you were using glibc 2.1.x, you may try to compile the C library by removing the --enable-add-ons option and adding the --disable-sanity-checks option. Otherwise, the configuration script will complain about the missing linuxthreads. Note, however, that although glibc may build successfully without linuxthreads, it is possible that the full compiler’s build will fail when including C++ support later.

With the configuration script done, we can now compile glibc:

$ make

The C library is a very large package and its compilation may take several hours, depending on your hardware. On the PowerBook system mentioned earlier, the build takes approximately an hour. Regardless of your platform, this is a good time to relax, grab a snack, or get some fresh air. One thing you may want to avoid is compiling the C library in the background while trying to use your computer for other purposes in the meantime. As I said earlier, the compilation of some of the C library’s components uses up a lot of memory, and if the compiler fails because of the lack of available memory, you may have to restart the build of the library from scratch using make clean followed by make. Some parts of the build may not be restarted gracefully if you just retype make.

Once the C library is built, we can now install it:

$ make install_root=${TARGET_PREFIX} prefix="" install

In contrast to the installation of the other packages, the installation of the C library will take some time. It won’t take as much time as the compilation, but it may take between 5 and 10 minutes, again depending on your hardware.

Notice that the installation command differs from the conventional make install. We set the install_root variable to specify the directory where we want the library’s components to be installed. This ensures that the library and its headers are installed in the target-dependent directory we had assigned to TARGET_PREFIX earlier, not in the build system’s /usr directory. Also, since the use of the --prefix option sets the prefix variable’s value and since the value of prefix is appended to install_root’s value to provide the final installation directory of the library’s components, we reset the value of prefix so that all glibc components are installed directly in the ${TARGET_PREFIX} directory. Hence, the glibc components that would have been installed in ${TARGET_PREFIX}/usr/lib are installed in ${TARGET_PREFIX}/lib instead.

If you are building tools for a target that is of the same architecture as your host (compiling for a PPC target on a PPC host, for instance), you may want to set the cross-compiling variable to yes as part of the make install command. Because the library’s configure script will have detected that the architectures are identical during the build configuration, the Makefile assumes that you are not cross-compiling and the installation of the C library fails as a result of the Makefile using a different set of rules.
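
In that same-architecture case, the installation command would therefore become:

$ make cross-compiling=yes install_root=${TARGET_PREFIX} prefix="" install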

There is one last step we must carry out to finalize glibc’s installation: the configuration of the libc.so file. This file is used during the linking of applications to the C library and is actually a link script. It contains references to the various libraries needed for the real linking. The installation carried out by our make install above assumes that the library is being installed on a root filesystem and hence uses absolute pathnames in the libc.so link script to reference the libraries. Since we have installed the C library in a nonstandard directory, we must modify the link script so that the linker will use the appropriate libraries. Along with the other components of the C library, the link script has been installed in the ${TARGET_PREFIX}/lib directory.

In its original form, libc.so looks like this:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
GROUP ( /lib/libc.so.6 /lib/libc_nonshared.a )

This is actually quite similar, if not identical, to the libc.so that has already been installed by your distribution for your native C library in /usr/lib/. Since you may need your target’s default script sometime, I suggest you make a copy before modifying it:

$ cd ${TARGET_PREFIX}/lib
$ cp ./libc.so ./libc.so.orig

You can now edit the file and remove all absolute path references. In essence, you will need to remove /lib/ from all the library filenames. The new libc.so now looks like this:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
GROUP ( libc.so.6 libc_nonshared.a )

By removing the references to an absolute path, we are now forcing the linker to use the libraries found within the same directory as the libc.so script, which are the appropriate ones for your target, instead of the native ones found on your host.
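
If you prefer to make this edit non-interactively, a simple sed invocation over the backup copy produces the same result; this is a sketch, so compare the resulting file with the form shown above:

$ sed -e 's|/lib/||g' ./libc.so.orig > ./libc.so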

Full Compiler Setup

We are now ready to install the full compiler for your target with both C and C++ support. Since we had already extracted the compiler from its archive in Section 4.2.4, we will not need to repeat this step. Overall, the build of the full compiler is much simpler than the build of the bootstrap compiler.

To configure the full compiler, move to the build-gcc directory we prepared earlier and enter:

$ cd ${PRJROOT}/build-tools/build-gcc
$ ../gcc-2.95.3/configure --target=$TARGET --prefix=${PREFIX} \
               > --enable-languages=c,c++

The options we use here have the same meaning as when building the bootstrap compiler. Notice, however, that there are fewer options and that we now add support for C++ in addition to C. If you had set TARGET_PREFIX to something other than ${PREFIX}/${TARGET} as we did earlier, you will need to use the --with-headers and --with-libs options to tell the configuration script where to find the headers and libraries installed by glibc.

With the full compiler properly configured, we can now build it:

$ make all

This build will take slightly longer than the build of the bootstrap compiler. Again, you may see warnings that you can ignore. Notice that we didn’t use all-gcc as with the bootstrap compiler, but rather all. This will result in the build of all the rest of the components included with the gcc package, including the C++ runtime libraries.

If you didn’t properly configure the libc.so link script file as previously explained, the build will fail during the compilation of the runtime libraries. Also, if you didn’t install the linuxthreads package during the C library setup, the compilation may fail under some versions of gcc. Version 2.95.3 of gcc, for instance, will fail to build without linuxthreads.

With the full compiler now built, we can install it:

$ make install

This will install the full compiler over the bootstrap compiler we had previously installed. Notice that we didn’t use install-gcc as we had done earlier for the bootstrap compiler, but rather install. Again, this is because we are now installing both gcc and its support components.

Finalizing the Toolchain Setup

The full cross-platform development toolchain is now set up and almost ready to be used. I have only a couple of final observations left.

First, let’s take a look at what has been installed in the tools directory and how we will be using it in the future. Table 4-5 provides the list of first-level subdirectories found in the tools directory.

Table 4-5. Contents of the ${PRJROOT}/tools directory

Directory    Content
bin          The cross-development utilities.
i386-linux   Target-specific files.
include      Headers for cross-development tools.
info         The gcc info files.
lib          Libraries for cross-development tools.
man          The manual pages for cross-development tools.
share        The files shared among cross-development tools and libraries.
             This directory is empty.

The two most important directories are bin and i386-linux. The first contains all the tools within the cross-development toolchain that we will use on the host to develop applications for the target. The second contains all the software components to be used on the target. Mainly, it contains the header files and runtime libraries for the target. Table 4-6 provides a list of the first-level subdirectories found in i386-linux.

Table 4-6. Contents of the ${PRJROOT}/tools/i386-linux directory

Directory     Content
bin           glibc-related target binaries and scripts
etc           Files that should be placed in the target’s /etc directory; only contains the rpc file
include       The headers used to build applications for the target
info          The glibc info files
lib           The target’s /lib directory
libexec       Binary helpers; only contains pt_chown, which you will not need for most targets
sbin          The target’s /sbin directory
share         Subdirectories and files related to internationalization
sys-include   Would have been used by the gcc configuration script to copy the target’s headers had glibc not installed the main target headers in the include directory

Within the i386-linux directory, the two most important directories are include and lib. The first contains the header files that will be used to build any application for the target. The second contains the runtime libraries for the target.

Notice that this last directory contains a lot of large libraries. By itself, the directory weighs in at around 80 MB. Most embedded systems do not have this quantity of storage available. As we will see in Section 4.3, there are other libraries that can be used instead of glibc. Also, we will see in Chapter 6 ways to minimize the number and size of the libraries you choose to use.

As I said earlier, copies of some of the host utilities have been installed without the prepended target name in the ${PREFIX}/${TARGET}/bin directory. Since this directory now contains target binaries installed by the C library build process, I highly suggest that you move the host binaries out of this directory and into another directory more appropriate for host binaries. The utilities affected are as, ar, gcc, ld, nm, ranlib, and strip. You can verify that these are indeed host binaries using the file command:

$ cd ${PREFIX}/${TARGET}/bin
$ file as ar gcc ld nm ranlib strip
as:     ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
ar:     ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
gcc:    ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
ld:     ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
nm:     ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
ranlib: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...
strip:  ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1...

We must choose an appropriate directory in which to put these binaries and create symbolic links to the relocated binaries, because some GNU utilities, including gcc, expect to find some of the other GNU utilities in ${PREFIX}/${TARGET}/bin and will use the host’s utilities if they can’t find the target’s binaries there. Naturally, this will result in failed compilations, since the wrong system’s tools are used. The compiler has a default search path it uses to look for binaries. We can view this path using one of the compiler’s own options (some lines wrap; your shell will take care of line wrapping):

$ i386-linux-gcc -print-search-dirs

install: /home/karim/control-project/daq-module/tools/lib/gcc-lib/i386-linux/2.95.3/
programs: /home/karim/control-project/daq-module/tools/lib/gcc-lib/i386-linux/2.95.3/
:/home/karim/control-project/daq-module/tools/lib/gcc-lib/i386-linux/:/usr/lib/gcc/
i386-linux/2.95.3/:/usr/lib/gcc/i386-linux/:/home/karim/control-project/daq-module/
tools/i386-linux/bin/i386-linux/2.95.3/:/home/karim/control-project/daq-module/tools/
i386-linux/bin/
libraries: /home/karim/control-project/daq-module/tools/lib/gcc-lib/i386-linux/2.95.
3/:/usr/lib/gcc/i386-linux/2.95.3/:/home/karim/control-project/daq-module/tools/i386-
linux/lib/i386-linux/2.95.3/:/home/karim/control-project/daq-module/tools/i386-linux/
lib/

The first entry on the programs line, ${PREFIX}/lib/gcc-lib/i386-linux/2.95.3, is a directory containing gcc libraries and utilities. By placing the binaries in this directory, you can make the cross-compiler use them instead of the native tools:

$ mv as ar gcc ld nm ranlib strip \
               > ${PREFIX}/lib/gcc-lib/i386-linux/2.95.3

Meanwhile, the native toolchain will continue to operate normally. We can also create symbolic links to the relocated binaries just in case an application still looks for the utilities only in ${PREFIX}/${TARGET}/bin. Most applications will not look exclusively in this directory, however, and you can almost always skip this step. One case requiring these symbolic links is when you need to recompile components of the GNU cross-platform development toolchain for your target. Nonetheless, because these are symbolic links to host binaries instead of the host binaries themselves, it is easier to tell them apart from the target binaries in case you need to copy the content of the ${PREFIX}/${TARGET}/bin directory to your target’s root filesystem. The following script makes the links:

$ for file in as ar gcc ld nm ranlib strip
               > do
               > ln -s ${PREFIX}/lib/gcc-lib/i386-linux/2.95.3/$file .
               > done

Regardless of the type of host or the gcc version you use, a directory similar to ${PREFIX}/lib/gcc-lib/i386-linux/2.95.3 will be created during the building of the cross-platform development toolchain. As you can see, the directory path is made up of the target type and the gcc version. Your particular directory should be located in ${PREFIX}/lib/gcc-lib/${TARGET}/GCC_VERSION, where GCC_VERSION is the version of gcc you are using in your cross-platform development toolchain.

Finally, to save disk space, you may choose to get rid of the content of the ${PRJROOT}/build-tools directory once you have completed the installation of the toolchain components. This may be very tempting, as the build directory now occupies around 600 MB of disk space. I advise you to think this through carefully, nonetheless, and not rush to use the rm -rf command. An unforeseen problem may require that you delve into this directory again at a future time. If you insist upon reclaiming the space occupied by the build directory, a compromise may be to wait a month or two and see if you ever need to come back to it.

Using the Toolchain

You now have a fully functional cross-development toolchain, which you can use very much as you would a native GNU toolchain, save for the additional target name prepended to every command you are used to. Instead of invoking gcc and objdump for your target, you will need to invoke i386-linux-gcc and i386-linux-objdump.
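
For instance, to compile a single source file by hand rather than through a Makefile, you would enter something like the following, daemon.c being the control daemon’s source file:

$ i386-linux-gcc -O2 -Wall -o daemon daemon.c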

The following is a Makefile for the control daemon on the DAQ module that provides a good example of the cross-development toolchain’s use:

# Tool names
CROSS_COMPILE = ${TARGET}-
AS            = $(CROSS_COMPILE)as
AR            = $(CROSS_COMPILE)ar
CC            = $(CROSS_COMPILE)gcc
CPP           = $(CC) -E
LD            = $(CROSS_COMPILE)ld
NM            = $(CROSS_COMPILE)nm
OBJCOPY       = $(CROSS_COMPILE)objcopy
OBJDUMP       = $(CROSS_COMPILE)objdump
RANLIB        = $(CROSS_COMPILE)ranlib
READELF       = $(CROSS_COMPILE)readelf
SIZE          = $(CROSS_COMPILE)size
STRINGS       = $(CROSS_COMPILE)strings
STRIP         = $(CROSS_COMPILE)strip

export AS AR CC CPP LD NM OBJCOPY OBJDUMP RANLIB READELF SIZE STRINGS \
         STRIP

# Build settings
CFLAGS        = -O2 -Wall
HEADER_OPS    =
LDFLAGS       =

# Installation variables
EXEC_NAME     = command-daemon
INSTALL       = install
INSTALL_DIR   = ${PRJROOT}/rootfs/bin

# Files needed for the build
OBJS          = daemon.o

# Make rules
all: daemon

.c.o:
        $(CC) $(CFLAGS) $(HEADER_OPS) -c $<

daemon: ${OBJS}
        $(CC) -o $(EXEC_NAME) ${OBJS} $(LDFLAGS)

install: daemon
        test -d $(INSTALL_DIR) || $(INSTALL) -d -m 755 $(INSTALL_DIR)
        $(INSTALL) -m 755 $(EXEC_NAME) $(INSTALL_DIR)

clean:
        rm -f *.o $(EXEC_NAME) core

distclean:
        rm -f *~
        rm -f *.o $(EXEC_NAME) core

The first part of the Makefile specifies the names of the toolchain utilities we are using to build the program. The name of every utility is prepended with the target’s name. Hence, the value of CC will be i386-linux-gcc, the cross-compiler we built earlier. In addition to defining the name of the utilities, we also export these values so that subsequent Makefiles called by this Makefile will use the same names. Such a build architecture is quite common in large projects with one main directory containing many subdirectories.

The second part of the Makefile defines the build settings. CFLAGS provides the flags to be used during the build of any C file.

As we saw in the previous section, the compiler is already using the correct path to the target’s libraries. The linker flags variable, LDFLAGS, is therefore empty. If the compiler wasn’t pointing to the correct libraries or was using the host’s libraries (which shouldn’t happen if you followed the instructions I provided above), we would have to tell the compiler which libraries to use by setting the link flags as follows:

LDFLAGS       = -nostdlib -L${TARGET_PREFIX}/lib

If you wish to link your application statically, you need to add the -static option to LDFLAGS. This generates an executable that does not rely on any shared library. But given that the standard GNU C library is rather large, this will result in a very large binary. A simple program that uses printf( ) to print “Hello World!”, for example, is less than 12 KB in size when linked dynamically and around 350 KB when linked statically and stripped.
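
If you want to reproduce this comparison with your own test program, a sequence such as this one will do; hello.c is any trivial C file, and the sizes you obtain will vary with your toolchain and library versions:

$ i386-linux-gcc -o hello hello.c
$ i386-linux-gcc -static -o hello-static hello.c
$ i386-linux-strip hello hello-static
$ ls -l hello hello-static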

The variables in the installation section indicate what, where, and how to install the resulting binary. In this case, the binary is being installed in the /bin directory of the target’s root filesystem.

In the case of the control daemon, we currently only have one file to build. Hence, the program’s compilation only requires this single file. If, however, you had used the -nostdlib option in LDFLAGS, which you should not normally need to do, you would also need to change the section describing the files required for the build and the rule for generating the binary:

STARTUP_FILES = ${TARGET_PREFIX}/lib/crt1.o \
                ${TARGET_PREFIX}/lib/crti.o \
                ${PREFIX}/lib/gcc-lib/${TARGET}/2.95.3/crtbegin.o
END_FILES     = ${PREFIX}/lib/gcc-lib/${TARGET}/2.95.3/crtend.o \
                ${TARGET_PREFIX}/lib/crtn.o
LIBS          = -lc
OBJS          = daemon.o
LINKED_FILES  = ${STARTUP_FILES} ${OBJS} ${LIBS} ${END_FILES}
...
daemon: ${OBJS}
        $(CC) -o $(EXEC_NAME) ${LINKED_FILES} $(LDFLAGS)

Here, we add five object files to the one generated from our own C file: crt1.o, crti.o, crtbegin.o, crtend.o, and crtn.o. These are special startup, initialization, constructor, destructor, and finalization files, respectively, which are usually automatically linked to your applications. It is through these files that your application’s main( ) function is called, for example. Since we told the compiler not to use standard linking in this case, we need to mention the files explicitly. If you disable standard linking without mentioning them, the linker will complain about the missing _start symbol and fail. The order in which the object files are provided to the compiler is important because the GNU linker, which is automatically invoked by the compiler to link the object files, is a one-pass linker.

The make rules themselves are very much the same ones you would find in a standard, native Makefile. I added the install rule to automate the install process. You may choose not to have such a rule, but to copy the executable manually to the proper directory.

With the Makefile and the source file in your local directory, all you need to do is type make to build your program for your target. If you want to build your program for native execution on your host to test your application, for example, you could use the following command line:

$ make CROSS_COMPILE=""

C Library Alternatives

Given the constraints and limitations of embedded systems, the size of the standard GNU C library makes it an unlikely candidate for use on our target. Instead, we need to look for a C library that will have sufficient functionality while being relatively small.

Over time, a number of libraries have been implemented with these priorities in mind. In the following, we will discuss the two most important C library alternatives, uClibc and diet libc. For each library, I will provide background information, instructions on how to build the library for your target, and instructions on how to build your applications using the library.

uClibc

The uClibc library originates from the uClinux project, which provides a Linux that runs on MMU-less processors. The library, however, has since become a project of its own and supports a number of processors that may or may not have an MMU or an FPU. At the time of this writing, uClibc supports all the processor architectures discussed in depth in Chapter 3. uClibc can be used as a shared library on all these architectures, because it includes a native shared library loader for each architecture. If a shared library loader were not implemented in uClibc for a certain architecture, glibc’s shared library loader would have to be used instead for uClibc to be used as a shared library.

Although it does not rely on the GNU C library, uClibc provides most of the same functionality. It is, of course, not as complete as the GNU library and does not attempt to comply with all the standards with which the GNU library complies. Functions and function features that are seldom used, for instance, are omitted from uClibc. Nevertheless, most applications that can be compiled against the GNU C library will also compile and run using uClibc. To this end, uClibc developers focus on maintaining compatibility with C89, C99, and SUSv3.[10] They regularly use extensive test suites to ensure that uClibc conforms to these standards.

uClibc is available for download as a tar-gzipped or tar-bzip2'd archive or by using CVS from the project’s web site at http://uclibc.org/. It is distributed under the terms of the LGPL. An FAQ is available on the project’s web site, and you can subscribe to the uClibc mailing list or browse the mailing list archive if you need help. In the following description, we will be using Version 0.9.16 of uClibc, but the explanation should apply to subsequent versions as well. Versions earlier than 0.9.16 depended on a different configuration system and are not covered by the following discussion.

Library setup

The first step in the setup is to download uClibc and extract it in our ${PRJROOT}/build-tools directory. In contrast to the GNU toolchain components, we will be using the package’s own directory for the build instead of a separate directory. This is mainly because uClibc does not support building in a directory other than its own. The rest of the build process, however, is similar to that of the other tools, with the main steps being configuration, building, and installation.

After extracting the package, we move into the uClibc directory for the setup:

$ cd ${PRJROOT}/build-tools/uClibc-0.9.16

For its configuration, uClibc relies on a file named .config that should be located in the package’s root directory. To facilitate configuration, uClibc includes a configuration system that automatically generates a .config file based on the settings we choose, much like the kernel configuration utility we will discuss in Chapter 5.[11]

The configuration system can be operated in various ways, as can be seen by looking at the INSTALL file included in the package’s directory. The simplest way to configure uClibc is to use the curses-based terminal configuration menu:

$ make CROSS=i386-linux- menuconfig

This command displays a menu that can be navigated using the arrow, Enter, and Esc keys. The main menu includes a set of submenus, which allow us to configure different aspects of uClibc. At the main menu level, the configuration system enables us to load and save configuration files. If we press Esc at this level, we are prompted to choose between saving the configuration to the .config file or discarding it.

In the command above, we set CROSS to i386-linux-, since our cross-platform tools are prepended by this string, as I explained earlier. We could also edit the Rules.mak file and set CROSS to ${TARGET}- instead of specifying CROSS= for each uClibc Makefile target.
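
If you prefer to edit Rules.mak, the line to change would look roughly like this; this is a sketch, and the variable’s exact location in the file varies between uClibc versions:

# In Rules.mak; TARGET must be exported by your development script
CROSS=$(TARGET)-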

The main configuration menu includes the following submenus:

  • Target Architecture Features and Options

  • General Library Settings

  • Networking Support

  • String and Stdio Support

  • Library Installation Options

  • uClibc hacking options

Through its submenus, the configuration system allows us to configure many options. Fortunately, we can obtain information regarding each option using the “?” key. When this key is pressed, the configuration system displays a paragraph explaining how this option is used and provides its default values. There are two types of options: paths for tools and directories needed for building, installing, and operating uClibc, and options for selecting the functionality to be included in uClibc.

We begin by setting the tool and directory paths in the “Target Architecture Features and Options” and “Library Installation Options” submenus. Table 4-7 lists the values we must set in those submenus to have uClibc compile and install in accordance with our workspace. For each option, the name of the variable used internally by uClibc’s configuration system is given in parentheses. Knowing this name is important for understanding the content of the .config file, for example.

Table 4-7. uClibc tool and directory path settings

Option                                                                 Setting
Linux kernel header location (KERNEL_SOURCE)                           ${PRJROOT}/kernel/linux-2.4.18
Shared library loader path (SHARED_LIB_LOADER_PATH)                    /lib
uClibc development environment directory (DEVEL_PREFIX)                ${PRJROOT}/tools/uclibc
uClibc development environment system directory (SYSTEM_DEVEL_PREFIX)  $(DEVEL_PREFIX)
uClibc development environment tool directory (DEVEL_TOOL_PREFIX)      $(DEVEL_PREFIX)/usr

Notice that we use ${PRJROOT}/tools instead of ${PREFIX}, although the former is the value we gave to the PREFIX environment variable in our script. This is because uClibc’s use of the PREFIX variable in its build Makefiles and related scripts differs from our use. Mainly, it uses this variable to install everything in an alternate location, whereas we use it to point to the main install location.

KERNEL_SOURCE should point to the sources of the kernel you will be using on your target. If you don’t set this properly, your applications may not work at all, because uClibc doesn’t attempt to provide binary compatibility across kernel versions.

SHARED_LIB_LOADER_PATH is the directory where shared libraries will be located on your target. All the binaries you link with uClibc will have this value hardcoded. If you later change the location of your shared libraries, you will need to rebuild uClibc. We have set the directory to /lib, since this is the traditional location of shared libraries.

DEVEL_PREFIX is the directory where uClibc will be installed. As with the other tools, we want it to be under ${PRJROOT}/tools. SYSTEM_DEVEL_PREFIX and DEVEL_TOOL_PREFIX are other installation variables that are used to control the installation of some of the uClibc binaries and are mostly useful for users who want to build RPM or dpkg packages. For our setup, we can set SYSTEM_DEVEL_PREFIX to the same value as DEVEL_PREFIX, and DEVEL_TOOL_PREFIX to $(DEVEL_PREFIX)/usr. As a result, all uClibc binaries prepended with the target name, such as i386-uclibc-gcc, are installed in ${PRJROOT}/tools/uclibc/bin, and all uClibc binaries not prepended with the target name, such as gcc, are installed in ${PRJROOT}/tools/uclibc/usr/bin. As we shall see later, we only need to add ${PRJROOT}/tools/uclibc/bin to our path to use uClibc.
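
For illustration, here is roughly what the corresponding entries look like in the generated .config file; the exact contents depend on the values you enter and may vary between uClibc versions:

KERNEL_SOURCE="${PRJROOT}/kernel/linux-2.4.18"
SHARED_LIB_LOADER_PATH="/lib"
DEVEL_PREFIX="${PRJROOT}/tools/uclibc"
SYSTEM_DEVEL_PREFIX="$(DEVEL_PREFIX)"
DEVEL_TOOL_PREFIX="$(DEVEL_PREFIX)/usr"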

Let us now take a look at the options found in each configuration submenu. As I said earlier, you can use the “?” key to obtain more information about each option from the configuration system. Because some options depend on the settings of other options, some of the options listed below may not be displayed in your configuration. While most options are either enabled or disabled, some are string fields, such as the paths we discussed earlier, which must be filled.

The “Target Architecture Features and Options” submenu includes the following options:

  • Target Processor Type.

  • Target CPU has a memory management unit (MMU) (UCLIBC_HAS_MMU).

  • Enable floating point number support (UCLIBC_HAS_FLOATS).

  • Target CPU has a floating point unit (FPU) (HAS_FPU).

  • Enable full C99 math library support (DO_C99_MATH).

  • Compiler Warnings (WARNINGS). This is a string field that allows you to set the compiler flags used for reporting warnings.

  • Linux kernel header location (KERNEL_SOURCE). This is the kernel path we discussed earlier.

The “General Library Settings” submenu includes the following options:

  • Generate Position Independent Code (PIC) (DOPIC).

  • Enable support for shared libraries (HAVE_SHARED).

  • Compile native shared library loader (BUILD_UCLIBC_LDSO).

  • Native shared library loader 'ldd' support (LDSO_LDD_SUPPORT).

  • POSIX Threading Support (UCLIBC_HAS_THREADS).

  • Large File Support (UCLIBC_HAS_LFS).

  • Malloc Implementation. This is a submenu that allows us to choose between two malloc implementations, malloc and malloc-930716.

  • Shadow Password Support (HAS_SHADOW).

  • Regular Expression Support (UCLIBC_HAS_REGEX).

  • Supports only Unix 98 PTYs (UNIXPTY_ONLY).

  • Assume that /dev/pts is a devpts or devfs filesystem (ASSUME_DEVPTS).

The “Networking Support” submenu includes the following options:

  • IP Version 6 Support (UCLIBC_HAS_IPV6).

  • Remote Procedure Call (RPC) support (UCLIBC_HAS_RPC).

  • Full RPC support (UCLIBC_HAS_FULL_RPC).

The “String and Stdio support” submenu includes the following options:

  • Wide Character Support (UCLIBC_HAS_WCHAR).

  • Locale Support (UCLIBC_HAS_LOCALE).

  • Use the old vfprintf implementation (USE_OLD_VFPRINTF).

We already covered all the options in the “Library Installation Options” submenu earlier in this section. Here they are nevertheless for completeness:

  • Shared library loader path (SHARED_LIB_LOADER_PATH).

  • uClibc development environment directory (DEVEL_PREFIX).

  • uClibc development environment system directory (SYSTEM_DEVEL_PREFIX).

  • uClibc development environment tool directory (DEVEL_TOOL_PREFIX).

Though you should not normally need to enter the “uClibc hacking options” submenu, here are the options it includes:

  • Build uClibc with debugging symbols (DODEBUG).

  • Build uClibc with runtime assertion testing (DOASSERTS).

  • Build the shared library loader with debugging support (SUPPORT_LD_DEBUG).

  • Build the shared library loader with early debugging support (SUPPORT_LD_DEBUG_EARLY).

For our DAQ module, we left the options at their default values. For most targets, you should not need to change the options either. Remember that you can always revert to the defaults by removing the .config file from the uClibc directory.
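
For example, to start over with the default configuration:

$ rm .config
$ make CROSS=i386-linux- menuconfig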

With uClibc now configured, we can compile it:

$ make CROSS=i386-linux-

The compilation takes approximately 10 minutes in our setup. As with the GNU toolchain, you may see warnings during the build that you can safely ignore.

With the build complete, we can install uClibc:

$ make CROSS=i386-linux- PREFIX="" install

Given the values we set above, this will install all the uClibc components in the ${PRJROOT}/tools/uclibc directory. If we have already installed uClibc, the installation procedure will fail while trying to copy files to the ${PRJROOT}/tools/uclibc directory. In that case, we should erase the content of that directory before issuing the make install command.

Usage

We are now ready to link our applications with uClibc instead of the GNU C library. To facilitate this linking, a couple of utilities have been installed by uClibc in ${PRJROOT}/tools/uclibc/bin. Mainly, uClibc installed an alternate compiler and alternate linker, i386-uclibc-gcc and i386-uclibc-ld. Instead of using the i386-linux- prefix, the utilities and symbolic links installed by uClibc have the i386-uclibc- prefix. Actually, the uClibc compiler and linker are wrappers that end up calling the GNU utilities we built earlier while ensuring that your application is properly built and linked with uClibc.

The first step in using these utilities is to amend our path:

$ export PATH=${PREFIX}/uclibc/bin:${PATH}

You will also want to modify your development environment script to automate this path change. In the case of develdaq, here is the new line for the path:

export PATH=${PREFIX}/bin:${PREFIX}/uclibc/bin:${PATH}

Using the same Makefile as earlier, we can compile the control daemon as follows:

$ make CROSS_COMPILE=i386-uclibc-

Since uClibc is a shared library by default on the x86, this will result in a dynamically linked binary. We could still compile our application statically, however:

$ make CROSS_COMPILE=i386-uclibc- LDFLAGS="-static"

The same “Hello World!” program we used earlier is only 2 KB in size when linked with the shared uClibc and 18 KB when linked statically with it. This is a big difference from the figures I gave above for the same program linked with glibc.
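
To confirm which C library a binary was actually linked against, you can inspect its dynamic section with the cross-development readelf. The soname shown here, libc.so.0, is typical of uClibc but may differ on your setup:

$ i386-linux-readelf -d command-daemon | grep NEEDED
 0x00000001 (NEEDED)                     Shared library: [libc.so.0]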

Diet libc

The diet libc project was started and is still maintained by Felix von Leitner with aims similar to uClibc’s. In contrast with uClibc, however, diet libc did not grow from previous work on libraries but was written from scratch with an emphasis on minimizing size and optimizing performance. Hence, diet libc compares quite favorably to glibc in terms of both footprint and code speed. In comparison to uClibc, though, I have not noticed any substantial difference.

Diet libc does not support all the processor architectures discussed in Chapter 3. It supports the ARM, the MIPS, the x86, and the PPC. Also, the authors of diet libc favor static linking over dynamic linking. So, although diet libc can be used as a shared library on some platforms, it is mostly intended to be used as a static library.

One of the most important issues to keep in mind while evaluating diet libc is its licensing. In contrast to most other libraries, including uClibc, which are usually licensed under the LGPL, diet libc is licensed under the terms of the GPL. As I explained in Chapter 1, this means that by linking your code to diet libc, the resulting binary becomes a derived work and you can distribute it only under the terms of the GPL. A commercial license is available from the package’s main author if you wish to distribute non-GPL code linked with diet libc.[12] If, however, you would prefer not to have to deal with such licensing issues, you may want to use uClibc instead.

Diet libc is available for download either as a tar-bzip2'd archive or using CVS from the project’s web site at http://www.fefe.de/dietlibc/.[13] The package comes with an FAQ and installation instructions. In the following, we will be using Version 0.21 of diet libc, but my explanations should also apply to previous and subsequent versions.

Library setup

As with uClibc, the first step to setting up diet libc is to download it into our ${PRJROOT}/build-tools directory. Here too, we will build the library within the package’s source directory and not in another directory as was the case for the GNU toolchain. Also, there is no configuration required for diet libc. Instead, we can proceed with the build stage immediately.

Once the package is extracted, we move into the diet libc directory for the setup:

$ cd ${PRJROOT}/build-tools/dietlibc-0.21

Before building the package for our target, we will build it for our host. This is necessary to create the diet utility, which is required to build diet libc for the target and later to build applications against diet libc:

$ make

In our setup, this creates a bin-ppc directory containing a PPC diet libc. We can now compile diet libc for our target:

$ make ARCH=i386 CROSS=i386-linux-

You will see even more warnings than with the other packages, but you can ignore them. Here, we must tell the Makefile both the architecture for which diet libc is built and the prefix of the cross-platform development tools.

With the package now built, we can install it:

$ make ARCH=i386 DESTDIR=${PREFIX}/dietlibc prefix="" install

This installs diet libc components in ${PREFIX}/dietlibc. Again, as when building the package for our target, we provide the Makefile with the architecture. We also specify the install destination using the DESTDIR variable and reset the Makefile’s internal prefix variable, which is different from the capital PREFIX environment variable.

Diet libc has now been installed in the proper directory. There is, however, one correction we need to make to diet libc’s installation. By installing the x86 version of diet libc, we installed the x86 version of the diet utility in ${PREFIX}/dietlibc/bin. Since we intend to compile our applications on the host, we need to overwrite this with the native diet utility we built earlier:

$ cp bin-ppc/diet ${PREFIX}/dietlibc/bin

Usage

As with uClibc, using diet libc involves modifying the path and using the wrapper provided by diet libc to link our applications. In contrast to uClibc, however, instead of substituting the cross-development tools with tools specific to the library, we only need to prepend the calls we make to the tools with the diet libc wrapper.

First, we must change our path to include the directory containing the diet libc binary:

$ export PATH=${PREFIX}/dietlibc/bin:${PATH}

Again, you will also want to change your development environment script. For example, the path line in our develdaq script becomes:

export PATH=${PREFIX}/bin:${PREFIX}/dietlibc/bin:${PATH}

Notice that I assume that you won’t be using both uClibc and diet libc at the same time. Hence, the path line has only diet libc added to it. If you would like to have both diet libc and uClibc on your system during development, you need to add both paths.

To compile the control daemon with diet libc, we use the following command line:

$ make CROSS_COMPILE="diet i386-linux-"

Since diet libc is mainly a static library, this will result in a statically linked binary by default and you don’t need to add LDFLAGS="-static" to the command line. Using the same “Hello World!” program as earlier, I obtain a 24 KB binary when linked with diet libc.
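
As with the Makefile above, you can also invoke the wrapper by hand for a single file; diet simply wraps the call to the cross-compiler and points it at the diet libc headers and libraries:

$ diet i386-linux-gcc -o hello hello.c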

Java

Since its introduction by Sun in 1995, Java™ has become one of the most important programming languages around. Today, it is found in every category of computerized systems, including embedded systems. Although still not as popular as C in the embedded programming world, it is nonetheless being used in an ever-increasing number of designs.

I will not attempt to introduce you to Java or any of the technology surrounding it. Instead, I refer you to the plethora of books on the matter, including many by O’Reilly. There is, nonetheless, one basic issue we need to review before continuing. Essentially, any discussion on Java involves a discussion of three different items: the Java programming language, the Java Virtual Machine (JVM), and the Java Runtime Environment (JRE), which is made up of the various Java classes.

There are many packages, both free and proprietary, that provide Java functionality in Linux. In our discussion, we will concentrate on the freely available packages. Specifically, we will discuss the Blackdown project, the open source virtual machines, and the GNU Compiler for the Java programming language. I will not cover the installation or the use of these tools as there is little difference between installing and using them on a Linux workstation and in an embedded Linux system. I will, nonetheless, refer you to the appropriate documentation for such instructions.

The Blackdown Project

The Blackdown project (http://www.blackdown.org/) is the group that ports Sun’s Java tools to Linux. This effort is entirely based on Sun’s own Java source code and provides Linux ports of Sun’s tools, including the Java Development Kit (JDK) and the JRE. This is the JDK and JRE most often used in Linux workstations and servers.

This project has enjoyed a privileged, and sometimes troubled, relationship with Sun. Since this project is entirely based on Sun source code and this code is not available as open source,[14] it is entirely dependent on Sun’s goodwill to help the Linux community.

Actually, the Blackdown project does not distribute any source code. Instead, it distributes prebuilt binaries for the various processor architectures to which its developers have ported Sun’s Java tools. As the project’s FAQ points out, you need to contact Sun to get access to the source code.

According to the licensing agreements between Sun and Blackdown, you are allowed to download the JDK for your own use, but you cannot distribute it without entering into an agreement with Sun. You can, however, download the JRE and distribute it as-is with few limitations.

Before releasing new versions of their work, the Blackdown team must meet the requirements of Sun’s compatibility tests. Hence, consecutive Blackdown releases do not necessarily support all the architectures of the previous releases. Release 1.3.0-FCS, for instance, supports the PPC and the x86, while 1.3.1-rc1 supports only the ARM. The complete list of Blackdown releases and supported platforms is available from the project’s status page at http://www.blackdown.org/java-linux/ports.html.

To run the JDK or the JRE, you will need glibc, at the very least, and the X Window System with its libraries if you wish to use the AWT classes. Given the constraints of most embedded systems, only those with very large amounts of storage and processing power will be able to accommodate this type of application.

For more information regarding the Blackdown project, the tools it provides, how to install them, how to operate them, and the licensing involved, see the Blackdown FAQ at http://www.blackdown.org/java-linux/docs/support/faq-release/.

Open Source Virtual Machines

Given Blackdown’s hurdles and its dependence on Sun, a number of projects have been started to provide open source, fully functional JVMs, without using any of Sun’s source code. The most noteworthy one is Kaffe.

Since there isn’t any consensus on the feasibility of using any of the various open source VMs as the main JVM in an embedded Linux project, I will only mention the VMs briefly and will not provide any information regarding their use. You are invited to look at each VM and follow the efforts of the individual teams.

The Kaffe Java Virtual Machine (http://www.kaffe.org/) is based on a product sold commercially by Transvirtual Inc., KaffePro VM, and is a clean-room implementation of the JVM.[15] Although no new releases of the project have been made since July 2000 and although this VM is not 100% compatible with Sun’s VM, according to the project’s web site, it is still the main open source alternative to Sun’s VM.

There are other projects that may eventually become more important, such as Japhar (http://www.japhar.org/), Kissme (http://kissme.sourceforge.net/), Aegis (http://aegisvm.sourceforge.net/), and Sable VM (http://www.sablevm.org/). For a complete list of open source VM projects, see the list provided by yet another open source VM project, the joeq VM (http://joeq.sourceforge.net/), at http://joeq.sourceforge.net/other_os_java.htm. See each project’s respective web site and documentation for information on how to install and operate the VM.

The GNU Java Compiler

As part of the GNU project, the GNU Compiler for the Java programming language (gcj) is an extension to gcc that can handle both Java source code and Java bytecode. In particular, gcj can compile either Java source code or Java bytecode into native machine code. It can also compile Java source into Java bytecode. It is often referred to as an ahead-of-time (AOT) compiler, because it can compile Java source code directly into native code, in contrast with popular just-in-time (JIT) compilers that convert Java bytecode into native code at runtime. gcj does, nevertheless, include a Java interpreter equivalent to the JDK’s java command.
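
By way of illustration, here is how a trivial class is compiled to native code using a natively installed gcj; HelloWorld.java is any class of your own that provides a main() method, and a cross-gcj built for your target would be invoked the same way under its target-prefixed name:

$ gcj --main=HelloWorld -o HelloWorld HelloWorld.java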

GNU gcj is a fairly active project, and most core Java class libraries are already available as part of the gcj runtime libraries. Although most windowing components, such as AWT, are still under development, the compiler and its runtime environment can already be used to compile and run most command-line applications.

As with other GNU projects, gcj is fairly well documented. A good starting place is the project’s web site at http://gcc.gnu.org/java/. In the documentation section of the web site, you will find a compile HOWTO, a general FAQ, and instructions on how to debug Java applications with gdb. You should be able to use the compile HOWTO in conjunction with my earlier instructions regarding the GNU toolchain to build gcj for your target.

Perl

Perl was introduced by Larry Wall in 1987. This programming language has since become a world of its own. If you are interested in Perl, have a look at Wall, Christiansen, and Orwant’s Programming Perl or Schwartz’s Learning Perl (both published by O’Reilly). Briefly, Perl is an interpreted language whose compiler, tools, and libraries are all available as open source under the terms of the Perl Artistic License and the GNU GPL from the Comprehensive Perl Archive Network (CPAN) at http://www.cpan.org/. Since there is only one Perl toolset, you will not need to evaluate different toolsets to figure out which one best suits your needs.

The main component you will need to run Perl programs on your target is a properly compiled Perl interpreter for your target. Unfortunately, at the time of this writing, Perl is not well adapted to cross-compilation. Efforts are, however, underway to solve the underlying issues. According to Jarkko Hietaniemi, the 5.8 release manager, Perl 5.8.0 should be able to cross-compile itself. For the time being, the 5.7 development branch includes two build options for cross-compiling small versions of the full Perl package: microperl and miniperl. Note that both options are part of the same package, and you do not need to download any package other than the one provided by CPAN.

Microperl

The microperl build option was implemented by Simon Cozens based on an idea by Ilya Zakharevich. It is the absolute bare minimum build of Perl, with no outside dependencies other than ANSI C and the make utility. Unlike the other builds, microperl does not require that you run the Configure script, which performs a great many tests on the installation machine before generating the appropriate files for the package’s build. Instead, default configuration files are provided with the bare minimum settings that allow the core Perl interpreter to build properly. None of the language’s core features are missing from this interpreter. Of course, it does not support all the features of the full interpreter, but it is sufficient to run basic Perl applications. Since this code is considered “experimental” for the moment, you will need to evaluate most of microperl’s capabilities on your own.

I have successfully built a microperl for my DAQ module using the toolchain set up earlier, uClibc, and Perl 5.7.3. The resulting interpreter was able to adequately execute all Perl programs that did not have any outside references. It failed, however, to run programs that used any of the standard Perl modules.

To build microperl for your target, you must first download a Perl version from CPAN and extract it into the ${PRJROOT}/sysapps directory. Place the package in the sysapps directory, because it will run only on the target and will not be used to build any of the other software packages for your target. With the package extracted, we move into its directory for the build. Here, we cannot use a different build directory, as we did for the GNU toolchain, because Perl does not support this build method:

$ cd ${PRJROOT}/sysapps/perl-5.7.3

Since microperl is a minimal build of Perl, we do not need to configure anything. We can build the package by using the appropriate Makefile and instructing it to use the uClibc compiler wrapper instead of the standard gcc compiler:

$ make -f Makefile.micro CC=i386-uclibc-gcc

This will generate a microperl binary in the package’s root directory. This binary does not require any other Perl components and can be copied directly to the /bin directory of your target’s root filesystem, ${PRJROOT}/rootfs.
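
In other words:

$ cp microperl ${PRJROOT}/rootfs/bin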

When dynamically linked with either glibc or uClibc and stripped, the microperl binary is around 900 KB in size. When statically linked and stripped, the binary is 1.2 MB in size with glibc, and 930 KB with uClibc. As you can see, uClibc is the better choice in this case for size reasons.

For more information on how microperl is built, have a look at the Makefile.micro Makefile and the uconfig.sh script. As work continues on microperl, it is expected that more documentation will become available.

Miniperl

Miniperl is less minimalistic than microperl and provides most of what you would expect from the standard Perl interpreter. The main component it lacks is the DynaLoader XS module, which allows Perl subroutines to call C functions. It is therefore incapable of loading XS modules dynamically. This is a minor issue, however, given the type of system miniperl will be running on.

As with the main Perl build, miniperl requires that you run the Configure script to determine the system’s capabilities. Since the system for which Perl must be built is your target, the script requires you to provide it with information regarding the means it should use to communicate with that target. This includes a hostname, a remote username, and a target-mapped directory. It will then use this information to run its tests on your target to generate the proper build files.

The main caveat concerning this method is its reliance on the existence of a direct network link between the host and the target. In essence, if your target does not have some form of networking, you will be unable to build miniperl for it.

I will not provide the details of the build and installation methodology for miniperl, as it is already very well explained in the INSTALL file provided with the 5.7.3 Perl package under the “Cross-compilation” heading.

Python

Python was introduced to the public by Guido van Rossum in 1991. It has since gathered many followers and, as with Perl, is a world of its own. If you are interested in Python, read Mark Lutz’s Programming Python or Lutz, Ascher, and Willison’s Learning Python (both published by O’Reilly). Python is routinely compared to Perl, since it often serves the same purposes, but because this is the subject of yet another holy war, I will not go any further. Instead, feel free to browse the main Python web site at http://www.python.org/ for more information on the world of Python. The Python package, which includes the Python interpreter and the Python libraries, is available from that web site under the terms of a composite license called the Python license, which is an approved open source license.

As with Perl, you will need a properly configured interpreter to run Python code on your target. Although the main Python distribution does not support cross-compiling, a patch has been developed to this effect by Klaus Reimer and is available from http://www.ailis.de/~k/patches/python-cross-compile.diff. Klaus also provides a very well written Python cross-compiling HOWTO at http://www.ailis.de/~k/knowledge/crosscompiling/python.php.

You can follow Klaus’ instructions to build Python for your target while using the appropriate names for your target instead of the arm-linux used in the instructions. To follow the same project workspace organization that we established earlier, download and extract the Python package in the ${PRJROOT}/sysapps directory. Also, instead of building Python directly in its source directory, you can use a build-python directory, as we did with the GNU tools, since Python supports this build method. In addition, use the --prefix=${PREFIX}/${TARGET}/usr option instead of the values provided by the HOWTO. All the Python material will thereby be installed in the ${PREFIX}/${TARGET}/usr directory. This directory can then be customized and copied onto the target’s root filesystem.
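
Following that workspace organization, the preparatory steps would look something like this before you continue with the HOWTO’s instructions; the Python version shown is an assumption, so use whichever release the cross-compile patch applies to:

$ cd ${PRJROOT}/sysapps
$ tar xvzf Python-2.2.1.tgz
$ mkdir build-python
$ cd build-python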

There are a couple of observations to be made about the resulting package. First, you will not be able to build Python with diet libc. You will need to build Python against glibc or uClibc. This means that glibc or uClibc will have to be on your target’s root filesystem. When storage space on your target is limited, I suggest you use uClibc instead of glibc. Also, if you want to build Python against uClibc, you need to patch Python using the patch posted by Manuel Novoa on August 27, 2002 on the uClibc mailing list following the announcement of uClibc 0.9.15.

Second, Python has installed many libraries in the ${PREFIX}/${TARGET}/usr/lib/python2.2 directory, and many of those are large. You may want to trim down the content of this directory by deleting the components you are unlikely to use. By itself, the dynamically linked and stripped Python interpreter is 725 KB in size.

Nevertheless, Python’s size and dependencies have not stopped developers from using it. The team developing the iPAQ’s Familiar distribution, for instance, includes it as part of their standard packages.

Finally, as Klaus explains, you may see some warnings and failures during the build. This is because some libraries and applications are missing on your target. The Tkinter interface to libtk.a and libtcl.a will fail to build, for instance, unless you had cross-compiled and installed Tcl/Tk for your target. This doesn’t mean the Python build has failed. Rather, it is an indication that one of the Python components has not built successfully. You will still be able to install and use the Python interpreter and the modules that built properly on your target.

Ada

Ada was sponsored by the U.S. Department of Defense (DoD). During the 1970s, the DoD realized that it had a huge software maintenance problem on its hands. Thus, it started work on a new programming language that met its stringent requirements of code maintainability and reliability. Ada was first standardized by ANSI in 1983 and was later updated in 1995 with the release of the Ada95 standard.

Work on a gcc-based Ada compiler was started at New York University and resulted in gnat, the GNU Ada compiler.[16] Work on gnat continued at Ada Core Technologies Inc. (ACT), which maintained it for some time before it was eventually integrated into the main gcc source tree. Every so often, ACT would release a GPL copy of its most recent work and make it available, along with some prebuilt binaries, at ftp://cs.nyu.edu/pub/gnat/. Their latest release, gnat 3.14p, required gcc 2.8.1 to build. To be precise, gnat’s source was provided with a patch that had to be applied to the gcc sources, and an ada directory that had to be copied into gcc’s source directory.

Unfortunately, this led to all sorts of problems. For instance, gcc 2.8.1 was fairly outdated and most gcc versions found in recent distributions failed to build it properly. Hence, if you wanted to use the 3.14p release, you first had to install an old compiler on your system and use it to build gnat. Obviously, this wasn’t an endearing prospect.

More recently, ACT’s work on gnat has been integrated into the gcc CVS and is now part of gcc 3.2. Though you still need a gnat binary to build the Ada compiler, the integration of gnat into mainstream gcc is likely to simplify the use of Ada in embedded Linux systems in the future.

Apart from the ongoing effort to integrate gnat into mainstream gcc, there are two online projects you may find helpful if you are interested in Ada programming in Linux. First, The Big Online Book of Linux Ada Programming is a collective work started by Ken Burtch with the main goal of providing a complete online reference manual for Ada programming in Linux. The manual is available at http://www.pegasoft.ca/homes/book.html and has a couple of mirrors.

Second, the Ada for GNU/Linux Team (ALT) provides a number of ACT-independent binary packages, RPMs, and patches at http://www.gnuada.org/alt.html. The group also provides a number of links to packages providing Ada interfaces and bindings to popular libraries, such as GTK, XML, and X11.

Other Programming Languages

There are, of course, many more programming languages supported in Linux. Whether you are looking for programming in Forth, Lisp, or FORTRAN, a short search on the Net with your favorite search engine should yield rapid results. A good starting point is the "Other Languages" section in Chapter 13 of Running Linux (O’Reilly).

The cross-compiling and cross-development capabilities of the various language tools will need to be evaluated on a tool-to-tool basis, since few compilers and interpreters lend themselves well to cross-platform development.

Integrated Development Environments

Many integrated development environments (IDEs) are available for Linux. Most of these IDEs are usually used to develop native applications. Nevertheless, they can be customized for cross-development by setting the appropriate compiler names in the IDE’s configuration. Table 4-8 provides a list of open source IDEs, their locations, and the list of embedded Linux-relevant programming languages they support.

Table 4-8. Open source IDEs

IDE              Location                               Supported languages
Anjuta           http://anjuta.sourceforge.net/         Ada, bash, C, C++, Java, make, Perl, Python
Eclipse          http://www.eclipse.org/                C, C++, Java
Glimmer          http://glimmer.sourceforge.net/        Ada, bash, C, C++, Java, make, Perl, Python, x86 assembly
KDevelop         http://www.kdevelop.org/               C, C++, Java
SourceNavigator  http://sources.redhat.com/sourcenav/   C, C++, Java, Python

I am reluctant to recommend any particular IDE, because the choice is very much a personal one. I personally prefer XEmacs and the command line to any IDE. Still others prefer plain old vi. You may want to look at the screenshots section of each project to get an initial appreciation for it. Ultimately, however, you may wish to download the IDEs and try them to make up your mind.

In terms of popularity, KDevelop is probably the most popular IDE of the list. Although it is very much oriented towards native development of user applications, it can be customized for cross-development. Anjuta is a very active project, and its interface resembles that of many popular Windows IDEs. SourceNavigator is an IDE made available by Red Hat under the terms of the GPL, and is part of Red Hat’s GNUPro product. Glimmer is a Gnome-based IDE with capabilities similar to the other IDEs. Eclipse is an ambitious project to create an IDE framework that can easily be extended using plug-ins. It was initiated and is still backed by many companies, including IBM, HP, Red Hat, and SuSE.

For more information regarding these projects, visit their web sites and have a look at their documentation.

Terminal Emulators

The most common way to communicate with an embedded system is to use a terminal emulation program on the host to communicate through an RS232 serial port with the target. Although there are a few terminal emulation programs available for Linux, not all are fit for all uses. There are known problems between minicom and U-Boot, for instance, during file transfers over the serial port. Hence, I recommend that you try more than one terminal application to communicate with your target. If nothing else, you are likely to discover one that best fits your personal preferences. Also, see your bootloader’s documentation for any warnings regarding any terminal emulator.

Three main terminal emulators are available in Linux: minicom, cu, and kermit. The following sections cover the setup and configuration of these tools, but not their use. Refer to each package’s documentation for the latter.

Accessing the Serial Port

Before you can use any terminal emulator, you must ensure that you have the appropriate access rights to use the serial port on your host. In particular, you need read and write access to the serial port device, which is /dev/ttyS0 in most cases, and read and write access to the /var/lock directory. Access to /dev/ttyS0 is required to be able to talk to the serial port. Access to /var/lock is required to be able to lock access to the serial port. If you do not have these rights, any terminal emulator you use will complain at startup.[17]

The default permission bits and group settings for /dev/ttyS0 vary between distributions, and sometimes between releases of the same distribution. On Red Hat 6.2, for example, it used to be accessible in read and write mode to the root user only:

$ ls -al /dev/ttyS0
crw-------    1 root     tty        4,  64 May  5  1998 /dev/ttyS0

As with /dev/ttyS0, the permission bits and group settings for /var/lock largely depend on the distribution. For the same Red Hat 6.2, /var/lock was accessible to the root user and any member of the uucp group:

$ ls -ld /var/lock
drwxrwxr-x    5 root     uucp         1024 Oct  2 17:14 /var/lock

Though Red Hat 6.2 is outdated, and your distribution is likely to have different values, this setup is a perfect example to illustrate the modifications required to allow proper access to the serial port. In this case, to use a terminal emulator on the serial port as a normal user, you must be part of both the tty and uucp groups, and access rights to /dev/ttyS0 must be changed to allow read and write access to members of the owning group. In some distributions, the access rights to /dev/ttyS0 will be set properly, but /var/lock will belong to the root group. In that case, you may want to change the group setting, unless you want to allow normal users in the root group, which I do not recommend.

Going back to Red Hat 6.2, use chmod to change the rights on /dev/ttyS0:

$ su
Password:
# chmod 660 /dev/ttyS0
# ls -al /dev/ttyS0
crw-rw----    1 root     tty        4,  64 May  5  1998 /dev/ttyS0

Then, edit the /etc/group file using vigr[18] and add your username to the uucp and tty lines:

...
tty:x:5:karim
...
uucp:x:14:uucp,karim
...

Finally, log out from root user mode, log out from your own account, and log back in to your account:

# exit
$ id
uid=501(karim) gid=501(karim) groups=501(karim)
$ exit

Teotihuacan login: karim
Password:
$ id
uid=501(karim) gid=501(karim) groups=501(karim),5(tty),14(uucp)

As you can see, you need to first log out and then log back in for the changes to take effect. Opening a new terminal window in your GUI may have similar effects, depending on the GUI you are using and the way it starts new terminal windows. Even if it works, however, only the new terminal window will be part of the appropriate groups; any other window opened before the changes will still be excluded. For this reason, it is preferable to exit your GUI, completely log out, and then log back in.

For more information on the setup of the serial interface, have a look at the Serial HOWTO available from the LDP and Chapter 4 of the Linux Network Administrator’s Guide (O’Reilly).

Minicom

Minicom is the most commonly used terminal emulator for Linux. Most documentation online or in print about embedded Linux assumes that you are using minicom. However, as I said above, there are known file transfer problems between minicom and at least one bootloader. Minicom is a GPL clone of the Telix DOS program and provides ANSI and VT102 terminals. Its project web site is currently located at http://www.netsonic.fi/~walker/minicom.html. Minicom is likely to have been installed by your distribution. You can verify this by using rpm -q minicom if you are using a Red Hat-based distribution.

Minicom is started by using the minicom command:

$ minicom

The utility starts in full-screen mode and displays the following on the top of the screen:

Welcome to minicom 1.83.0
 
OPTIONS: History Buffer, F-key Macros, Search History Buffer, I18n
Compiled on Mar  7 2000, 06:12:31.
 
Press CTRL-A Z for help on special keys

To enter commands to minicom, press Ctrl-A and then the letter of the desired function. As stated by minicom’s welcome message, use Ctrl-A Z to get help from minicom. Refer to the package’s manpage for more details about its use.

UUCP cu

Unix to Unix CoPy (UUCP) used to be one of the most popular ways to link Unix systems. Though UUCP is rarely used today, the cu command, which is part of the UUCP package, can be used to call up other systems. The connection used to communicate with the other system can take many forms. In our case, we are mostly interested in establishing a terminal connection over a serial line to our target.

To this end, we must add the appropriate entries to the configuration files used by UUCP. In particular, this means adding a port entry in /etc/uucp/port and a remote system definition to /etc/uucp/sys. As the UUCP info page states, “a port is a particular hardware connection on your computer,” whereas a system definition describes the system to connect to and the port used to connect to it.

Though UUCP is available from the GNU FTP site under the terms of the GPL, it is usually already installed on your system. On a Red Hat-based system, use rpm -q uucp to verify that it is installed.

Here is an example /etc/uucp/port:

# /etc/uucp/port - UUCP ports
# /dev/ttyS0
port      ttyS0       # Port name
type      direct      # Direct connection to other system
device    /dev/ttyS0  # Port device node
hardflow  false       # No hardware flow control
speed     115200      # Line speed

This entry states that there is a port called ttyS0 that uses direct 115200 bps connections, without hardware flow control, to connect to remote systems through /dev/ttyS0. The port name, ttyS0 in this case, is used only to identify this port definition to the rest of the UUCP utilities and configuration files. If you've used UUCP before to connect using a traditional modem, you will notice that this entry resembles modem definitions. Unlike modem definitions, however, there is no need to provide a carrier field to specify whether a carrier should be expected; setting the connection type to direct makes carrier default to false.

Here is an example /etc/uucp/sys file that complements the /etc/uucp/port file listed earlier:

# /etc/uucp/sys - name UUCP neighbors
# system: target
system  target   # Remote system name
port    ttyS0    # Port name
time    any      # Access is possible at any time

Basically, this definition states that the system called target can be called up at any time using port ttyS0.
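To verify that UUCP interprets these files the way you intend, you can run the uuchk utility included with Taylor UUCP, which prints the configuration values that will be used for each system, including its port and speed:

$ uuchk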

We can now use cu to connect to the target:

$ cu target
Connected.

Once in a cu session, you can issue instructions using the ~ character followed by another character specifying the actual command. For a complete list of commands, use ~?.
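The most important of these is ~., which terminates the connection. Here is a short session as a sketch; the command run on the target and its output are hypothetical:

$ cu target
Connected.
# uname -m
armv4l
~.

Disconnected.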

For more information on how to configure and customize UUCP for your system, have a look at Chapter 16 in the Linux Network Administrator’s Guide (O’Reilly), the UUCP HOWTO available from the LDP, and the UUCP info page.

C-Kermit

C-Kermit is one of the packages maintained as part of Columbia University's Kermit project (http://www.columbia.edu/kermit/). C-Kermit provides a unified interface for network operations across a wide range of platforms. Although it features many capabilities, terminal emulation is the one we are most interested in here.

Though you are free to download C-Kermit for personal and internal use, C-Kermit is not open source software and its licensing makes it difficult for commercial distributions to include it.[19] C-Kermit is available for download from http://www.columbia.edu/kermit/ckermit.html. Follow the documentation in the ckuins.txt file included with the package to compile and install C-Kermit. In contrast with most other tools we discuss in this book, C-Kermit should be installed system wide, not locally to your project workspace. Once installed, C-Kermit is started using the kermit command.

In terms of usability, kermit compares quite favorably to both minicom and cu. Although it lacks the user menus provided by minicom, kermit's interactive command language provides a very intuitive and powerful way of interacting with the terminal emulator. When you initiate a file transfer from the target's bootloader, for example, the bootloader starts waiting for the file. You can then switch to kermit's interactive command line on the host using Ctrl-\ C and send the actual file using the send command. Among other things, the interactive command line provides Tab filename completion similar to that provided by most shells in Linux. It also recognizes commands by their shortest unique prefix; the set receive command, for example, can be shortened to set rec.
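For example, assuming the target's bootloader is already waiting to receive a file, and using a hypothetical image name, a transfer could look like this:

(Press Ctrl-\ C to return to the kermit prompt)
C-Kermit> send zImage-daq
C-Kermit> connect

The send command transfers the file using the settings from your configuration file, and connect returns you to the terminal session on the target.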

To use the kermit command, you must have a .kermrc configuration file in your home directory. This file is run by kermit at startup. Here is an example .kermrc file that I use on my workstation:

; Line properties
set modem type             none ; Direct connection
set line             /dev/ttyS0 ; Device file
set speed                115200 ; Line speed
set carrier-watch           off ; No carrier expected
set handshake              none ; No handshaking
set flow-control           none ; No flow control

; Communication properties
robust                          ; Most robust transfer settings macro
set receive packet-length  1000 ; Max pack len remote system should use
set send packet-length     1000 ; Max pack len local system should use
set window                   10 ; Nbr of packets to send until ack

; File transfer properties
set file type            binary ; All files transferred are binary
set file names          literal ; Don't modify filenames during xfers

For more information about each of the settings above, try the help command provided by kermit’s interactive command line. For more information regarding the robust macro, for example, use help robust. In this case, robust must be used before set receive, since robust sets the maximum packet length to be used by the remote system to 90 bytes, while we want it set to 1000 bytes.
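For instance, from the interactive prompt you can review the documentation for the robust macro and display the line settings currently in effect:

C-Kermit> help robust
C-Kermit> show communications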

Once the configuration file is created, you can start kermit:

$ kermit -c
Connecting to /dev/ttyS0, speed 115200
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------

If you are looking for more information about the use of C-Kermit and intend to use it extensively, consider purchasing Using C-Kermit by Frank da Cruz and Christine Gianone (Digital Press). Apart from providing information regarding the use of C-Kermit, sales of the book help fund the project. Though the book covers Version 6.0, supplements for Versions 7.0 and 8.0 are freely available from the project's web site.



[1] All commands used in this book assume the use of the sh or bash shell, because these are the shells most commonly used. If you use another shell, such as csh, you may need to modify some of the commands accordingly.

[2] The following email from the glibc developer mailing list covers the folding of glibc-crypt into the main glibc package and conformance to U.S. export laws: http://sources.redhat.com/ml/libc-alpha/2000-02/msg00104.html. This email, and the ensuing thread, refer to the “BXA” abbreviation. This is the Bureau of Industry and Security of the U.S. Department of Commerce (http://www.bxa.doc.gov/). It is known as the BXA, because it was formerly the Bureau of Export Administration.

[7] In reference to the fact that Canada had three national parties at the time a name was needed for this procedure.

[8] In some countries, there are local national mirrors, which may be preferable for you to use instead of the main U.S. site. These mirrors' URLs are usually in the http://www.COUNTRY.kernel.org/ form. http://www.it.kernel.org/ and http://www.cz.kernel.org/ are two such mirrors.

[9] Practically speaking, the build system is our development host.

[10] Single UNIX Specification Version 3.

[11] The uClibc configuration system is actually based on Roman Zippel’s kernel configuration system, which was included in the 2.5 development series.

[12] It is not clear whether this license covers the contributions made to diet libc by developers other than the main author.

[13] Notice the final “/”. If you omit this slash, the web server will be unable to locate the web page.

[14] The source code for Sun's Java tools is available under the terms of the Sun Community Source License (SCSL). The SCSL is not one of the licenses approved by the Open Source Initiative (OSI). See http://opensource.org/licenses/ for the complete list of approved licenses.

[15] That is, it was written from scratch without using any of Sun’s Java source code.

[16] Remarkably, gnat is entirely written in Ada.

[17] The actual changes required for your distribution may differ from those discussed in this section. Refer to your distribution’s documentation in case of doubt.

[18] This command is tailored for the editing of the /etc/group file. It sets the appropriate locks to ensure that only one user is accessing the file at any time. See the manpage for more information.

[19] Although the license was recently changed to simplify inclusion in commercial distributions such as Red Hat, C-Kermit has yet to be included in most mainstream distributions.
