Several color models (or color spaces) have been created over the years for different applications. For our purposes, they fall into two groups: additive color spaces (e.g., RGB) and subtractive color spaces (e.g., CMYK). Because many of the toolkits used to manipulate graphics expose the underlying color model, it pays to know your way around RGB and CMYK before you delve into graphics programming.

The RGB Color Space

The vast majority of computer graphics applications use the RGB (Red, Green, Blue) additive color space because cathode ray computer monitors generate colors using those three components. In the RGB space, light of the three primary colors (red, green, and blue) combines in varying proportions to make new colors (“peach puff,” for example, or “grassy knoll”). The three primaries are referred to as channels.

The actual color generated by an RGB display device never exactly matches the perfect model because of differences between devices and variations in the software that controls those devices. However, most graphics display devices are calibrated to be within an acceptable tolerance of each other.

The RGB model is generally represented as a three-dimensional cube, where each axis is one of the three primary colors. In the abstract model, we typically look at a unit color cube, where each axis has intensity values from 0 to 1. The RGB unit cube is shown on the left side of Figure 1-1.

Figure 1-1. The RGB and CMYK unit color cubes

This unit cube is scaled by the number of bits used to represent each component; e.g., a 24-bit color cube would have components in the range 0–255. Each color in the color space can be represented as a three-value coordinate (or vector), for example (255, 127, 54). In HTML and in many of the Perl graphics modules described in this book, we represent an RGB color as three 8-bit hexadecimal values concatenated together. Bright purple (full red and full blue, with no green), for example, would be FF00FF, written #FF00FF in HTML.
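As a rough sketch of this notation, the following converts between a three-value coordinate and its hexadecimal string. The helper names rgb_to_hex and hex_to_rgb are made up for this illustration and do not come from any module covered in this book.

```perl
use strict;
use warnings;

# Hypothetical helper: pack three 0..255 components into "RRGGBB"
sub rgb_to_hex {
    my ($r, $g, $b) = @_;
    return sprintf("%02X%02X%02X", $r, $g, $b);
}

# Hypothetical helper: unpack "RRGGBB" (a leading "#" is allowed)
sub hex_to_rgb {
    my ($hex) = @_;
    $hex =~ s/^#//;
    return map { hex($_) } $hex =~ /([0-9A-Fa-f]{2})/g;
}

print rgb_to_hex(255, 127, 54), "\n";            # prints FF7F36
print join(", ", hex_to_rgb("#FF7F36")), "\n";   # prints 255, 127, 54
```

Each pair of hexadecimal digits is one 8-bit channel, so the string is just the coordinate written in base 16.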


The diagonal of the color cube running from black (0, 0, 0) to white (1, 1, 1) contains all of the gray values. This lets you manipulate color with vector math. For example, you can perform a simple grayscale conversion by comparing the length of the RGB vector with the length of the white vector at the far end of the gray diagonal:

# Where $r, $g, $b are values in the range 0..255

my $gray_value = int(255 * (sqrt($r**2 + $g**2 + $b**2) /
                            sqrt(255**2 + 255**2 + 255**2)));
print int(100*$gray_value/255), "% gray\n";

The CMYK Color Space

The most common subtractive color space is CMYK (Cyan, Magenta, Yellow, and Black), which is used in offset printing. It is called a subtractive model because the resulting color is composed of those frequencies that are not absorbed as light reflects off of a surface. Cyan ink absorbs red light, magenta absorbs green, and yellow absorbs blue. Thus, the combination of magenta and yellow ink reflects red, the combination of cyan and yellow reflects green, and cyan and magenta produce blue.

In an ideal mathematical world, you can convert from the RGB color space to the CMYK color space by subtracting each of the components from the maximum value of that component (in this case, each component is 8 bits):

my ($c, $m, $y) = map {255-$_} $r, $g, $b;

In the real world, certain practical issues make this more complicated. For one, a black component is usually added to the mix. This is because black is a common color, especially for text. It is much more economical to print one coat of black ink than to print three layers of cyan, magenta, and yellow in perfect registration (it also makes the paper less soggy). And anyone who paints will tell you how difficult it is to mix a true black from primary colors.
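One simple way to introduce the black component is to pull out the gray that all three inks share and print it as black instead. The sketch below assumes that naive rule; it is not a method this chapter prescribes, it ignores the ink and paper effects that real color matching must handle, and rgb_to_cmyk is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: naive RGB (0..255) to CMYK (0..255) conversion
sub rgb_to_cmyk {
    my ($r, $g, $b) = @_;

    # Start from the ideal "255 minus RGB" conversion
    my ($c, $m, $y) = map { 255 - $_ } $r, $g, $b;

    # Black is the amount of ink common to all three components
    my $k = $c;
    $k = $m if $m < $k;
    $k = $y if $y < $k;

    # Pure black: no colored ink is needed at all
    return (0, 0, 0, 255) if $k == 255;

    # Remove the shared gray and rescale what is left to 0..255
    my $scale = 255 / (255 - $k);
    my @cmy = map { int(($_ - $k) * $scale + 0.5) } $c, $m, $y;
    return (@cmy, $k);
}

print join(", ", rgb_to_cmyk(255, 0, 0)), "\n";   # prints 0, 255, 255, 0
print join(", ", rgb_to_cmyk(0, 0, 0)), "\n";     # prints 0, 0, 0, 255
```

With this rule, black text uses only the black ink rather than three overlapping layers.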

The “1 minus RGB” method of converting to CMYK is fine for tasks where exact colors are not necessary. To achieve good color matching, though, you can’t treat CMYK as a completely subtractive model because the translucency of inks, the interaction between inks, and differences in substrates make the conversion nonlinear.

Good color matching schemes require various corrections that are determined by experimentation. Unfortunately, a lot of the study and work on the subject is not in the public domain; most of the information is patented by companies that create “color systems” like Pantone or TRUMATCH. That’s the primary reason why a lot of free software (such as the Gimp, described in Chapter 5) lacks the CMYK capabilities of proprietary graphics software.

Color Depth

The average human can distinguish between about 7 million different colors, which can be easily represented in 24 bits. An image with a color depth of 24 bits per pixel (or more) is known as a truecolor image. Each pixel in the image is saved as a group of three bytes, one for each of the red, green, and blue elements of the pixel: 8 red bits + 8 green bits + 8 blue bits = 24 bits. Each of the R, G, and B elements can be represented as one of 256 (2⁸) values, which gives us 256³ or 16,777,216 possible colors. This also means that a 200 × 200 pixel truecolor image saved in an uncompressed format would take up 120 KB for the image data alone, and a 500 × 500 pixel image would take up 750 KB. Both of these images would be too large to put on a web page, which is why the image formats used on the Web are compressed file formats (see the section Compression later in this chapter).
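The arithmetic behind these figures can be checked with a few lines of Perl. This is only a sketch: raw_image_bytes is a hypothetical helper, and a kilobyte is taken as 1,000 bytes to match the numbers above.

```perl
use strict;
use warnings;

# Hypothetical helper: bytes of raw pixel data in an uncompressed image
sub raw_image_bytes {
    my ($width, $height, $bits_per_pixel) = @_;
    return $width * $height * $bits_per_pixel / 8;
}

print 256 ** 3, " possible colors\n";                   # prints 16777216 possible colors
print raw_image_bytes(200, 200, 24) / 1000, " KB\n";    # prints 120 KB
print raw_image_bytes(500, 500, 24) / 1000, " KB\n";    # prints 750 KB
```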

The PNG format allows you to save color images with a depth of up to 48 bits per pixel, or grayscale images at 16 bits per pixel. This is actually beyond the display capacity of most consumer video hardware available today, where 24-bit color is the standard. JPEG also lets you store images with a color depth of up to 36 bits. GIF does not handle truecolor images.

A bitmap is an image with a color depth of 1. That is, each pixel is either the foreground color or the background color. The XBM format is an example of a bitmapped image format. Other images represent colors in fewer than 24 bits, usually by storing a collection of color values in a separate color table.

Color Tables

An image with a color depth of 8 bits is sometimes called a pseudocolor or indexed color image. Pseudocolor allows at most 256 colors through the use of a palette, which is sometimes also referred to as a color table or a Color Lookup Table (CLUT). Rather than storing a red, green, and blue value for each pixel in the image, an index to an element in the color table (usually an 8-bit index) is stored for each pixel. The color table is usually stored with the image, though applications should also provide default color tables for images without stored palettes.
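The relationship between pixels and the color table can be sketched with ordinary Perl arrays. The palette, the pixel data, and the lookup_color helper below are invented for illustration and do not reflect the in-memory layout of any particular file format.

```perl
use strict;
use warnings;

# A tiny made-up palette; a real one would hold up to 256 entries
my @palette = (
    [  0,   0,   0],   # index 0: black
    [255, 255, 255],   # index 1: white
    [255,   0,   0],   # index 2: red
);

# The image data stores one palette index per pixel, not an RGB triple
my @pixels = (0, 2, 2, 1);

# Hypothetical helper: resolve a palette index to its RGB components
sub lookup_color {
    my ($index) = @_;
    return @{ $palette[$index] };
}

# Displaying the image means looking each index up in the color table
for my $index (@pixels) {
    my ($r, $g, $b) = lookup_color($index);
    print "index $index -> ($r, $g, $b)\n";
}
```

Each pixel costs one byte instead of three, at the price of the one-time storage of the table itself.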

To save a truecolor image as a pseudocolor image, you must first quantize it to the size of the palette (256 colors for a GIF or an indexed PNG). Quantization alone usually gives you an image that is unacceptably different from the source image, especially in images with many colors or subtle gradients or shading. To improve the quality of the final image, the quantization process is usually coupled with a dithering process that takes the available colors and tries to approximate the colors in the original by combining the colors in various pixel patterns.
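Whatever palette is chosen, quantization ends by mapping each pixel to one of its entries. The sketch below shows only that mapping step. It assumes the palette already exists and that the closest color is the one at the smallest squared distance in the RGB cube, which is one simple choice and not necessarily what any given tool does. nearest_index is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the palette entry closest to a pixel,
# using squared Euclidean distance in RGB space
sub nearest_index {
    my ($pixel, $palette) = @_;
    my ($best, $best_dist);
    for my $i (0 .. $#$palette) {
        my $dist = 0;
        $dist += ($pixel->[$_] - $palette->[$i][$_]) ** 2 for 0 .. 2;
        if (!defined($best_dist) || $dist < $best_dist) {
            ($best, $best_dist) = ($i, $dist);
        }
    }
    return $best;
}

# A made-up four-color palette: black, white, red, blue
my @palette = ([0, 0, 0], [255, 255, 255], [255, 0, 0], [0, 0, 255]);

print nearest_index([250, 10, 20], \@palette), "\n";   # prints 2 (red)
```

Every pixel in a smooth gradient that falls nearest the same entry collapses to one flat color, which is the banding that dithering tries to hide.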

Figure 1-2 shows a 24-bit image (left). That image must be quantized to an optimal 256 colors (the 256 colors that occur most frequently in the image) to save it as an 8-bit indexed image (middle). Dithering is applied with the Floyd-Steinberg dithering process to improve image quality (right).

Figure 1-2. A 24-bit image (left) quantized to 256 colors (middle) and dithered (right)
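To give a feel for how Floyd-Steinberg dithering works, here is a deliberately simplified sketch. It assumes a grayscale image and a two-entry palette (black and white) rather than 256 colors, and dither_gray is a hypothetical helper. The error weights 7/16, 3/16, 5/16, and 1/16 are the standard ones for this algorithm.

```perl
use strict;
use warnings;

# Hypothetical helper: Floyd-Steinberg dithering of a grayscale image
# (array ref of rows, values 0..255) down to pure black and white
sub dither_gray {
    my ($rows) = @_;
    my @work   = map { [ @$_ ] } @$rows;   # work on a copy of the input
    my $height = @work;
    my $width  = @{ $work[0] };
    my @out;

    for my $y (0 .. $height - 1) {
        for my $x (0 .. $width - 1) {
            my $old = $work[$y][$x];
            my $new = $old < 128 ? 0 : 255;   # nearest available color
            $out[$y][$x] = $new;

            # Spread the rounding error over pixels not yet visited
            my $err = $old - $new;
            $work[$y][$x + 1] += $err * 7 / 16 if $x + 1 < $width;
            if ($y + 1 < $height) {
                $work[$y + 1][$x - 1] += $err * 3 / 16 if $x > 0;
                $work[$y + 1][$x]     += $err * 5 / 16;
                $work[$y + 1][$x + 1] += $err * 1 / 16 if $x + 1 < $width;
            }
        }
    }
    return \@out;
}

# A flat mid-gray 4 x 4 image becomes a pattern of black and white pixels
my $dithered = dither_gray([ map { [ (128) x 4 ] } 1 .. 4 ]);
print join(" ", @$_), "\n" for @$dithered;
```

Because each pixel passes its rounding error on to its neighbors, the average brightness of a region stays close to the original even though every pixel is forced to a palette color.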

The GIF file format is an indexed color file format, and a PNG file can optionally be saved as an indexed color image. A GIF image always has at most 256 colors in its palette. Animated GIFs (multiple images in one file) can have a new palette for each image, so the 256 color limit is applicable to only one image of a multi-image sequence. A PNG may also have a 256 color palette. Even if a PNG image is saved as a 24-bit truecolor image, it may contain a palette for use by applications on platforms without truecolor capability.

You can use the Image::Magick interface described in Chapter 3 for color reduction. Example 1-1 reads in a 24-bit JPEG image and converts it to an indexed GIF. Two output images are created for comparison: a 16-color dithered version, and one without dithering.

Example 1-1. Converting a truecolor image to 16 colors, with and without dithering
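#!/usr/bin/perl -w

use strict;
use Image::Magick;

my $image = Image::Magick->new;

my $status = $image->Read('24bitimage.jpg');
die "$status\n" if $status;

my $image2 = $image->Clone();

# Reduce to 16 colors with and without dithering

$image->Quantize(colorspace => 'RGB',
                 colors => 16, dither => 1);

$image2->Quantize(colorspace => 'RGB',
                  colors => 16, dither => 0);

# Write both results as GIFs for comparison
# (the output filenames here are placeholders)

$status = $image->Write('gif:dithered.gif');
warn "$status\n" if "$status";
$status = $image2->Write('gif:undithered.gif');
warn "$status\n" if "$status";

undef $image;
undef $image2;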

Transparency and Alpha

Transparency in graphics allows background colors or background images to show through certain pixels of the image. Often, transparency is used to create images with irregularly shaped borders (i.e., non-rectangular images). The three primary web file formats (GIF, JPEG, and PNG) have varying degrees of support for transparency.

Transparency is not currently supported in JPEG files, and it will most likely not be supported in the future because of the particulars of the JPEG compression algorithms and the photography niche at which JPEG is aimed.

The GIF file format handles transparency by allowing you to mark one index in a color table as the transparent color. The display client uses this transparency index when displaying the image; pixels with the same index as the transparency index are simply “left out” when the image is drawn. Each image in a multi-image sequence can have its own transparency index.
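The effect of a transparency index can be sketched in a few lines. This is an illustration of the idea only, not of how any real display client is written. draw_row is a hypothetical helper, and color names stand in for RGB values to keep the output readable.

```perl
use strict;
use warnings;

# Hypothetical helper: draw one row of indexed pixels over a background row,
# leaving out every pixel whose index is the transparency index
sub draw_row {
    my ($pixels, $palette, $transparent_index, $background) = @_;
    my @row = @$background;   # start with what is already on screen
    for my $x (0 .. $#$pixels) {
        my $index = $pixels->[$x];
        next if defined($transparent_index) && $index == $transparent_index;
        $row[$x] = $palette->[$index];
    }
    return @row;
}

my @palette = ('black', 'white', 'red');   # names stand in for RGB triples
my @result  = draw_row([1, 0, 2, 1], \@palette, 1, [ ('blue') x 4 ]);
print "@result\n";   # prints blue black red blue
```

A pixel is either drawn or skipped entirely; there is no way to say "half transparent" with a single index.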

The PNG format allows for better transparency support by allowing more space for describing the transparent characteristics of the image, although the full range of its capabilities is not necessarily supported by all web clients. PNG images that contain grayscale or color data at 8 or 16 bits per channel may also contain an alpha channel (also called an alpha mask), which is an additional 8 or 16 bits per pixel (matching the bit depth of the other channels) that represent the transparency level of each pixel. An alpha level of 0 indicates complete transparency (i.e., the pixel should not be displayed), and an alpha value of 2ⁿ − 1 (where n is the bit depth of the alpha channel) indicates that the pixel should be completely opaque. The values in between indicate a relative level of translucency.
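One way to read those in-between values is as a mixing ratio between the pixel and whatever lies behind it. The sketch below assumes the usual linear blend of foreground over background, which the text above does not spell out, and blend_channel is a hypothetical helper that handles a single channel.

```perl
use strict;
use warnings;

# Hypothetical helper: blend one channel of a foreground pixel over the
# background, given an alpha value in the range 0 .. 2**$bits - 1
sub blend_channel {
    my ($fg, $bg, $alpha, $bits) = @_;
    my $max = 2 ** $bits - 1;   # the fully opaque alpha value
    return int(($fg * $alpha + $bg * ($max - $alpha)) / $max + 0.5);
}

print blend_channel(200, 100,   0, 8), "\n";   # prints 100 (fully transparent)
print blend_channel(200, 100, 255, 8), "\n";   # prints 200 (fully opaque)
print blend_channel(200, 100, 128, 8), "\n";   # prints 150 (about half)
```

Applying the same blend to the red, green, and blue channels of each pixel is what lets an edge fade into the background instead of ending abruptly.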

Figure 1-3 shows PNG and GIF transparency. Both images are rectangular, but use transparency to give the impression of a circle. The GIF image (on the left) can only have a well-defined border to the circle because GIF only permits pixels to be drawn or transparent, with no gradation. The PNG image (on the right) has a fuzzy border because PNG permits levels of transparency.

Figure 1-3. GIF’s 1-bit transparency (left) versus PNG’s full alpha channel (right)
