Programming Web Graphics with Perl and GNU Software
By Shawn Wallace
February 1999
Pages: 466



Table of Contents

Chapter 1: Image File Formats
Because graphics files are stored as binary data and are unreadable by humans (actually, parts of graphics files are readable by humans, if you know what you're looking for), most people are intimidated into not looking under the hood at the internals of image file formats. Of course, it is a Good Thing that as a web author you can think of an image as a "black box" that somehow understands its own image-ness. But image file formats are not necessarily inscrutable objects if you really want to know how they work, and understanding the structure of the files that you work with on a daily basis can help you remember the vagaries of image manipulation. Knowing how a GIF file is formatted, for example, will help you answer these questions:
  • Why isn't a GIF with 129 colors smaller than one with 256 colors?
  • Can a multi-image GIF have more than one transparent color?
  • What is the maximum color depth of a GIF?
  • How does a decoder program know that a file is a GIF?
  • How can I make the smallest possible multi-image file?
Hopefully this chapter will help demystify image file formats and help you feel more at home with the binary black boxes called GIFs, PNGs, and JPEGs.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Network Graphics Basics
Creating graphics file formats for distribution over variable-speed communications networks (such as the Internet) poses a number of problems. Each end user's computer may be connected at speeds as slow as 14.4 kilobits per second or as fast as several megabits per second, and you would like them all to be able to download and display graphics at some sort of reasonable speed. The Internet started as a place where the common coin was text. ASCII text is easy; one byte per character keeps the average missive to a size where near-instantaneous communication is possible. Graphics, however, are much more information-intensive. The proverbial picture worth a thousand words can actually translate into hundreds of thousands of words when it comes to sending that picture over the Internet. To deal with network graphics, people have developed a toolkit of structuring conventions and compression tricks that make possible the graphics-intensive Web that we know and love. This section will provide an overview of this vocabulary and point out how GIF, PNG, and JPEG (what we will call the web graphics formats) actually implement these concepts.
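To put rough numbers on the gap between text and graphics, here is a back-of-the-envelope sketch. The payload sizes and the 1.5 Mbps line are made-up illustrative values, and the arithmetic ignores protocol overhead, latency, and modem compression, so treat the results as ballpark figures only.

```perl
use strict;
use warnings;

# Estimated seconds to move $bytes over a link of $bits_per_second.
# Assumption: no protocol overhead, latency, or modem compression.
sub transfer_seconds {
    my ($bytes, $bits_per_second) = @_;
    return ($bytes * 8) / $bits_per_second;
}

# Hypothetical payloads: a 2,000-character ASCII message and a 100 KB image.
my %payload = ('text message' => 2_000, 'image' => 100 * 1024);

# Illustrative link speeds, in bits per second.
my %link = ('14.4 kbps modem' => 14_400, '1.5 Mbps line' => 1_500_000);

for my $what (sort keys %payload) {
    for my $speed (sort keys %link) {
        printf "%-12s over %-15s: %7.2f seconds\n",
            $what, $speed, transfer_seconds($payload{$what}, $link{$speed});
    }
}
```

Under these assumptions the text message crosses the modem link in about a second, while the image takes most of a minute.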
No, this section is not about hunting and fishing. Web graphics formats can be thought of as data streams broken up into fields (so much for the outdoor activity metaphor). Everything that is transferred over the Web may be thought of as a data stream—a series of data packets received one at a time and assembled into a sequential data structure. This data structure is in turn divided into fields. The GIF and JPEG formats call these fields blocks, and PNG calls them chunks. Fields are a fixed, predictable data structure stored within an image file whose layout is defined by the file format specification. Typically a field will contain information about an image's dimensions, how the colors are defined within the image, special information needed by a display device to properly render the image, etc. These fields of information are often structured so that it is easy for a program displaying the image to quickly extract all the information it needs.
Graphics Interchange Format (GIF)
At the beginning of this chapter, I posed five questions about the GIF file format. Well, before we delve into the inner workings of the GIF format, here are some answers:
  • Why isn't a GIF with 129 colors smaller than one with 256 colors?
    This is because the number of entries in the color table is not directly stored within the GIF file; it is actually calculated from the number of bits used to represent each index in the table. The total number of entries in the color table is calculated by raising 2 to the power of the number of bits per entry; thus if the number of bits used per index is 8, there are a maximum of 256 entries in the table (2^8). A 129-color palette requires 8 bits per entry (because 2^7 = 128), which means that a 256-color palette will be allocated, even if only 129 colors are actually used.
  • Can a multi-image GIF have more than one transparent color?
    Yes, each image in a multi-image sequence may have its own local palette, which may contain its own transparent color index. The transparent index for a color table is defined in the Graphics Control Extension (described later). According to a strict interpretation of the GIF specification, only one graphic control block is allowed per image, so each image can have its own unique transparent color.
  • What is the maximum color depth of a GIF?
    The maximum color depth of a GIF is 256 colors, because each pixel is represented as a single byte, which can be an index to at most 2^8 = 256 colors.
  • How does a decoder program know that a file is a GIF?
    The first 3 bytes of the GIF file are always the hexadecimal string "0x47 0x49 0x46" which is the string "GIF" in ASCII characters. Bytes 4-6 are either the hex string "0x38 0x37 0x61" or "0x38 0x39 0x61," which is either "87a" or "89a" in ASCII, depending on the version of encoding used. Unfortunately, some applications (such as certain web browsers) determine the content of the file solely by the extension used on the filename.
  • How can I make the smallest possible multi-image file?
    This is a loaded question, but one way would be to make sure that your image manipulation software is using global palettes when color tables can be shared by more than one image. Each image may have its own local palette, which (for a 256-color palette) can add 768 bytes per palette. That's actually not that much, but every bit counts. Also, the GIF format allows you to provide an offset (and to choose a "disposal" method) for each image in a file; the size of the file may be reduced by removing redundant data (either by creating a bounding box around the changed area and cropping away the rest, or by analyzing the image on a pixel-by-pixel basis and setting unchanged pixels to transparent) and allowing underlying frames to show through. An example is given in Chapter 9, and the Gimp (described in Chapter 7) provides an Animation Optimizer that automates this technique.
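Two of the answers above lend themselves to a short sketch: recognizing the GIF signature in the first six bytes, and working out how many color table entries a given palette will occupy. The function names are hypothetical, and the three-bytes-per-entry figure is inferred from the 768-byte cost of a 256-color palette mentioned above. This is not a GIF parser, just the arithmetic.

```perl
use strict;
use warnings;

# Return the GIF version string ("87a" or "89a") if the data starts with
# a GIF signature, or undef otherwise.
sub gif_version {
    my ($bytes) = @_;
    return $bytes =~ /\AGIF(87a|89a)/ ? $1 : undef;
}

# Number of color table entries allocated for a palette of $colors colors:
# the smallest power of two that can hold them (from 2 up to 256).
sub color_table_entries {
    my ($colors) = @_;
    my $bits = 1;
    $bits++ while (2 ** $bits) < $colors && $bits < 8;
    return 2 ** $bits;
}

# Bytes used by the color table: three bytes (red, green, blue) per entry.
sub color_table_bytes {
    my ($colors) = @_;
    return 3 * color_table_entries($colors);
}

for my $colors (2, 128, 129, 256) {
    printf "%3d colors -> %3d entries, %3d bytes\n",
        $colors, color_table_entries($colors), color_table_bytes($colors);
}
print "version: ", (gif_version("GIF89a\x01\x00") // 'not a GIF'), "\n";
```

Note that 129 and 256 colors produce the same table size, which is the point of the first answer.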
Portable Network Graphics (PNG)
In the "GNU's Not Unix" tradition of recursive acronyms, PNG may unofficially be taken to stand for "PNG's Not GIF." PNG was designed as an open standard alternative to GIF, and it plays that role very well. PNG will not completely replace GIF, however, if only because PNG can only store one image per file and there are a million web pages out there that are full of GIF images.
A PNG file is assembled as a series of chunks which, for all intents and purposes, are the equivalent of GIF's blocks. PNG just has a friendlier name for the structure. The 1.0 PNG specification defines a number of standard chunks, of which four are considered "critical chunks." At least three of the critical chunks must be present in every valid PNG format file. The non-critical standard chunks are sometimes called "ancillary chunks." The critical and ancillary chunks, along with a short description of each, are listed in Table 1.2 and Table 1.3. Critical chunk codes begin with a capital letter; ancillary chunks begin with a lowercase letter.
Table 1.2: Critical Chunks
Name              Description                          Code
Header chunk      Global information about the image   IHDR
Palette chunk     A palette (optional)                 PLTE
Image Data chunk  The compressed image data            IDAT
Image End chunk   The end-of-file marker
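The capitalization rule described above is easy to check in code. The following is a minimal sketch with a hypothetical function name; it looks only at the case of the first letter of a four-letter chunk code. The tEXt code is used as an example of an ancillary chunk and does not appear in the table above.

```perl
use strict;
use warnings;

# Classify a four-letter PNG chunk code by the case of its first letter:
# uppercase means critical, lowercase means ancillary.
# Returns nothing (undef in scalar context) for a malformed code.
sub chunk_class {
    my ($code) = @_;
    return unless defined $code && $code =~ /\A[A-Za-z]{4}\z/;
    return $code =~ /\A[A-Z]/ ? 'critical' : 'ancillary';
}

for my $code (qw(IHDR PLTE IDAT tEXt)) {
    printf "%s is %s\n", $code, chunk_class($code);
}
```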
JPEG
JPEG stands for Joint Photographic Experts Group, which is the name of the committee set up by the International Organization for Standardization (ISO) that originally wrote the image format standard. The JPEG committee has the responsibility of determining the future of the JPEG format, but the actual JPEG software that makes up the toolkit used in most web applications is maintained by the Independent JPEG Group (http://www.ijg.org).
The JPEG standard actually only defines an encoding scheme for data streams and not a specific file format. JPEG encoding is used in many different file formats (TIFF v.6.0 and Macintosh PICT are two prominent examples), but the format used on the Web is called JFIF, an acronym that stands for JPEG File Interchange Format, which was developed by C-Cube Microsystems (http://www.c-cube.com) and placed in the public domain. JFIF became the de facto standard for web JPEGs because of its simplicity. When people talk about a JPEG web graphic, they are actually referring to a JPEG-encoded data stream stored in the JFIF file format. In this book we will refer to JFIFs as JPEGs to reduce confusion, or to further propagate it, depending on your point of view.
To create a JPEG you must start with a high-quality image sampled with a large bit depth, from 16 to 24 bits, for the best results. You should generally only use JPEG encoding on scanned photographs or continuous-tone images (see Section 1.1.6 earlier in this chapter).
JPEG encoding takes advantage of the fuzzy way the human eye interprets light and colors in images by throwing out certain information that is not perceived by the viewer. This process creates a much smaller image that is perceptually faithful to the original. The degree of information loss may be adjusted so that the size of an encoded file may be altered at the expense of image quality. The quality of the resulting image is expressed in terms of a Q factor, which may be set when the image is encoded. Most applications use an arbitrary scale of 1 to 100, where the lower numbers indicate small, lower-quality files and the higher numbers indicate larger, higher-quality files. Note that a Q value of 100 does not mean that the encoding is completely lossless (although you really won't lose much). Also, the 1 to 100 scale is by no means standardized (the Gimp uses a 0 to 1.0 scale), but this is the scale used by the IJG software, so it is what we will use here. There are a few guidelines for choosing an optimal Q factor:
References
The Encyclopedia of Graphics File Formats, 2nd Edition (an excellent tome covering more than 100 file formats, including GIF, PNG and JPEG):
Murray, James D., and William vanRyper, O'Reilly & Associates, 1996
License Information on GIF and Other LZW-based Technologies:
http://corp2.unisys.com/LeadStory/lzwfaq.html
The PNG Specification:
ftp://ftp.uu.net/graphics/png/documents/png-1.0-w3c-single.html.gz
An explanation of compositing partially transparent pixels from the PNG specification:
ftp://ftp.uu.net:/graphics/png/documents/png-1.0-w3c-single.html.gz#D.Alphachannel-processing
The Deflate algorithm that PNG uses as its compression method:
ftp://ds.internic.net/rfc/rfc1951.txt
Some information on the CRC algorithm (ISO 3309):
http://bbs-koi.uniinc.msk.ru/tech1/1994/er_cont/crc.htm
MNG home page:
http://www.cdrom.com/pub/mng/
JPEG home page:
http://www.jpeg.org/
JPEG FAQ:
http://www.faqs.org/faqs/jpeg-faq/
JPEG: Still Image Data Compression Standard (a book containing the complete ISO JPEG standards):
Pennebaker, William B., and Joan L. Mitchell, Van Nostrand Reinhold, New York, 1993.
The comp.compression FAQ:
http://www.faqs.org/faqs/compression-faq/
Greg Roelofs' PNG Page with current comparisons of PNG support by different browsers:
http://www.cdrom.com/pub/png/
Chapter 2: Serving Graphics on the Web
When creating graphics, one should keep in mind that not all web browsers handle all "standard" HTML features in the same way. Specifically, browsers have been known to have their own idiosyncratic interpretations of the ALT attribute, client-side image maps, the USEMAP attribute, GIF89a animation, image spacing attributes, transparency, inline PNG/XBM/Progressive JPEG images, the LOWSRC attribute, borders on image links, alignment tags, and scaling tags.
In short, just about every feature that should have a standard implementation has, at one time or another, had different levels of compliance to the standard on different browsers. As of this writing, the most popular browsers implement the HTML 3.2 standards such that they can be trusted with your exquisitely crafted HTML code. However, be sure to do a little research before making your knockout web page depend on some new or proprietary feature. You may be writing off a significant portion of your potential audience who can't see it.
The same could be said for external image-viewing plug-ins. Now that we are several years into the Web Revolution, users are spending more time using the Web to get exactly the information they want and less time "surfing" the waves of information overload. Developers are also realizing the ramifications of the time and costs associated with keeping a web site going until Doomsday. If you are thinking about adding critical images (or even non-critical information, like goofy animated buttons) that require the use of third-party extensions or external plug-ins, think about it very carefully. Assuming that your web site will be around for a while, where will this technology be in five years? Ten years? It's a pretty good bet that currently adopted standards (PNG, JPEG, even GIF) will be supported well into the future. In general, don't make your users download and install plug-ins to get to your content. Keep in mind that most people on the Web use their browsers for just that: browsing.
That said, there are a few applications that force the use of plug-ins. If you wish to participate in the web audio or streaming media scenes, you're going to have to choose one of the many competitive proprietary solutions that have yet to be universally adopted. The real purview of plug-ins and extensions is in the design of Intranets, where you, the developer, have dictatorial control over what is installed on everyone's desktop. If you have 20,000 images in RAD format that have to be seen on your corporate Intranet, by all means, require your users to use a RAD plug-in! (However, you could also write a script using Image::Magick to batch convert them to PNG files, which could also be used on your public web site, but we'll talk about that in Chapter 5.)
The Server and CGI
Generally, when people talk about a "web server," they are referring to two things: a program that accepts a request for resources (i.e., HTML pages, images, applets, DOM objects, etc.) and returns resources, and a collection of resources to return (and, I guess, a computer where they both live, if you want to get technical). When we talk about a web server, we will be referring to the program that does the serving; we assume that there is also a corresponding collection of web pages and images to be served.
The requests made to and responses returned by the web server must be in a standard form that is described by the Hypertext Transfer Protocol (HTTP). Web servers are an inherently simple concept and may be very simple programs or very complicated affairs. The popular Apache web server (http://www.apache.org) falls somewhere in between. Apache takes a modular design approach; it has a very fast, simple core set of operations that may be extended with other modules. Whichever web server you are running, its primary function is the basic capability of handling requests and returning resources.
When a web browser requests an image from a server, the request is in the form of a URL, just like any other HTTP request. This URL points to a file that resides on the same computer as the web server, in its collection of resources. Generally the web server points to a specific root directory which it will use to determine the location of the requested image. For example, the URL http://www.shemp.net/splashscreen.png would point at a file in the Portable Network Graphics (PNG) format located in the web server's root directory. If this URL is requested by a client (and it exists), the web server determines a MIME type for the file and sends back an HTTP header to the client. The web server then reads the data in the file pointed to by the URL and immediately follows the header with a stream of data from the file.
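The step where the server picks a MIME type and writes a header can be sketched roughly as follows. The extension table, the fallback type, and both function names are assumptions made for illustration; a real server such as Apache reads its type mappings from configuration and sends more header fields than the two shown here.

```perl
use strict;
use warnings;

# Assumed extension-to-MIME-type table for the web graphics formats.
my %MIME_TYPE = (
    png  => 'image/png',
    gif  => 'image/gif',
    jpg  => 'image/jpeg',
    jpeg => 'image/jpeg',
);

# Guess a MIME type from the filename extension of a URL path.
sub mime_type_for {
    my ($path) = @_;
    my ($ext) = $path =~ /\.([^.\/]+)\z/;
    return 'application/octet-stream' unless defined $ext;
    return $MIME_TYPE{ lc($ext) } || 'application/octet-stream';
}

# Build the header block that precedes the image data in the response.
sub response_header {
    my ($path, $length) = @_;
    return "HTTP/1.0 200 OK\r\n"
         . "Content-Type: " . mime_type_for($path) . "\r\n"
         . "Content-Length: $length\r\n"
         . "\r\n";
}

# The 1024-byte length is a made-up value for the example URL above.
print response_header('/splashscreen.png', 1024);
```

The blank line at the end of the header marks where the stream of image data begins.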
Certain MIME types are said to be registered, i.e., clients implementing the HTTP protocol should at least recognize them as valid MIME types. Some of the registered MIME types of interest are:
Web Graphics and the Browser
As an example of how a web browser processes and organizes images internally, let's look at the model used by the Mozilla web browser, which is the "open source" release of the Netscape browser. This is a good object of study because the browser's code is freely available under the Netscape Public License, and the code is amply documented at Mozilla headquarters (http://www.mozilla.org).
Mozilla is written in C++. It is designed as a modular system with well-defined tasks handled by different modules. The layout of a page is handled by a layout engine called NGLayout (for Next Generation Layout). NGLayout is built on the open standards for Internet content (HTML, Cascading Style Sheets, and the Document Object Model) and it handles all the tasks associated with creating the layout of a page and rendering all of the page's components. The layout engine draws all of the geometric primitives (such as rules and table borders), places text, and renders images by interacting with a lower-level Image Library. This Image Library is what actually manages the flow, the decoding, and the eventual display of images. When the layout engine calls for an image to be rendered, the Image Library takes the following steps:
  1. A URL is requested by the code that handles the layout of the page. This request could be initiated by the parsing of an <IMG> element or by one of the user interface options, such as the "Show Images" button on the navigation bar. This request is made with the GetImage( ) function.
  2. The Image Library maintains a cache of previously requested and decoded images. With each image request, the Image Library will look for the image in the cache and, if it finds it, will draw the previously requested image from the cache. If it is not in the cache, the Library will open a data stream to get the image data, create a new image object, decompress and decode the data, and store the object in the cache, according to the pragmas that are associated with the file. A request to the Image Library returns an ImageReq
Presenting Images in HTML
As of the HTML 3.2 specification, there was only one way to include an image in an HTML document, and that was with the <IMG> tag. With the HTML 4.0 standard there are two equally supported means of including images: the <IMG> tag and the <OBJECT> tag. The <OBJECT> element is intended as a more open solution to the problem of inserting inline media into web documents. The examples in this book all use the <IMG> tag, because it is very likely that 3.2 will still be the most widely implemented standard for quite some time. However, we will look at both the <IMG> and <OBJECT> forms in this chapter.
The <IMG> element embeds an image in the body of a document (it cannot be used in the head section). The element consists of a start tag without an end tag, and does not include content as such. It is formed according to standard HTML syntax, which is to say it should look like this in its simplest form:
<IMG SRC="someimage.png">  <!-- include an inline PNG -->
In Perl, you may use the HTML::Element module to create image tags. This module is designed to let you build the nodes of an HTML syntax tree with method calls. It can be used as in the following example:
use strict;
use warnings;
use HTML::Element;         # CPAN module for building HTML syntax trees

# Set attributes when creating the element...
my $img = HTML::Element->new('img', src => 'someimage.gif');

# ...or add them later with the attr() method
$img->attr('alt', 'This is Some Image!');

# Use as_HTML() to print the element as an HTML tag
print $img->as_HTML;
The SRC attribute indicates the URL of the image. This can be an absolute or relative address, and it may refer to a file to be read as data or to a script that is to be run to create the proper image output. The syntax in either case is the same:
<IMG SRC="images/staticfile.gif" >
<IMG SRC="cgi-bin/dynamicscript.cgi">
In the first case, the browser reads the file at the given URL (in this case, a local file), interprets the image data within that file, and displays that data inline at the proper place in the document. In the second case, the browser requests the CGI script at the given URL, the web server invokes the script, and the script's output (whatever is written to STDOUT) will be sent to the web browser for inclusion in the web page. From the browser's point of view, we don't really care what language the script is written in, just that it adheres to the Common Gateway Interface and sends valid image data back to us. If the script fails for some reason, the image data coming back will not be in a valid image format, and the browser will display a broken image icon. The broken image icon is shown in Figure 2.2.
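As a rough illustration of the second case, here is a minimal sketch of the script side: a routine that writes a Content-type header followed by raw image bytes to an output handle. The send_image name and its arguments are hypothetical, and the script simply copies an existing file rather than generating an image. Taking the filename from the command line is only a convenience for trying it out; a real CGI script would write to STDOUT with a path of its own choosing.

```perl
use strict;
use warnings;

# Write a CGI response for the image at $path to the handle $out.
# Returns 1 on success, or 0 (writing nothing) if the file cannot be
# opened; a browser that gets no valid image data back would show its
# broken image icon.
sub send_image {
    my ($out, $path, $mime_type) = @_;
    open my $in, '<', $path or return 0;
    binmode $in;                 # image data is binary
    binmode $out;
    print {$out} "Content-type: $mime_type\r\n\r\n";
    local $/ = \8192;            # read fixed-size 8 KB records
    while (my $buffer = <$in>) {
        print {$out} $buffer;
    }
    close $in;
    return 1;
}

# Try it from the command line: perl script.cgi images/staticfile.gif
if (@ARGV) {
    send_image(\*STDOUT, $ARGV[0], 'image/gif')
        or warn "Could not open $ARGV[0]\n";
}
```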
Colors and the Web Browser
Images displayed on monitors with a limited number of available colors (typically 256) will be dithered when they are rendered. Netscape and Internet Explorer on Macintosh or Windows systems support a 216-color "web-safe" palette that will be accurately represented on all systems. These 216 colors are a subset of the Macintosh's 256-color system palette; the extra 40 colors are different from colors found in Win32 standard palettes. Colors not in this safe palette will be dithered to varying degrees on 8-bit systems. The 216 colors of the web-safe palette may be modeled as a 6 × 6 × 6 color cube.
Browsers running on Unix platforms running the X11 window system will sometimes use the 6 × 6 × 6 cube (216 colors), a 5 × 5 × 5 cube (125 colors), or even a 4 × 4 × 4 cube (64 colors), depending on how many colors are available to the browser. On an 8-bit system, Netscape may be run with the -install option specified on the command line to start the browser with the 216 color palette.
In later chapters, we will be writing scripts to generate graphics that are to be displayed on web browsers using this limited palette. These scripts will often need to allocate colors in the color table of the image. You may find it useful to have a utility script for computing the nearest color in the 6 × 6 × 6 color cube given a list of RGB values.
The 216 colors of the 6 × 6 × 6 color cube are those for which each of the red, green, and blue values is one of 00, 33, 66, 99, CC, or FF (these numbers in decimal are 0, 51, 102, 153, 204, and 255). We'll call these our "safe" values. The following Perl script will take a list of the red, green, and blue values in decimal and will return the closest match for the color in the web-safe palette:
sub WebSafeColor {
    # Returns the closest color in the 216 color web-safe palette.
    #
    my ($red, $green, $blue) = @_; 
    my (@returnlist, $max, $hex);       
   
    # Find closest value in the 6x6x6 "web-safe" color cube, 
    # algorithm described below.
    #
    foreach my $number ($red, $green, $blue) {
        LOOP: for ($max = 25; $max < 281; $max += 51) {
            if ($number <= $max) {
                push @returnlist, ($max - 25);
                $hex .= sprintf("%02X", $max - 25);
                last LOOP;
            }
        }
    }
    return (@returnlist, "#$hex");
}
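Because the safe values are evenly spaced multiples of 51, the same result can also be reached with a little arithmetic instead of a loop. The sketch below is an alternative formulation, not the routine above; the function names are hypothetical, and it assumes integer channel values, clamping anything outside 0 to 255.

```perl
use strict;
use warnings;

# Round one channel value to the nearest multiple of 51.
# Assumes an integer input; out-of-range values are clamped to 0..255.
sub nearest_safe_value {
    my ($value) = @_;
    $value = 0   if $value < 0;
    $value = 255 if $value > 255;
    return 51 * int(($value + 25) / 51);
}

# Map an (R, G, B) triple to its web-safe neighbor, returning the three
# decimal values followed by a "#RRGGBB" string.
sub web_safe_rgb {
    my @safe = map { nearest_safe_value($_) } @_;
    return (@safe, sprintf('#%02X%02X%02X', @safe));
}

my @color = web_safe_rgb(200, 30, 120);
print "@color\n";    # prints "204 51 102 #CC3366"
```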
The Once and Future Browser
One of the key ingredients to the success and beauty of the Web is that it started with a set of specifications and requirements that were simple and robust enough so that anyone who wanted to could participate in publishing to the world. As we move into the world of the "Object Web" and more complicated specifications, anyone can still participate on some level, but overall the Web is not as simple a place as it once was. Once XML (Extensible Markup Language) is popularized, the field becomes even more cluttered.
Probably the most significant contribution of XML to the web graphics world is the possibility of representing vector graphics within the markup of a document, and allowing the client to render the graphics. Scalable vector graphics have been integrated into the Web, mostly as plug-ins or external viewers for formats such as PDF or CGM. An up-and-coming format is the Precision Graphics Markup Language (PGML), an XML application with an imaging model based on that of PostScript. Scalable graphics represented in a compact, portable vector form will truly kindle a revolution in the look and feel of the Web.
The Synchronized Multimedia Integration Language (SMIL) is another XML-based language that deserves attention. It may be used to represent media actions that may be triggered by various time controls. It will be used to make the Web a more multimedia-friendly place.
Web multimedia and vector graphics are two fields of expansion that are filled with potential. It may be some time before the standards have settled into place, but what's the rush? The Web isn't going anywhere.
References
The Official Guide to Programming with CGI.pm:
Stein, Lincoln D., John Wiley & Sons, 1998.
Online CGI.pm documentation:
http://www.genome.wi.mit.edu/ftp/pub/software/WWW/
The Idiot's Guide to Solving Perl CGI Problems:
http://www.perl.com/CPAN-local/doc/FAQs/cgi/idiots-guide.html
The Perl-Apache Integration Project:
http://www.perl.apache.org
mod_perl performance tuning guide:
http://perl.apache.org/tuning
HTML: The Definitive Guide:
Musciano, Chuck, and Bill Kennedy, O'Reilly & Associates, 3rd Edition, 1998.
Chapter 3: A Litany of Libraries
This chapter provides a tour of some of the more useful free graphics libraries available on the Web. Most of these are implemented in C, but several also have scripting interfaces for Perl, Python, or Java. The average user will probably not use these resources directly, but a number of them (the Independent JPEG Group's libjpeg and the libpng libraries, for example) are used extensively by web browsers and by other graphics packages such as ImageMagick and the Gimp, so it is good to know about their existence. It is also interesting to follow up on some of these packages and find out how standards and libraries are developed and supported.
Image Support Libraries
The majority of the packages described in this chapter are covered by the GNU General Public License, and they are designated with GPL as their licensing scheme. A few packages have their own variations on the GPL or other Open Source licenses; these are marked "Open." This does not mean that they necessarily conform to the official Open Source Definition; it just means that they are "open" in spirit, and most likely fit the definition. In any case, this categorization is only meant as an overview. You should really read the individual licenses if you are concerned about reusing code from these libraries (see Section 3.2).
Authors: Jan Hubicka, Thomas A. K. Kjaer, Tim Newsome, and Kamil Toman
URL: http://horac.ta.jcu.cz/aa/aalib/
Platform: any Unix
License: GPL
AA-lib is a low-level graphics library for rendering ASCII art. It was developed by two Czech guys who wanted to be able to view the Linux Penguin logo on their old Hercules monitors that weren't capable of displaying graphics. That's as good an excuse as any, and the library that they have created allows anyone to convert graphics to ASCII art, like the Linux Penguin logo shown in Figure 3.1, for example.
Figure 3.1 was created with the AA plug-in for the Gimp (see Chapter 7), which uses aalib to save images as ASCII art, exporting text, HTML, or ANSI escape codes. It also has an option that will generate the HTML for inclusion of ASCII art in the ALT field of the <IMG> tag. You can get the source code for the AA plug-in via the Gimp registry at http://registry.gimp.org/plugins/AA/
Figure 3.1: The AA-lib library may be used to render ASCII art, such as this image of "Tux," the Linux penguin
Author: David Chatenay
URL: http://sunsite.unc.edu/pub/Linux/libs/graphics
Platforms: Linux, SunOS, Solaris, Irix
License: GPL
The Codes Library contains functions written in C that implement several compression algorithms that you may incorporate into your application code to compress and decompress files. The library supports:
References
The Comprehensive Perl Archive Network (CPAN):
http://www.cpan.org
The GIF Controversy: A Software Developer's Perspective:
http://www.cloanto.com/users/mcb/19950127giflzw.html
The Open Source Definition:
http://www.opensource.org/osd.html
The GNU General Public License:
http://www.gnu.org/copyleft/gpl.html
The BSD License:
http://www.opensource.org/bsd-license.html
Textmode Quake, a version of the popular Quake game that renders its graphics in ASCII using AA-lib:
http://webpages.mr.net/bobz/ttyquake/
Chapter 4: On-the-Fly Graphics with GD
The GD Perl module is a collection of methods and constants for reading, manipulating, and writing color GIF files. Although it is more limited in scope than the ImageMagick package, its size and speed make it well-suited for dynamically generating GIF graphics via CGI scripts. GD has become the de facto graphics manipulation module for Perl; other modules such as GIFgraph (described in Chapter 6) extend the GD toolkit to easily accommodate specific graphics tasks such as creating graphs and charts.
The GD Perl module is actually a port of Thomas Boutell's gd graphics library, which is a collection of C routines created for manipulating GIFs for use in web applications. Early versions of the GD.pm module simply provided an interface to the gd library, but now GD has its own library that is optimized for use with Perl. This module was ported by Lincoln D. Stein, author of the CGI.pm modules.
This chapter starts with an overview and a sample CGI application that implements a web-based "chess server" that interactively manipulates the pieces on a chess board. The remainder of the chapter is a more detailed description of the GD methods and constants, with additional information on more advanced topics such as using GD's polygon manipulation functions.
GD Jumpstart
Scripts that use the GD module to create graphics generally have five parts, which perform the following functions: importing the GD package, creating the image, allocating colors in the image colormap, drawing on or manipulating the image, and writing the image to a file, pipe, or web browser. After you've installed the GD module, just follow these five steps:
  1. First you must import the GD methods into your script's namespace with the use function. The command use GD will give you access to all of the methods and constants of the GD::Image, GD::Font, and GD::Polygon classes:
    GD::Image
    The Image class provides the means for reading, storing, and writing image data. It also implements a number of methods for getting information about and manipulating images.
    GD::Font
    The Font class implements a number of methods that store and provide information about fonts used for rendering text on images. Each of the fonts is effectively hard-coded; they are described as a number of bitmap matrices (similar to XBM files) that must be compiled as part of the source during installation on your system. GD provides a limited number of fonts; the GD::Font class exists to make it easier to expand font support in the future.
    GD::Polygon
    The Polygon class implements a number of methods for managing and manipulating polygons. A polygon object is a simple list of three or more vertices that define a two-dimensional shape.
  2. Create a new image. To make a new image, you can create a new, empty image object of a given width and height, or you can read an image from a file. To create an empty image, use the new method of the Image class, as in:
    # Create a new, empty 50 x 50 pixel image
    $image = new GD::Image(50, 50) || die "Couldn't create image";
    All image creation methods will return undef on failure. If the method succeeds, it will return a data structure containing the decoded GIF data for the image, which you store in a scalar variable. This scalar can only contain one image at a time.
    GD supports three stored file formats: GIF, XBM (black and white X-bitmaps), and GD files. A GD format file is a file that has been written to a file using the
Sample Application: A Chess Board Simulator
This example implements a very crude (but workable) means for two people to play chess on the Web, using the GD module to do the actual graphical machinations involved in drawing the chess board.
The interface to the chess program is a web page that has the image of a board at the top and three forms at the bottom. The first form is for submitting a move, the second is for starting a new game, and the third is for reloading the board between moves to see if an opponent has moved. The moves are entered through <SELECT> input fields, which makes for a slightly clunky interface but simplifies the example for the purposes of this chapter, as we do not have to check for valid input from the user. The interface web page is shown in Figure 4.1.
Figure 4.1: The HTML page resulting from a "new game" command
Note that this is just an example of using GD to manipulate graphics and really shouldn't be used as a chess-playing script. The chess server uses GD's image manipulation methods to create a new board from images of the individual chess pieces, and to move the pieces around on the board. It is the same as if you were sitting across the board from your opponent in a real-life game of chess; nothing is physically preventing you from making an illegal move. Our chess server does not implement an intelligent chess engine. The purpose here is not to demonstrate the inner workings of Deep Blue, but to show how to use GD in a practical application. However, it is still a working chess program; it's just probably more accurate to call it a chess board "simulator."
The submitmove.cgi script creates the HTML page that is the user interface. In turn, it calls the drawboard.cgi script that actually creates the image data and sends it to the web browser. This script uses an external file called currentgame.txt that contains information about who (white or black) is currently moving, the move number, and a list containing the history of moves. Note that in an actual production script, you should explicitly unlink this file before writing to it. Here is the code for
The GD.pm Distribution
The GD module is available on every platform on which Perl is available. Several versions of Perl come with GD as a standard part of the Perl distribution, such as the Gurusamy Sarathy Win32 port and the Macintosh port, which has GD precompiled in the MacPerl application. Installation methods vary from platform to platform, but if you've ever successfully installed a Perl module on your system, you shouldn't have a problem installing GD. Refer to the README that comes with the package for platform-specific information.
To download the latest version of GD, check CPAN first via http://www.perl.com. The latest version of GD.pm should also be available at http://stein.cshl.org/WWW/software/GD/GD.html. Thomas Boutell's original C-language version of the libgd library can be found at http://www.boutell.com/gd/gd.html
The copyright to the GD.pm interface is held by Lincoln D. Stein, and it is distributed under similar terms as Perl itself; it is free for any purpose, provided the statement of copyright remains attached to it. The gd C library on which GD is based is covered by a separate copyright held by the Quest Protein Database Center, Cold Spring Harbor Labs, and Thomas Boutell. See the "copying" file that comes with the standard GD distribution for more specific copyright information.
GD provides all of the objects and methods listed in Table 4.1.
Table 4.1: GD's Object Definitions and Methods, Arranged by Category of Use
Category                                        Method
GD Objects                                      GD::Image, GD::Font, GD::Polygon
GD::Image object creation and saving methods
Font Methods
GD provides some limited support for using different fonts when drawing text on images. Strings may be drawn horizontally or rotated 90 degrees vertically. Because of the way that fonts are implemented, there is currently no support for other angles of text rotation, which would require the use of an external rendering engine such as Ghostscript or the FreeType package. The Unix version of GD also comes with the program bdftogd (written by Jan Pazdziora), which will allow you to convert fonts from the Unix bdf format to a format that can be included with GD on compilation. Other programs are available for converting TrueType and PostScript fonts to bdf format.
The five built-in fonts are imported into your script's namespace as the global variables gdGiantFont, gdLargeFont, gdMediumBoldFont, gdSmallFont, and gdTinyFont (see Figure 4.4). They are all monospaced fonts with 256 characters in their character sets. The dimensions of each can be determined with the GD::Font::width and GD::Font::height object methods, or by consulting Table 4.2.
Table 4.2: The Dimensions of the Five Standard GD Fonts
Font name          Width (pixels)   Height (pixels)
gdTinyFont         5                8
gdSmallFont        6                12
gdMediumBoldFont   7                13
gdLargeFont        8
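Because the built-in fonts are monospaced, the pixel footprint of a horizontally drawn string is simply the character count times the font width. The following is a minimal sketch of that arithmetic in plain Perl. The %font_size table and the text_extent() helper are hypothetical names used for illustration and are not part of GD, and only the three fonts whose dimensions appear in full in Table 4.2 are included. In a real script you would more likely ask the font object for its width and height than hard-code them.

```perl
use strict;
use warnings;

# Dimensions copied from Table 4.2 (width, height in pixels).
# This lookup table is illustrative only; it is not a GD data structure.
my %font_size = (
    gdTinyFont       => [5, 8],
    gdSmallFont      => [6, 12],
    gdMediumBoldFont => [7, 13],
);

# Hypothetical helper: return the (width, height) in pixels that a
# horizontally drawn string would occupy in the named font.
sub text_extent {
    my ($font_name, $string) = @_;
    my $dims = $font_size{$font_name}
        or die "Unknown font: $font_name";
    my ($char_width, $char_height) = @$dims;
    return (length($string) * $char_width, $char_height);
}

# Work out where to start a label so that it is centered
# horizontally in an image 200 pixels wide
my ($w, $h) = text_extent('gdSmallFont', 'Hello');
my $x = int((200 - $w) / 2);
print "label is ${w}x${h} pixels; draw it at x = $x\n";
```

The same calculation can be used to check that a label fits inside an image before it is drawn.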
Polygon Methods
The GD package includes a number of polygon manipulation routines that were not part of the original GD library, just to make life a little easier. All coordinate points are defined with (0, 0) being the upper left corner of the image and (image_width, image_height) being the lower right corner of the image.
The following example shows how you could use the GD::Polygon class to implement a simple graphing program. The script takes any number of data sets (each a list of y values, one per point on the graph) and writes out a multicolored graph with the area underneath each graph filled in with a different color (see Figure 4.5). The script is meant as a simple practical example of using polygons; note how the area and color of each polygon are managed with associative arrays, and how the polygons are sorted so that they can be plotted in a sensible order:
#!/usr/bin/perl -w
# 
# A polygon example that draws simple graphs.
#
use strict;
use GD;
my @polygons;                    # contains a polygon for each data set

my $max_x = 5;                   # The number of items in a data set
my $max_y = 100;                 # The maximum y value 
my $x_space = 40;                # Pixels between x values
my $width = $max_x * $x_space;   # The width of the image in pixels

# An array containing three data sets
#
my @data = ( [ 70, 85, 55, 45, 50 ] ,
         	    [ 25, 33, 22, 5, 40 ],
         	    [ 50, 67, 32, 24, 77 ]);
         
# Now generate a polygon for each data set by calling the 
# createDataSet() function. This function will return a 
# polygon object and the area of the bounding box of the polygon.
#
my %areas;
my ($poly, $area);
foreach my $dataset (@data) {
   ($poly, $area) = createDataSet($dataset);
   push @polygons, $poly;
   $areas{$poly} = $area; 
}

# Now create an image and allocate white as the background color
#
my $image = new GD::Image($width-$x_space, $max_y);
my $white = $image->colorAllocate(255, 255, 255);

# Now sort the polygons in order of decreasing area. We will 
# plot the graphs with the greatest bounding area first, on the 
# assumption that this will minimize the effect of one data set
# being obscured by another, since they will be drawn as filled polygons.
# Note that sorting by the bounding box alone does not guarantee
# that the graphs will be plotted in the right order, but
# it will work for most well-behaved data sets.
#
@polygons = reverse(sort { $areas{$a} <=> $areas{$b}; } @polygons);

# Now draw each polygon. First allocate a random color (note that this
# could be white or the same as another color).
#
my %color;            # keep track of the color index of each polygon
foreach my $poly (@polygons) {
    # int(rand(256)) yields an integer from 0 to 255 for each component
    $color{$poly} = $image->colorAllocate(int(rand(256)), int(rand(256)), int(rand(256)));
    $image->filledPolygon($poly, $color{$poly});
}

# Now go back and stroke each polygon with its color, in case it
# is obscured behind another polygon
#
foreach my $poly (@polygons) {
    $image->polygon($poly, $color{$poly});
}

open O, ">graphout.gif" or die "Couldn't open graphout.gif: $!";     # open a file
binmode O;
print O $image->gif;         # write the image as a GIF
close O;

sub createDataSet {
    # Take a reference to an array as an argument and return a
    # polygon object and the area of the polygon's bounding box.
    #
    my $dataset = shift;
    my $x = 0;
    my $poly = new GD::Polygon;
    foreach my $y (@$dataset) {
       # translate from origin at upper left corner to 
       # origin at lower left corner
       #
        $poly->addPt($x, $max_y - $y);    
        $x += $x_space;
    }

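    # Close the polygon along the bottom edge, directly beneath the
    # last and first data points ($x has already advanced one step
    # past the last point)
    #
    $poly->addPt($x - $x_space, $max_y);
    $poly->addPt(0, $max_y);

    # bounds() returns (left, top, right, bottom)
    my ($left, $top, $right, $bottom) = $poly->bounds;
    return ($poly, ($right - $left) * ($bottom - $top));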
}
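The ordering logic in this script does not depend on GD itself. As a rough, dependency-free sketch, the same bounding-box calculation can be written over plain lists of [x, y] pairs. The helper names data_to_points() and bounding_area() are made up for this illustration, and the shape is assumed to be closed directly beneath its first and last data points.

```perl
use strict;
use warnings;

my $max_y   = 100;   # maximum y value, as in the graphing script
my $x_space = 40;    # pixels between x values

# Hypothetical helper: turn a list of y values into polygon vertices,
# flipping y so that the origin is at the upper left corner.
sub data_to_points {
    my @values = @_;
    my @points;
    my $x = 0;
    foreach my $y (@values) {
        push @points, [$x, $max_y - $y];
        $x += $x_space;
    }
    # Close the shape along the bottom edge of the image
    push @points, [$x - $x_space, $max_y], [0, $max_y];
    return @points;
}

# Hypothetical helper: area of the smallest rectangle enclosing the points.
sub bounding_area {
    my @points = @_;
    my ($left, $top)     = @{ $points[0] };
    my ($right, $bottom) = @{ $points[0] };
    foreach my $p (@points) {
        my ($x, $y) = @$p;
        $left   = $x if $x < $left;
        $right  = $x if $x > $right;
        $top    = $y if $y < $top;
        $bottom = $y if $y > $bottom;
    }
    return ($right - $left) * ($bottom - $top);
}

my @data  = ([70, 85, 55, 45, 50], [25, 33, 22, 5, 40], [50, 67, 32, 24, 77]);
my @areas = map { bounding_area(data_to_points(@$_)) } @data;

# Indices of the data sets, largest bounding box first
my @order = sort { $areas[$b] <=> $areas[$a] } 0 .. $#data;
print "draw order: @order\n";
```

With the three sample data sets, the first set has the largest bounding box and is drawn at the back, and the second set has the smallest and is drawn in front. Sorting in descending order directly avoids the extra reverse() call, though both approaches give the same result when no two areas are equal.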
Chapter 5: Industrial-Strength Graphics Scripting with PerlMagick
PerlMagick is an object-oriented Perl interface to the ImageMagick image manipulation libraries. It is probably more accurate to call it the Image::Magick Perl module. It is a more robust collection of tools than GD. In addition to GIF it can read and write many file formats including PNG, JPEG, PDF, EPS, and Photo CD. In fact, PerlMagick is a graphics acronym goldmine with over 60 different file formats supported. Unlike GD, ImageMagick provides methods for creating animated GIFs dynamically. It can be used effectively in CGI applications, but its real strength is as an offline workhorse for the batch conversion and manipulation of images. This chapter will illustrate both uses and provide a complete overview of the functions available within PerlMagick.
Some PerlMagick applications that are particularly useful when programming for the Web include:
  • Creation of animated GIFs
  • Creation of composite images from a sequence
  • Easy reduction or thumbnailing of images (see also Chapter 10)
  • Setting transparent colors
  • Color reduction and optimization for various browsers
  • Conversion between 60+ formats (e.g., Photo CD to JPEG, PDF to GIF, GIF to PNG)
  • Special effects and addition of text to images
  • Gamma-correction of images for color-sensitive web applications
PerlMagick is used in a number of chapters in this book. See especially Chapter 9 and Chapter 10 for some elaborate working examples using PerlMagick.
Learn PerlMagick in 21 Seconds
All of the PerlMagick methods and attributes are defined by the Image::Magick module. To make these methods and attributes available within the name space of your script, include this module with Perl's use function:
use Image::Magick;
You can then use the new constructor to create an image object that is capable of reading, manipulating, and writing images:
$image = Image::Magick->new;   # $image is a new image object
Once you are finished with an image object, you should destroy it to conserve memory resources. Do this with undef:
undef $image;                  # destroy an image object
You can also "empty" all of the images from an object and keep the object around for further use with:
undef @$image;                 # delete all images from an object
undef $image->[2];             # delete only the third image from a sequence
Here is a quick example script that exhibits most of the functionality of the Image::Magick module:
#!/usr/bin/perl -w

use strict;
use Image::Magick;

# $image will be our image object and $status will be the return value
# that we can check for a successful operation
#
my($image, $status);

# Instantiate a new image object
#
$image = Image::Magick->new;

# Read in every image in the image subdirectory whose name starts
# with 'dog' and ends with a .gif extension...
# These will be the frames in an animated sequence, presumably of a dog...
#
$status = $image->Read('images/dog*.gif');
warn "$status" if "$status";

$image->Transparent('#FFFFFF');          # Set white to transparent
$image->Zoom('50%');                     # Scale the whole sequence

# Use the 216 web-safe color cube
#
my $cube = Image::Magick->new;           # declared with my to satisfy use strict
$status = $cube->Read('NETSCAPE:');
warn "$status" if "$status";
$image->Map($cube);

# Write the whole sequence out as an animated gif file
#
$status = $image->Set(loop=>0,           # loop forever
                      dispose=>2);       # revert to background 'twixt frames
$status = $image->Write("gif:webdog.gif");  
warn "$status" if "$status";

# Neatly dispose of our object
#
undef $image;
The ImageMagick Distribution and PerlMagick
ImageMagick was written by John Cristy and has been constantly developed and revised for several years. Newly modified versions are sometimes available every few weeks on the ftp site (see Section 5.2.2 later in this chapter). As of this writing the current version of ImageMagick is 4.1.x, with minor version numbers incremented once every few months. PerlMagick is currently at version 1.4x. The copyright to ImageMagick and PerlMagick is held by E.I. du Pont de Nemours and Company, who graciously allow the packages to be distributed (and redistributed) for free. As of ImageMagick 4.0, PerlMagick comes with the standard distribution.
Table 5.1 through Table 5.3 show the various file formats supported by ImageMagick. Most of these formats are not directly viewable on the Web, though external viewers that may be launched from a web browser are available for many of the formats. ImageMagick can be used to convert these other files to a format viewable on a browser, or to create thumbnails that link to a downloadable version of the full image. See Chapter 10 for scripts to convert and maintain groups of images in multiple formats.
Table 5.1: Supported File Formats for Web Publishing
Extension            Description                                                              Multi-Image Files?
GIF, GIF87, GIF89a   8-bit color Graphics Interchange Format
JPEG                 Joint Photographic Experts Group's compressed 24-bit color JFIF format
MPEG, M2V
Image::Magick Attributes and Methods by Category
Image::Magick acts as a transparent interface for manipulating a wide range of image file formats. Most of these formats have certain attributes in common, though they may each implement the