Portable Network Graphics (PNG)

In the “GNU’s Not Unix” tradition of recursive acronyms, PNG may unofficially be taken to stand for “PNG’s Not GIF.” PNG was designed as an open standard alternative to GIF, and it plays that role very well. PNG will not completely replace GIF, however, if only because PNG can only store one image per file[7] and there are a million web pages out there that are full of GIF images.

A PNG file is assembled as a series of chunks which, for all intents and purposes, are the equivalent of GIF’s blocks; PNG just has a friendlier name for the structure. The 1.0 PNG specification defines a number of standard chunks, of which four are considered “critical chunks.” At least three of the critical chunks (the header, image data, and image end chunks) must be present in every valid PNG file. The non-critical standard chunks are called “ancillary chunks.” The critical and ancillary chunks, along with a short description of each, are listed in Table 1-2 and Table 1-3. Critical chunk codes begin with a capital letter; ancillary chunk codes begin with a lowercase letter.

Table 1-2. Critical Chunks

Name | Description | Code
Header chunk | Global information about the image | IHDR
Palette chunk | A palette (optional) | PLTE
Image Data chunk | The compressed image data | IDAT
Image End chunk | The end-of-file marker | IEND

Table 1-3. Ancillary Chunks

Name | Description | Code
Background Color chunk | Defines a background color index in the palette, or a background shade for a grayscale or RGB image. | bKGD
Primary Chromaticities chunk | Stores information that accounts for color differences on different output devices to allow for color correction. | cHRM
Gamma chunk | Stores information about the gamma value of the image relative to its creation environment. | gAMA
Image Histogram chunk | Stores data on the frequency of occurrence of each color in the palette. | hIST
Physical Pixel chunk | Indicates the resolution at which the image should be displayed. | pHYs
Significant Bits chunk | Stores the bit depth of the original source image. | sBIT
Text Data chunk | Stores text in the Latin-1 character set. | tEXt
Time Modified chunk | Stores the time that the image was last changed. | tIME
Transparency chunk | For indexed color images, this chunk stores up to 256 alpha values, one per palette entry. For RGB/grayscale images, it can describe a shade or color to be made transparent. | tRNS
Compressed Text chunk | Stores compressed text. | zTXt

For the purpose of getting a grasp of the PNG format, we will only describe the format of the four critical chunks in this section. See Appendix A for an example of a simple PNG decoder written in Perl.

At the very beginning of each valid PNG file is an 8-byte signature that identifies it as PNG-formatted. This signature is not considered a part of any particular chunk. The first byte is always the hexadecimal value 0x89, and the remaining 7 bytes are:

PNG\r\n^Z\n

This signature communicates more information than the signatures of other file formats. Each byte provides an added layer of information about the format and integrity of the data that follows.

Byte 0: 0x89

The first byte of the file indicates that the file is in binary form (a text file would only contain ASCII characters in the range 0x00 to 0x7F). It also allows the decoder to detect data corruption if the file has been transferred in a 7-bit text mode, in which case the eighth bit is stripped from each byte and the first byte becomes 0x09 instead of 0x89.

Bytes 1-3: PNG

A human-readable (as opposed to machine-readable) ASCII display of the file format.

Bytes 4-5: \r\n

Transfers between different operating systems can sometimes cause problems with newlines and carriage returns. Converting a file from Unix to Win32 can add a \r to a lone \n; conversion from a Win32 system to Unix may strip the \r; and conversion to MacOS may convert \n to \r. Because the signature contains a \r\n pair, a decoder can tell when a transfer has altered it.

Byte 6: ^Z

If the file is displayed on the Win32 command line with the TYPE command, the ^Z code will halt the listing of the file.

Byte 7: \n

The final lone \n catches the reverse problem: a transfer mode that expands every \n into \r\n will alter this byte even if the earlier \r\n pair survives intact.
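The byte-by-byte reasoning above can be sketched as a small diagnostic routine. This is only an illustration in Python; the function name check_signature and the wording of its diagnostic strings are assumptions made for this example, not anything defined by the PNG specification.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # 0x89, "PNG", CR LF, ^Z, LF


def check_signature(data: bytes) -> str:
    """Classify the first eight bytes of a file (labels are illustrative)."""
    head = data[:8]
    if head == PNG_SIGNATURE:
        return "ok"
    if head[:1] == b"\x09":
        # 0x89 with its eighth bit stripped becomes 0x09.
        return "high bit stripped (7-bit transfer)"
    if head[:4] != b"\x89PNG":
        return "not a PNG file"
    if head[4:6] != b"\r\n":
        return "CR-LF pair altered (newline conversion)"
    if head[6:7] != b"\x1a":
        return "control-Z missing"
    return "final LF altered (newline conversion)"
```

A file that passed through a Win32-to-Unix text conversion, for instance, would lose its \r and be reported as having an altered CR-LF pair.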

Unlike GIF’s blocks, every PNG chunk is laid out in a standard form. A consistent chunk format makes it easy to parse PNG files and allows room for future expansion of the file format. Each chunk in a PNG data stream starts with an 8-byte chunk header and ends with a 4-byte trailer. The chunk header consists of two 32-bit fields: the first is the length (in bytes) of the data in the chunk (not including the header or the trailer), and the second is a 4-byte code that identifies the type of the chunk. The codes for the standard chunks are all readable ASCII characters; they are listed in Table 1-2 and Table 1-3.

The header is followed by the chunk data fields, which vary for each type of chunk. The data fields for each critical chunk are described later in their respective sections.

Each chunk ends with a 4-byte trailer called the CRC field, named for the Cyclic Redundancy Check method of error checking. This field contains a CRC-32 value that is computed over the chunk’s type code and data (but not its length field) when the file is created. A decoder can recompute the value and compare it to the stored one to determine whether or not the chunk has been corrupted. A sample CRC algorithm is appended to the PNG specification (referenced at the end of this chapter).
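The header, data, and trailer layout can be walked with a short loop. The following is a minimal sketch in Python rather than a complete decoder. It relies on two details of the PNG specification: multi-byte integers are stored most significant byte first, and the CRC covers the type code and the data but not the length. The helper names make_chunk and iter_chunks are hypothetical, and zlib.crc32 from the Python standard library stands in for the sample CRC routine mentioned above.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(code: bytes, data: bytes = b"") -> bytes:
    """Build one chunk: length, type code, data, CRC-32 trailer."""
    crc = zlib.crc32(code + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + code + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (code, data, crc_ok) for each chunk that follows the signature."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    pos = 8
    while pos + 12 <= len(png):
        # 8-byte header: 32-bit big-endian length, then the 4-byte type code.
        length, code = struct.unpack(">I4s", png[pos:pos + 8])
        end = pos + 8 + length
        if end + 4 > len(png):
            raise ValueError("truncated chunk")
        data = png[pos + 8:end]
        (stored_crc,) = struct.unpack(">I", png[end:end + 4])
        crc_ok = (zlib.crc32(code + data) & 0xFFFFFFFF) == stored_crc
        yield code, data, crc_ok
        pos = end + 4
        if code == b"IEND":
            break
```

Calling list(iter_chunks(png_bytes)) on the contents of a file gives the chunk codes in file order, each with a flag saying whether its stored CRC matched the recomputed one.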

The header chunk

The header chunk (IHDR) describes the overall attributes of the PNG image. It is 13 bytes long (not including the 8-byte chunk header and 4-byte CRC trailer) and consists of seven fields:

Width, Height (4 bytes each)

The dimensions of the image. Four bytes are used to represent each dimension, but only 31 bits are used because some languages have trouble with unsigned 4-byte values. Thus the maximum size of a PNG image is approximately 2 billion × 2 billion pixels, which at a standard screen resolution of 72 pixels per inch could store an image approximately 470 miles × 470 miles. This would allow you to save a 1:1 life size image of most of New England in a single PNG file! By comparison, a GIF file, whose dimensions are stored in 16 bits each, has a maximum size of about 75 feet × 75 feet at the same resolution, which would let you store a 1:1 image of a large house. Of course, your ISP would hate you if you put either of these images on your web page.

Bit Depth (1 byte)

This field contains the number of bits used for each index in the palette chunk or for each sample in a grayscale, RGB, gray+alpha, or RGBA image. Each index in an indexed color image can be represented with a bit depth of 1, 2, 4, or 8. A grayscale image can have a maximum bit depth of 16.

Color Type (1 byte)

This field is a code that indicates the method that the image uses to represent colors. Valid values are:

0: Each pixel is a grayscale value
2: Each pixel is an RGB triplet
3: Each pixel is an index to a color table
4: Each pixel is a grayscale value followed by an alpha mask
6: Each pixel is an RGB triplet followed by an alpha mask

Compression Type (1 byte)

This field contains a code that indicates the type of compression used to encode the image data. As of Version 1.0, PNG only supports the Deflate method, so this field should be 0.

Filtering Type (1 byte)

The filter byte contains a code indicating the type of filtering that was applied to the data before it was compressed. At this time the only type of filtering supported is an adaptive filtering method described in the PNG spec, so this field should be 0.

Interlace Scheme (1 byte)

This field contains a code that indicates the type of interlacing scheme in which the data is stored. Currently the defined values are 0 (none) and 1 (two-dimensional Adam7 interlacing, described earlier in the chapter).
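The seven fields described above map directly onto a 13-byte unpacking operation. The sketch below, in Python, is one way to read them; the names Header, parse_ihdr, and bits_per_pixel are hypothetical, and the validation covers only the rules mentioned in this section rather than everything the specification requires.

```python
import struct
from typing import NamedTuple

# Samples per pixel for each Color Type code listed above.
SAMPLES_PER_PIXEL = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


class Header(NamedTuple):
    width: int
    height: int
    bit_depth: int
    color_type: int
    compression: int
    filtering: int
    interlace: int


def parse_ihdr(data: bytes) -> Header:
    """Unpack the 13 data bytes of an IHDR chunk and apply basic checks."""
    if len(data) != 13:
        raise ValueError("IHDR data must be 13 bytes")
    # Two 4-byte big-endian integers followed by five single bytes.
    hdr = Header(*struct.unpack(">IIBBBBB", data))
    if not (0 < hdr.width < 2**31 and 0 < hdr.height < 2**31):
        raise ValueError("dimension outside the 31-bit range")
    if hdr.color_type not in SAMPLES_PER_PIXEL:
        raise ValueError("unknown color type")
    if hdr.color_type == 3 and hdr.bit_depth not in (1, 2, 4, 8):
        raise ValueError("indexed color needs a bit depth of 1, 2, 4, or 8")
    if hdr.compression != 0 or hdr.filtering != 0 or hdr.interlace not in (0, 1):
        raise ValueError("unsupported compression, filtering, or interlace code")
    return hdr


def bits_per_pixel(hdr: Header) -> int:
    """Bit depth multiplied by the number of samples in each pixel."""
    return SAMPLES_PER_PIXEL[hdr.color_type] * hdr.bit_depth
```

For an 8-bit RGB image with an alpha channel (color type 6), bits_per_pixel returns 32.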

The palette chunk

The palette chunk (PLTE) contains a suggested color table to be used when rendering image data. It is required for PNG files that are saved as indexed color images (i.e., if the Color Type field of the header chunk is set to 3). Truecolor images do not need a palette chunk, although they may include one for use on those systems that are incapable of displaying more than 256 colors.

The data section of the palette chunk consists of a list of red, green, and blue values (one byte per color) for each entry in the table. The table may contain between 1 and 256 entries, similar to the GIF color table. Note that while GIF can only use palettes whose size is a power of 2, PNG will store palettes in an optimal amount of space (that is, a 129 color palette would take up 768 [256x3] bytes in a GIF, whereas it would only take up 387 [129x3] bytes in a PNG).
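The storage arithmetic above is easy to check, and the palette data itself is simple to unpack. The Python sketch below uses hypothetical helper names (parse_plte, gif_table_bytes, png_palette_bytes) and assumes that the smallest GIF color table holds two entries.

```python
def parse_plte(data: bytes) -> list:
    """Split PLTE chunk data into a list of (red, green, blue) tuples."""
    if len(data) % 3 != 0 or not 3 <= len(data) <= 768:
        raise ValueError("palette must hold 1 to 256 three-byte entries")
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]


def gif_table_bytes(colors: int) -> int:
    """Bytes used by a GIF color table: entry count rounded up to a power of 2."""
    size = 2
    while size < colors:
        size *= 2
    return size * 3


def png_palette_bytes(colors: int) -> int:
    """Bytes used by PLTE chunk data: exactly three per entry."""
    return colors * 3
```

With 129 colors, gif_table_bytes gives 768 and png_palette_bytes gives 387, matching the figures in the text.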

The Image Data chunk

The Image Data chunk (IDAT) holds the compressed data for each of the pixels in the image. Multiple Image Data chunks may be stored within the same PNG file. Because each PNG file describes only one image, multiple IDAT chunks are combined into a single image when decoded and displayed. The IDAT chunk contains the compressed data, in addition to the 8-byte chunk header and 4-byte CRC trailer.

Before the image can be displayed, the data must be decompressed and decoded. The information in the header chunk is used to determine the number of bytes used to represent each pixel and the pixel ordering, which will vary depending on whether or not the data is interlaced.
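The decompression step might look like the following Python sketch. It relies on two details from the PNG specification that are not covered above: the data fields of all IDAT chunks, joined in file order, form a single zlib (Deflate) stream, and each decompressed scanline is preceded by one filter-type byte. The function names are hypothetical, the size calculation applies only to non-interlaced images, and reversing the filters is left out.

```python
import zlib


def inflate_idat(idat_parts) -> bytes:
    """Join the data of every IDAT chunk and decompress it as one stream."""
    return zlib.decompress(b"".join(idat_parts))


def scanline_bytes(width: int, bits_per_pixel: int) -> int:
    """Bytes of pixel data in one row, rounded up to a whole byte."""
    return (width * bits_per_pixel + 7) // 8


def expected_size(width: int, height: int, bits_per_pixel: int) -> int:
    """Decompressed size of a non-interlaced image, filter bytes included."""
    return height * (1 + scanline_bytes(width, bits_per_pixel))
```

Comparing len(inflate_idat(parts)) with expected_size(...) is a cheap sanity check before any attempt to undo the filtering.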

Ancillary chunks

There are a number of ancillary chunks defined by the PNG 1.0 specification that allow you to encode a great deal of information in the image file. For example, the Gamma chunk (gAMA) contains a value representing the gamma characteristics of the device on which the image was created. This value could be used by a rendering client to adjust the gamma level to suit the display of the client, if it is being displayed on a platform with different gamma characteristics.
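As a rough illustration of how a rendering client might use this value, the Python sketch below decodes the gAMA data and derives a correction exponent. In the PNG specification the chunk holds the gamma value multiplied by 100000 as a 4-byte integer. The exponent formula follows the decoder guidance in the specification for a display with a simple power-law response; the default display gamma of 2.2 and all of the function names are assumptions made for this example.

```python
import struct


def parse_gama(data: bytes) -> float:
    """Decode gAMA chunk data: a 4-byte integer holding gamma x 100000."""
    if len(data) != 4:
        raise ValueError("gAMA data must be 4 bytes")
    (scaled,) = struct.unpack(">I", data)
    return scaled / 100000


def decoding_exponent(file_gamma: float, display_gamma: float = 2.2) -> float:
    """Exponent to apply to samples; display_gamma=2.2 is an assumed default."""
    return 1.0 / (file_gamma * display_gamma)


def correct_sample(sample: int, exponent: float, max_value: int = 255) -> int:
    """Apply the exponent to one sample scaled into the range 0..1."""
    return round(max_value * (sample / max_value) ** exponent)
```

A file gamma of 0.45455 viewed on a display with a gamma of 2.2 yields an exponent very close to 1, so the samples pass through essentially unchanged.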

Most of the ancillary chunks have not been widely implemented by web browsers, although certain features have been added. Web browsers still have to make a lot of memory-versus-speed compromises, so support for such features as 16-bit alpha masks is still a goal for the future.

The PNG specification also suggests that ancillary chunks with an unrecognized type code should not cause an error for the decoder. This allows the definition of custom chunk types that can be tailored to specific applications, similar to GIF’s Application Extension Block.
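A decoder can apply this rule using nothing more than the case of the first letter of the chunk code, as noted at the start of this section. The Python sketch below shows one possible policy; the function names and the returned labels are illustrative, and the set of known codes simply repeats Table 1-2 and Table 1-3.

```python
KNOWN_CODES = {
    b"IHDR", b"PLTE", b"IDAT", b"IEND",          # critical (Table 1-2)
    b"bKGD", b"cHRM", b"gAMA", b"hIST", b"pHYs",  # ancillary (Table 1-3)
    b"sBIT", b"tEXt", b"tIME", b"tRNS", b"zTXt",
}


def is_ancillary(code: bytes) -> bool:
    """Ancillary chunk codes begin with a lowercase letter."""
    return code[:1].islower()


def chunk_action(code: bytes) -> str:
    """Decide what to do with a chunk: process it, skip it, or give up."""
    if code in KNOWN_CODES:
        return "process"
    if is_ancillary(code):
        return "skip"  # unrecognized but safe to ignore
    raise ValueError("unrecognized critical chunk: %r" % code)
```

An application-specific chunk with a made-up code such as b"abCd" would be skipped, while an unrecognized code such as b"ABCD" would stop the decoder.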



[7] A PNG variant capable of storing multiple images is in the works. More information about the Multiple-image Network Graphics (MNG) format may be found on the MNG home page at http://www.cdrom.com/pub/mng/.
