The X Window System is complex, but it is based on a few premises that can be quickly understood. This section describes these major concepts.
The first and most obvious thing to note about X is that it is a windowing system for bitmapped graphics displays.[1] It supports color as well as monochrome and gray-scale displays.
A slightly unusual feature is that a display is defined as a workstation consisting of a keyboard, a pointing device such as a mouse, and one or more screens.[2] Multiple screens can work together, with mouse movement allowed to cross physical screen boundaries. As long as multiple screens are controlled by a single user with a single keyboard and pointing device, they comprise only a single display, as shown in Figure 1-1.
The next thing to note is that X is a network-oriented windowing system. An application need not be running on the same system that actually supports the display. While many applications can execute locally on a workstation, other applications can execute on other machines, sending requests across the network to a particular display and receiving keyboard and pointer events from the system controlling the display.
At this point, only TCP/IP and DECnet networks are supported by the X Consortium and most vendors, though that may change before long.
The program that controls each display is known as a server. At first, this usage of the term server may seem a little odd—when you sit at a workstation, you tend to think of a server as something across the network (such as a file or print server) rather than the local program that controls your own display. The thing to remember is that your display is accessible to other systems across the network, and for those systems, the code executing in your system does act as a true display server.
The server acts as an intermediary between user programs (called clients or applications) running on either the local or remote systems and the resources of the local system. The server (without extensions) performs the following tasks:
Allows access to the display by multiple clients.
Interprets network messages from clients.
Passes user input to the clients by sending network messages.
Does two-dimensional drawing—graphics are performed by the display server rather than by the client.
Maintains complex data structures, including windows, cursors, fonts, and “graphics contexts,” as resources that can be shared between clients and referred to simply by resource IDs. Server-maintained resources reduce the amount of data that has to be maintained by each client and the amount of data that has to be transferred over the network.
Since the X Window System makes the network transparent to clients, these programs may connect to any display in the network if the host they are running on has permission from the server that controls that display. In a network environment, it is common for a user to have programs running on several different hosts in the network, all invoked from and displaying their windows on a single screen, as shown in Figure 1-2.
In practice, each user is sitting at a server and can start applications locally to display on the local server or can start applications on remote hosts for display on the local server, if the remote hosts have permission to connect to the local server. All other users in the network are in a similar situation—they can run applications on their own system or on yours, but they will, for the most part, be displaying on their own server. This use of the network is known as distributed processing. Distributed processing helps solve the problem of unbalanced system loads. When one host machine is overloaded, the users of that machine can arrange for some of their programs to run on other hosts.
One extreme of this arrangement is the PC server or X terminal. Because these single-task systems can run only the X server (and sometimes a window manager), a user sitting at one of these servers must run all clients on systems across the network, with their results displayed on the PC or X terminal screen. This makes the single-tasking PC or X terminal look and work just like X on a multitasking workstation.
Another important concept in X programming is that applications do not actually control such things as where a window appears or what size it is. Given multiprocessor, multiclient access to the same workstation display, clients must not be dependent on a particular window configuration. Instead, a client gives hints about how long and where it would like to be displayed. The screen layout or appearance and the style of user interaction with the system are left up to a separate program, called the window manager.
The window manager is just another program written with Xlib, except that it is given special authority to control the layout of windows on the screen. The window manager typically allows the user to move or resize windows, start new applications, and control the stacking of windows on the screen, but only according to the window manager’s window layout policy. A window layout policy is a set of rules that specify allowable sizes and positions of windows and icons.
Unlike citizens, the window manager has rights but not responsibilities. Programs must be prepared to cooperate with any type of window manager or with none at all (there are fairly simple ways to prepare programs for these contingencies). The simple window manager twm does not enforce any window layout policy, but clients should still assume that there could be one. For example, the window manager must be informed of the desired size of a new window before the window is displayed on the screen. If the window manager does not accept the desired window size and position, the program must be prepared to accept a different size or position or be able to display a message such as “Too small!”
If you are having trouble visualizing this situation, imagine a window manager where no windows are allowed to overlap. This is known as a tiled window manager. The Siemens RTL tiled window manager lets only transient windows (such as pop-up menus) overlap. The twm window manager, on the other hand, is referred to as real-estate-driven because keyboard input is automatically assigned to whatever window the pointer currently happens to be in.
There is at least one other window manager variety that you will encounter, called a listener or click-to-type. Its distinguishing feature is that it assigns all keyboard input to a single window when that window is selected by clicking on it with the pointer. A listener may or may not allow windows to overlap. Apple Macintosh™ users will recognize this type of interface.
X is somewhat unusual in that it does not mandate a particular type of window manager. Its developers have tried to make X itself as free of window management or user interface policy as possible. And, while the X11 distribution includes twm as a sample window manager, individual manufacturers are expected to write their own window managers and user interface guidelines. In fact, two commercial window managers with user interface guidelines are already becoming established. They are olwm, the OPEN LOOK™ window manager from AT&T and Sun, and mwm, the Motif™ window manager from Open Software Foundation. The OSF Motif window manager mwm, and OPEN LOOK window manager olwm both can be configured to be real-estate-driven or click-to-type.
In the long run, the developers of X may well have made the right choice, in that the lack of clear user interface guidelines will allow a period of experimentation in which the marketplace could come up with better designs than are presently available. Some industry observers, however, decry this move, pointing out that it undercuts X’s appeal as a standard user platform—X programs may be portable across systems from multiple vendors, but if users have to deal with a different user interface on each system, half the benefit of that portability will be lost. Until a clear user interface standard emerges from the marketplace, developers must be careful to write their programs in such a way that they can run under different window managers and user interface conventions.
As in any mouse-driven window system, an X client must be prepared to respond to any of many different events. Events include user input (keypress, mouse click, or mouse movement) as well as interaction with other programs. (For example, if an obscured portion of a window is exposed when another overlapping window is moved, closed, or resized, the client must redraw it.) Events of many different types can occur at any time and in any order. They are placed on a queue in the order they occur and usually are processed by clients in that order. Event-driven programming makes it natural to let the user tell the program what to do instead of vice versa.
The need to handle events is a major difference between programming under a window system and traditional UNIX or PC programming. X programs do not use the standard C functions for getting characters, and they do not poll for input. Instead there are functions for receiving events, and then the program must branch according to the type of event and perform the appropriate response. But unlike traditional programs, an X program must be ready for any kind of event at any time. In traditional programs the program is in control, asking for certain types of input at certain times. In X programs, the user is in control most of the time.
The final thing to know about X is that it is extensible. The code includes a defined mechanism for incorporating extensions, so that vendors are not forced to hack up the existing system in incompatible ways when adding features. These extensions are used just like the core Xlib routines and perform at the same level. Some extensions are standards of the X Consortium, such as the Shape extension, which supports non-rectangular windows, and the X Input extension, which supports input devices other than keyboard and mouse. There is also a standard 3-D graphics extension called PEX, with two APIs called PHIGS and PEXlib. Other extensions are under development.
Extensions have both client-side and server-side code. A server vendor is not required to provide support for all the standard extensions. Therefore, before using an extension, you must query the server to see if the extension is supported. At this writing, only the Shape extension is widely supported.
[1] In bitmapped graphics, each dot on the screen (called a pixel, or picture element) corresponds to one or more bits in memory. Programs modify the display simply by writing to display memory. Bitmapped graphics are also referred to as raster graphics, since most bitmapped displays use television-type scan line technology: the entire screen is continually refreshed by an electron beam scanning across the face of the display tube one scan line, or raster, at a time. The term bitmapped graphics (or memory-mapped graphics) is more general, since it also applies to other dot-oriented displays, such as LCD screens. We assume that you are familiar with the basic principles of bitmapped graphics.
[2] As of Release 5, there is a standardized extension called the X Input Extension that supports multiple input devices other than keyboards or mice.
Get XLIB Programming Manual, Rel. 5, Third Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.