Programming Firefox

Chapter 1. Firefox and Friends

The Firefox browser is a collection of C++ libraries designed to be assembled into any number of applications that you can run on machines with any of the major desktop operating systems (Windows, OS X, Linux, etc.).

A browser’s functionality combines what the user sees—through web content—and underlying technologies used to access information and to decode, render, and stylize content. Although much of this book focuses on the XUL interface language to build application interfaces, it also touches on the evolving Internet standards that extend the breadth and depth of information available through the Web.

Mozilla to Firefox and Thunderbird

Most people say the World Wide Web was “born” in the spring of 1993, when Jon Mittelhauser and Marc Andreessen, working out of the University of Illinois, developed what would become the first widely accepted graphical interface to the Internet.

The software was known as Mosaic, and its widespread acceptance provided the first indication that the Internet was something that could interest (and provide value to) business users and the public.

Marc Andreessen went on to start Netscape Communications Corporation, a company that focused on the commercialization of the Netscape Navigator browser. In 1998, Netscape turned development of the browser over to the open source community in the form of the Mozilla Organization. The Mozilla community rewrote the Netscape code base and released the first commercial product in the form of Netscape 6.

The browser was, unfortunately for Netscape, technically and commercially disappointing. Netscape continued to support Mozilla-based browsers through 2003, when America Online (which owned Netscape) shut down operations, leaving the Mozilla organization on its own to continue development and commercialization of the browser code.

The Mozilla browser was actually a suite of applications that incorporated both a browser and an email and newsreader client. To reduce the perceived “bloat” of the suite, Mozilla decided to break the browser portion out of the suite.

The initial browser was referred to as Phoenix, was renamed Firebird, and finally was released as Firefox version 1.0 in November 2004.

Today the Mozilla Foundation operates as a nonprofit organization to manage the open source development aspects of the program. The foundation owns the for-profit Mozilla Corporation, which focuses on browser support for end users and commercialization programs.

The Mozilla code base now supports the Firefox browser, the Thunderbird email client (Figure 1-1), and the Camino browser for OS X. The complete application suite (formerly the Mozilla suite) is now branded as the SeaMonkey Internet application suite. All the browser engines implement the same rendering logic (the code that paints web content on the screen), known as the Gecko rendering engine. The Mozilla suite offers tools to allow developers to embed the Gecko engine alone in customized applications.

Figure 1-1. Firefox browser and Thunderbird email client

At its inception, much of the “buzz” around the original Mozilla browser concerned the ability to extend the functionality of the Cross-Platform Component Model (XPCOM) libraries on which it is built. Using XPCOM services and interfaces, a C++ (or JavaScript) programmer could build new components and add new logic to the underlying Mozilla engine.

Although many developers still build on and extend the XPCOM library, the lion’s share of developers’ work focuses on extending the interface and functionality using “higher-level” services, such as the XML Binding Language (XBL). Regardless of the specific underlying technologies, the interfaces of all Mozilla applications are represented as XML User Interface Language (XUL) files.

XML Technologies

As I just mentioned, XUL stands for XML User Interface Language. In fact, many of the key technologies discussed here are based on XML, or the Extensible Markup Language. As the XML form so dominates both the interface design and the structure of displayed documents, it makes sense to consider what XML is, why it is so important, and what impact it has on electronic document structure.

XML History

XML has its roots in the Standard Generalized Markup Language (SGML). SGML grew out of a 1960s IBM project to develop documents whose content could be machine-readable. SGML was a metalanguage, a self-describing form that allowed a document’s contents to describe how they were encoded, facilitating machine-driven typographic processes and, eventually, decoding and cataloging.

But SGML was very complex, and with the advent of the “GUI-friendly” Web, work was initiated to carry over some of SGML’s advantages of portability to Internet-rendered documents.

In 1996, work began under the auspices of the World Wide Web Consortium (W3C), and XML version 1.0 became a consortium recommendation in February 1998.

XML’s power lies in a simple tree structure of text fields, and the capability to define document types that enable decoders to interpret text fields in different ways. The tree structure means that any software accessing a well-formed XML file “knows” how to traverse the contents, which is itself a feature of some utility.
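As a rough illustration of that traversal, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The sample document and its element names are invented for this example; the point is only that generic code can walk any well-formed XML tree without knowing its vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are invented for illustration.
SAMPLE = """<library>
  <book id="b1">
    <title>Programming Firefox</title>
    <chapter>Firefox and Friends</chapter>
  </book>
</library>"""


def walk(element, depth=0):
    """Return (depth, tag, text) tuples for an element and its descendants, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:
        rows.extend(walk(child, depth + 1))
    return rows


root = ET.fromstring(SAMPLE)
rows = walk(root)

# Print an indented outline of the tree.
for depth, tag, text in rows:
    print("  " * depth + tag + (": " + text if text else ""))
```

The walk function never mentions "book" or "title"; it relies only on the parent/child structure that every well-formed XML file provides.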

But more exciting is the capability of an XML document to include a document type reference that adds a context to the tree elements, giving meaning to the document’s content. For example, an XML document type can define a row as a horizontal alignment of text, but a different document type can define a row as a portion of a mathematical formula. That context can be used to direct the document renderer to display graphics tables or math formulas.

XUL (“zool”) files are themselves XML documents. A document namespace field instructs the browser logic that the XUL content is to be interpreted and painted according to a XUL document type. The Firefox framework is “smart” enough so that if other portions of the document need to be drawn as HTML elements, a namespace prefix can be attached to those elements to “switch” the rendering into HTML mode. Designers no longer need to build one monolithic GUI file structure—different display types can now be constructed and mixed together to extend the widget vocabulary.
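The namespace switch can be seen from the outside with a short Python sketch. It is not Firefox code: it only parses a hypothetical XUL fragment with the standard library and reports which namespace each element belongs to, which is the information a renderer would use to choose between XUL and HTML drawing rules. The window contents are invented for illustration.

```python
import xml.etree.ElementTree as ET

XUL_NS = "http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"
HTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XUL fragment: XUL is the default namespace, and the
# "html" prefix marks elements that belong to the XHTML namespace.
SAMPLE = f"""<window xmlns="{XUL_NS}" xmlns:html="{HTML_NS}" title="Demo">
  <vbox>
    <button label="OK"/>
    <html:p>Rendered as HTML</html:p>
  </vbox>
</window>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


root = ET.fromstring(SAMPLE)
kinds = {}
for element in root.iter():
    namespace, local = split_tag(element.tag)
    if namespace == XUL_NS:
        kinds[local] = "XUL"
    elif namespace == HTML_NS:
        kinds[local] = "HTML"
    else:
        kinds[local] = "other"

print(kinds)
```

Unprefixed elements inherit the default (XUL) namespace, so only the element carrying the html prefix is classified as HTML.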

XSLT and XPath

The design of XML as a well-defined structure of trees certainly makes it easier to develop software that programmatically parses XML content for some purpose.

Because the word document generally implies some form of static data, it becomes practical to develop declarative processes that can just as easily decode XML files. Two companion standards have now stepped into the fray.

The Extensible Stylesheet Language (XSL) was designed to provide a broader range of tools to modify the style of XML documents. Where Cascading Style Sheets (CSS) were designed to alter the appearance of documents, XSL allowed structural changes in the document (e.g., changing the order of content, removing content, etc.). The use of XSL to transform document content became known as XSL Transformations, or XSLT.

XSL needed a robust tool to reference the treelike content of the XML files it was to transform. XPath is (yet another) standard that provides a straightforward method to reference the nodes of the XML input document. XSL uses XPath to find elements in an XML file that are output to a browser (or to an external file).
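To give a feel for XPath-style node references, the following Python sketch uses the limited XPath subset built into xml.etree.ElementTree. It is an assumption-laden illustration: the catalog document is invented, the standard library supports only part of XPath, and it does not perform XSLT transformations at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document for illustration.
SAMPLE = """<catalog>
  <book lang="en"><title>First</title><price>10</price></book>
  <book lang="fr"><title>Second</title><price>20</price></book>
  <book lang="en"><title>Third</title><price>30</price></book>
</catalog>"""

root = ET.fromstring(SAMPLE)

# Every title that is a child of a book directly under the root.
all_titles = [t.text for t in root.findall("./book/title")]

# Titles of books whose lang attribute equals "en" (attribute predicate).
english_titles = [t.text for t in root.findall(".//book[@lang='en']/title")]

# Price of the second book (positions are 1-based in XPath).
second_price = root.findtext("./book[2]/price")

print(all_titles, english_titles, second_price)
```

Each path expression names a location in the tree rather than describing how to loop over it, which is the declarative quality that XSL relies on.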

Today you can apply all three of these standards in the form of declarative XSLT files or through plug-ins to languages such as Java and PHP.

Here’s what this means for the Firefox environment:

As an XML file, the XUL file is subject to use of CSS to modify the appearance of its widgets; one interface file can be given a completely different look (color, graphical look) with stylesheets. (Much of the concept of different Firefox “skins” is based on XUL and stylesheets.)
As a rendering engine, the Firefox framework was designed to handle a number of different XML-based display standards. (Chapter 8 covers one such transformation of tabular data into graphical renderings.)

RDF

Most developers have heard of the Semantic Web, a term used to describe how information and data can be interconnected for computer access. The Semantic Web for computer access is not the same as the World Wide Web and browser access.

Browsers know how to interpret and render content by decoding web pages. Internet sites organize information for the purpose of communicating information to a user. Neither the browser nor individual web sites make it their business to connect the information behind the web page—to interpret the biography of the person whose image is displayed and to associate it with the subject’s technical expertise for connection to career search engines. Such connections are the domain of the Semantic Web initiative, a program built on common formats with the aim of integrating and combining data from diverse sources. To succeed at this task, computers need access to information about the information being sent to browsers and sites.

The method to encode such required metadata (information about information) is the Resource Description Framework (RDF), a W3C standard for encoding knowledge.

RDF is often implemented through XML-formatted files that encode a graph of relationships—nodes containing names and values that a computer can process to interpret the nature of the information store. RDF is used in the Firefox framework to manage a number of internal data structures, such as its internal bookmark references. Commercial implementations include applications in online publishing and information distribution (Really Simple Syndication [RSS]).
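The graph idea can be sketched by reading a small RDF/XML file as subject, predicate, object statements. The Python below is only an illustration: the "ex" vocabulary and the bookmark URLs are hypothetical, it handles just the simple rdf:Description form, and it is not a description of how Firefox stores its own data.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML; the ex: vocabulary and URLs are invented for illustration.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/bookmarks/1">
    <ex:title>Mozilla</ex:title>
    <ex:folder rdf:resource="http://example.org/folders/toolbar"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, predicate, object) for simple rdf:Description blocks only."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A property either points at another resource or holds a literal value.
            target = prop.get(f"{{{RDF_NS}}}resource")
            obj = target if target is not None else (prop.text or "").strip()
            yield subject, prop.tag, obj


graph = list(triples(SAMPLE))
for statement in graph:
    print(statement)
```

Each tuple is one edge of the graph: the subject and object are nodes, and the predicate (shown in ElementTree's '{namespace}name' form) labels the relationship between them.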

The Firefox framework has specialized template processing logic designed to access and display RDF content with little procedural code (see Chapter 6).

CSS

CSS is a mechanism to add style (color, font types, dimensions) to elements on web documents.

Early web documents included styling information as attributes attached to HTML elements. This approach embedded the structure of an interface with its appearance; changing the look of a web page required a rewrite of the web page to change the values of the style attributes. Developers looked for an alternative method to attach appearance characteristics to elements without complicating the relatively simple HTML syntax. The idea was to develop a syntax in which a designer could generalize the appearance of all the elements of the same type on a page, such as a declaration to set the font for all <P> tags, which CSS writes as P { font-family: Helvetica }.

Formal development of CSS began with a draft specification in 1995, with the W3C publishing CSS level 1 as a recommendation in December 1996. Today’s stylesheets also cascade: declarations can accumulate the details of an appearance through a sequential layering of styles (e.g., a paragraph within a <div> of one class type can look different from a paragraph enclosed by a <div> of another class type).
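The layering idea can be modeled in a few lines of Python. This is a deliberately simplified toy, not the CSS algorithm: real browsers order declarations by origin, specificity, and source order, whereas this sketch simply applies the layers in the order the caller supplies them. The rule names and property values are hypothetical.

```python
def cascade(*layers):
    """Merge style layers in order; a later layer overrides an earlier one per property."""
    computed = {}
    for layer in layers:
        computed.update(layer)
    return computed


# Hypothetical rules: one for every paragraph, and two that apply only to
# paragraphs enclosed by a <div> of a particular class.
base_p = {"font-family": "Helvetica", "color": "black"}
p_in_note_div = {"color": "gray", "font-style": "italic"}
p_in_warning_div = {"color": "red"}

note_paragraph = cascade(base_p, p_in_note_div)
warning_paragraph = cascade(base_p, p_in_warning_div)

print(note_paragraph)
print(warning_paragraph)
```

Both paragraphs keep the shared font from the base rule, while the enclosing class decides the color, which mirrors the <div> example in the text.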

CSS made possible improved separation of form from function—you could change almost any physical attribute of a web element with a simple change to the text of a stylesheet defined outside the traditional HTML declarations, or defined in CSS files external to the web page.

In Firefox, CSS not only provides the link between the elements of a XUL page and their appearance, but it also provides the linkage to complex widget behaviors. Firefox makes possible an extension of user interface widgets by using CSS to reference binding files that extend a widget’s function as well as its “look and feel.”

At the Top of It All: The DOM

The Document Object Model (DOM) represents a programmatic interface to web page content. The DOM defines how any XML document may be accessed and manipulated by software.

Early HTML allowed scripting languages limited access to page elements. Scripts could access HTML elements by name or by an element’s position within an HTML form. Programmers used this access to manipulate the interface on the basis of the correctness of an entry or to otherwise manipulate the interface based on the input values.

In 1998, the development community recast the HTML 4.0 specification into an XML syntax. This combination of HTML and XML, in the form of XHTML, meant that web documents could now be accessed through the DOM interface. This XHTML document model goes far beyond simple access to basic forms or HTML elements by name. The XHTML DOM makes public a document interface to allow scripts to access the entire document content as a tree of nodes, each node representing a part of the document. Developers use the DOM specification to traverse the document tree, to access and modify element attributes, and to dynamically modify element styles.

Scripts can also dissect the entire document structure, adding event listeners to all elements of a given class, inserting interface widgets at a specific position in the interface tree, moving elements around the tree, accessing document content, and even removing page elements under program control.
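The same W3C DOM method names are available outside the browser, so the operations above can be sketched with Python's xml.dom.minidom. This is an illustration under stated assumptions: the page markup is invented, and minidom has no event model, so event listeners are not shown. In Firefox the equivalent calls would be made from JavaScript.

```python
from xml.dom.minidom import parseString

# Hypothetical page content for illustration.
PAGE = """<html><body>
<p class="note">First</p>
<p>Second</p>
<p class="note">Third</p>
</body></html>"""

doc = parseString(PAGE)
body = doc.getElementsByTagName("body")[0]
paragraphs = body.getElementsByTagName("p")

# Modify an attribute on every element of a given class.
for p in paragraphs:
    if p.getAttribute("class") == "note":
        p.setAttribute("style", "color: red")

# Insert a new element at a specific position (before the second paragraph).
heading = doc.createElement("h1")
heading.appendChild(doc.createTextNode("Inserted"))
body.insertBefore(heading, paragraphs[1])

# Remove an element under program control.
body.removeChild(paragraphs[2])

# List the element children that remain, skipping whitespace text nodes.
remaining = [n.tagName for n in body.childNodes if n.nodeType == n.ELEMENT_NODE]
print(remaining)
```

Every step works on nodes of the document tree rather than on the text of the page, which is what distinguishes DOM access from the earlier name-based or form-based scripting.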

DOM access is the lynchpin to most modern web applications that employ JavaScript to manipulate the user interface. Many of the functions behind Firefox’s more complicated XUL widgets use JavaScript that accesses elements through DOM methods.

Mixing Document Types

One of the most underutilized features of the Firefox framework is the ability to render XML documents of different types—that is, XML documents that may represent HTML along with content representing mathematics (MathML) and Scalable Vector Graphics (SVG).

The preceding section described how you can define different document types. The Firefox framework can render most of those types without the need for an external plug-in. Figure 1-2 shows an example of MathML (the XML rendering of mathematics).

Figure 1-2. Firefox and W3C MathML test page

The capability of Firefox to render such content without the need for plug-ins should not be understated. Once a plug-in is used to render specialized content, additional scripting complexity is added if the designer wishes to “connect” the logic of a web page with the specialized content (e.g., a page that includes an XHTML table full of data and an SVG graphic controlled by the same code). The capability to manage such content makes the Firefox engine a good candidate for simpler, cleaner code to extend interface interactivity.
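To make the "same code" point concrete, here is a Python sketch in which one script reads data from an XHTML table and adds matching bars to an SVG element in the same document tree. It is an offline illustration only: the page markup, bar dimensions, and layout arithmetic are invented, and in Firefox the equivalent logic would typically be JavaScript using DOM calls.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical compound document: an XHTML table and an SVG drawing in one tree.
PAGE = f"""<html xmlns="{XHTML_NS}" xmlns:svg="{SVG_NS}">
  <body>
    <table>
      <tr><td>10</td></tr>
      <tr><td>25</td></tr>
      <tr><td>15</td></tr>
    </table>
    <svg:svg width="100" height="60"/>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Read the data from the XHTML part of the tree.
values = [int(td.text) for td in root.iter(f"{{{XHTML_NS}}}td")]

# Find the SVG part of the same tree and add one bar per table cell.
chart = root.find(f".//{{{SVG_NS}}}svg")
for index, value in enumerate(values):
    ET.SubElement(chart, f"{{{SVG_NS}}}rect", {
        "x": str(index * 30),      # bars spaced 30 units apart
        "y": str(60 - value),      # SVG y grows downward, so start from the top edge
        "width": "20",
        "height": str(value),
    })

bars = chart.findall(f"{{{SVG_NS}}}rect")
print(len(bars), [bar.get("height") for bar in bars])
```

Because both vocabularies live in one tree, no bridge between a page script and a separate plug-in is needed; the namespace on each element is the only thing that distinguishes table cells from graphics.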

A number of XML document types exist that promise to bring additional innovation to the Web. Time will tell whether the content development community can take advantage of the delivery platform that Firefox offers:

XHTML
SVG
Geography Markup Language (GML)
MusicXML
RSS
Chemical Markup Language (CML)

Getting Started

The development tools required for XUL development (and to experiment with the examples in this book) are relatively modest.

A good text editor is essential—the editor included with most development systems is more than adequate. If you don’t want to shell out the cash for a full-fledged development system, you still have inexpensive options.

On OS X, you can use the Xcode developer tools that come with the Mac OS X distributions; users can also subscribe to the Apple Developer Connection to get a copy of the tools.

For the Windows platform, plenty of options are available. One of the most serviceable of such tools is the Notepad++ application from the SourceForge project. Regardless of your preferences, all you really need is an editor with syntax highlighting and code folding, the features that allow highlighting of keywords and collapsing of code segments bracketed by braces and parentheses, as shown in Figure 1-3.

On Unix platforms, you have a wide range of usable editors from which to choose—from vim and emacs to more user-friendly tools such as Anjuta and KDevelop. It is also possible to use the Eclipse cross-platform development environment for the exercises in this book.

Figure 1-3. Notepad++

Supporting Tools

A number of chapters demonstrate how to integrate XUL applications with server code. Developers may want to implement similar functionality on their machines by installing their own servers.

Apache web server

The web server of choice is the Apache web server (http://www.apache.org). This book uses Apache 2.0 for the PC and Apache 1.3 as bundled with OS X distributions. You should not encounter any compatibility issues with later versions of Apache, the only requirement being its integration with PHP.

PHP

The scripting language used in this book is PHP (a recursive acronym for PHP: Hypertext Preprocessor). Although PHP is most often used to mix HTML with programmatic logic on the server, we will use it more often to serve as frontend logic that bridges requests from the XUL-based client and the database engine. Either PHP 4 or PHP 5 is more than adequate for the examples in this book. The executables are available from http://www.php.net.

MySQL

A number of examples use a database for user authentication. Although you could simulate a database engine with scripting (PHP) logic, you may want to download the MySQL database engine for a more realistic implementation. Downloads are available from http://www.mysql.org, as shown in Figure 1-4.

Figure 1-4. MySQL Downloads site

Getting the Browser

With a good development editor in hand, the development process requires use of the Firefox browser. The latest version is available from http://www.mozilla.com. When downloading the Firefox browser, you should check the Developer Tools option during the installation process (it often appears as a checkbox on the installation panel). Once the browser is installed, it will automatically be configured to receive any additional updates that the Mozilla.com team makes available.

With the development tools online and the latest version of Firefox on hand, we can start to look at the basic components of the XUL interface.

Programming Firefox by Kenneth C. Feldt