Intellectual Property and Open Source

Chapter 4. Copyright

The movie Antitrust came out at the height of the dot-com boom. Antitrust was Hollywood’s take on the geek chic of the late 1990s: the story of a few heroic open source hackers taking on an evil, grasping corporation situated in the Pacific Northwest.

Predictably, it was awful.

Buried among all the things that this movie got wrong, though, was one thing it got right: early in the movie, the protagonist is seen wearing a t-shirt that labels him a “code poet.” In one phrase, they captured why software is subject to copyright law—because it is a form of personal expression, not just a means of accomplishing some function.

Understanding the subtle distinctions inherent in that statement is essential to understanding the storms of controversy that inevitably arise around copyright issues.

Copyright in Context

Copyright is probably the most difficult of the four major branches of intellectual property law. Although patent documents (and some aspects of patent practice) are more complex and intricate than the copyright equivalents, the underlying mechanics of the patent system are relatively simple.

Patents can be seen as a straightforward exchange: describe your invention to society and in return receive exclusive control over the use of the invention. After a while, the period of exclusive control ends and everybody receives the benefit of your inventive effort.

Copyright is superficially similar but fundamentally different. Like patent law, copyright is part of the grand bargain discussed in Chapter 1. In return for the creation of knowledge, we also grant to authors a time-limited property right over the fruits of their intellectual efforts.

However, copyright has a far more human dimension than the mechanical results of inventive effort; it is much more about who we are than about what we do. As a result, copyright is much more subtle than patent law. In particular, there are two fundamental differences between patents and copyrights:

Patent law covers function; copyright law covers expression.
You have to work to get your invention into the protected space of patents. You have to work to get your expression out of the protected space of copyright.

The social and legal difficulties of copyright law can mostly be traced to one or both of these fundamental principles.

Expression

The first fundamental principle is that copyright protects personal expression in all its varieties. This is both the great strength and compelling weakness of copyright.

To understand why, think again about the economic difficulty of knowledge creation. In Chapter 1, we discussed how knowledge has network effects; the more people who possess a piece of information, the more valuable the information becomes. Nevertheless, people sometimes don’t want to share their knowledge because they lose the personal advantages that come from keeping the information a secret. The result is a market failure, because the incentives that promote secrecy tend to reduce knowledge-sharing below the optimal level.

Copyrighted content is in an analogous position. Instead of secrecy, though, the problem is control. The copyright model creates strong incentives to expand the creator’s control in many directions, despite the significant societal benefits that come from sharing copyrightable expression.

Expression and personality

Copyrightable expression, by design, is closely tied to the personality of its creator. As a result, people often feel much more strongly connected to their copyrighted works than other forms of intellectual property. Copyright law respects the intimate tie between creator and creation, in part by giving the authors of copyrighted works a very long term of protection—currently the life of the author plus 70 years for most works. During this term, the author is granted control over almost all use of the work as well as rights in derivative works (works adapted from the originally copyrighted item).

This tight control is generally accepted in our society, perhaps because of the personal link between expression and authorship. If you have ever felt ripped off when someone took your idea, your words, or your work and called it their own, you have felt the strong personal pull of copyright.

Expression and society

On the other hand, shared creative expression plays an important role in our society. Human beings are social animals, and we connect with each other through our personal expressions. Considered as a whole, culture is just the product of many personal expressions mixed together.

This is true both on a micro and on a macro level. On the micro level, consider the well-known movie Monty Python and the Holy Grail. This movie is not the result of a single personal expression; rather, it is a collective expression, the result of many people working together. Most obvious is the writing and acting of the Monty Python comedy troupe. There were also, however, creative and expressive inputs from animators, camera operators, costumers, lighting and sound designers, musicians, møøse trainers, and many others. The name “Monty Python and the Holy Grail” is just a shorthand for the collective expressive efforts that went into the movie.

The various movie awards ceremonies are good examples of recognizing the individual creative expressions that went into the whole. In those ceremonies, the collective expression is only recognized once, with the “Best Movie” award. Every other award is given for individual creative expressions incorporated into the whole—the best actors, costumers, editors, directors, and others.

On a macro level, our culture as a whole is tied inseparably to the many bits of expression, both individual and collective, that it contains. Just like Monty Python and the Holy Grail, our “culture” is a shorthand for the expressive efforts of many people. It is not just Picasso, Jane Austen, J.K. Rowling, George Lucas, and Tim Burton. It is not limited to the writers of blogs, the composers of music, the choreographers of plays, and everyone else whose work involves creating copyrighted content. Our culture is also created by you and all the people around you as you talk, write, work, and live each day. It consists entirely of shared copyrightable expression.

Expression and communication

Further, copyrighted expression has become a cultural and symbolic shorthand for communication. For example, saying that someone “is a Homer Simpson” invokes copyrighted expression to communicate a personal point. In a more extreme example, there are people who primarily communicate by quoting other people’s expressions; think of those who respond to almost every question with quotes from Monty Python or Napoleon Dynamite.

As Danah Boyd points out in the article “When Media Becomes Culture: Rethinking Copyright Issues” (http://zephoria.org/thoughts/archives/2005/09/29/when_media_beco.html), we appropriate more than just words for communication. People use photos and animated “smilies” to show their moods. We associate different ringtones with different people. We display our attitudes, affiliations, and personality by putting logos, pictures, and quotes on our t-shirts. Hip-hop in particular has a long history of sampling: using little bits of other songs remixed into a new composition. We have become like Mrs. Who (from Madeleine L’Engle’s A Wrinkle in Time) or Bumblebee (from the 2007 Transformers movie), using other people’s expressions to express ourselves. (It is amusing that this paragraph is itself an example of the point it seeks to communicate.)

Boyd’s point is that, to a certain extent, the “content” industries are victims of their own success. They have successfully managed to get their work incorporated into popular culture. They have done this in part by building upon the cultural heritage of prior shared expression, and in turn, their works have become part of our shared culture, available for a new generation of authors and artists to use.

The problem of control

The problem is that the strong personal protections of copyright are in conflict with the shared nature of culture. The economic benefits of copyright discussed in Chapter 1 are built on controlling expression, while the social and cultural benefits of copyright are based upon sharing expression.

This problem of control puts copyright into a difficult economic position, comparable to the original market failure that prompted the development of copyright. Of course this is an issue with patents too, but the much shorter term of patent protection mitigates the problem. In the copyright world, the combination of strong controls, long terms, and widespread incentives to share sets up a persistent unstable dynamic.

The state of copyright

The powerful individual inclination toward control of copyrightable material has resulted in political pressure for the strengthening of copyright protection. In fact, the story of copyright from the 1700s until about 1990 has been almost entirely a story of lengthening terms (from 14 years to life plus 70 years), increasing scope (from only particular types of works to almost all works), and stronger protections (from restrictions on publishing rights only to restriction of almost all uses).

Only recently have people started feeling a contrary pull toward less restriction on copyrighted material. The primary reason for this shift is the Internet, and more broadly, the rise of digital media. This new technology has empowered a larger percentage of the population to create new works and express themselves artistically, but as new people have entered into the creative space, they have found the well of our common culture increasingly dry.

The “content” industries, having ridden a century-long wave of popular support of strong copyright, have established legal and business models that depend on the existing copyright regime. Meanwhile, popular sentiment is shifting away from support of strong copyright, and people are “voting with their feet” by sharing music and movies, remixing and mashing up content from other providers, and generally disregarding many of the established boundaries of copyright.

The Power of Defaults

The second primary driver of copyright is that expressions are copyrighted by default. This is a relatively new development, and it has changed the fundamental balance of copyright in our economy and in our society.

Defaults have enormous power. Just ask Microsoft; its enormous market share in operating systems and web browsers is almost entirely due to the power of defaults. When you buy a computer, it is possible to get any operating system you want. You can go to apple.com and get a Macintosh, or find someplace that is willing to sell you a computer with Linux (or even other operating systems) preinstalled.

But if you take no unusual steps and just buy the first acceptable computer that you see, you will end up buying Microsoft Windows. It is the default choice, and because it works well enough, and is available, it has become most people’s preferred choice.

Similarly, when Internet Explorer came out in the mid-1990s, most people agreed that it wasn’t as good a browser as Netscape Navigator. Besides, those who were interested in the Internet had already downloaded Netscape; there was usually no reason to change.

As people upgraded their computers and their operating systems, however, they were faced with a choice: go to extra trouble to download Netscape onto the new computer, or just use the pre-installed Internet Explorer. The key market share driver for Internet Explorer, at least initially, wasn’t the quality of the browser; it was the browser’s simple presence on the desktop. It may not have been the best, but it was there, it was good enough, and it worked. In three years’ time, Internet Explorer went from an also-ran to overwhelmingly dominant, largely by the power of being the default.

Defaults are also a key part of the history of copyright. There are two aspects of copyright where the default has changed over time: in the application of copyright protection to a work, and in the nature of works eligible for protection.

Defaults in the Application of Copyright

It used to be that expressions were not copyrighted by default. As with patents, the creator had to explicitly register the work with the United States Copyright Office. Failure to register the work did not just mean that it was not copyrighted when it was published; it meant that it could never be copyrighted, even later.

Further, copyrighting the work took effort. Not much effort, but it was not economically profitable to spend the few dollars required to copyright each work unless your business model depended upon your legal control of the expression. Therefore, the great majority of the stories, songs, jokes, sayings, and paintings that imbued American culture were in the public domain and freely shared.

This changed with the Copyright Act of 1976. The 1976 act removed the requirement that new works be registered to receive copyright protection. Instead, the act created a system of protection for all “original works of authorship,” published or unpublished, from the moment they were “fixed in a tangible medium of expression.”

Copyright as the default state

This change in defaults was profound; it shifted the landscape around copyright. Before the act, people needed to expend time and effort to have copyright applied to their works. This minimal barrier of registration resulted in a significant drop-off in the application of copyright: only a fraction of all works were copyrighted. Immediately after the implementation of the 1976 act, people needed to expend time and effort to keep copyright from applying to their works. The result was that essentially all new works were copyrighted. Copyright became the natural state of new creative expression.

The change in expectations was so pervasive that a few people started to argue about the existence of the public domain. Under the new law, the public domain was defined in the negative as the absence of copyright protection. According to one scholar, you couldn’t place works into the public domain; you could only decline to enforce your copyright. Similarly, works with expired copyrights weren’t in a place called the “public domain” because there was no such place. Instead, they were works with no-longer-enforceable copyrights.

Defaults and complexity

I mentioned in Chapter 2 that the patent document is the most complex document in the field of intellectual property. The complexity of the patent document is a direct result of the defaults that are applied to technological works.

The scope of a patent’s protection is spelled out in the patent claims. Anything outside those claims is free for someone else to use; the patent language defines an island of protection in the sea of prior art. The patent document is complex because it must deal with the linguistic and legal complexity inherent in defining the boundaries of protection. When people argue about patents, they argue about whether the accused technology lies inside the scope of the patent language.

Copyright is exactly the opposite. Everything is copyrighted by default, so people don’t argue whether the copyright statute is applicable to the problem; it is always applicable. Therefore the documents that deal with copyright grants are simple and straightforward.

Instead, people in copyright lawsuits argue about the exceptions in copyright law. The exceptions are complex because they define the boundaries where copyright becomes inapplicable—oases of unencumbered use in the land of copyright control. When people argue about copyrights, they argue about whether the accused work lies outside the scope of the copyright grant.

Defaults in the Applicability of Copyright

The second change to the defaults in copyright has been the types of works that are eligible for copyright protection. This is a different issue from the defaults in the application of copyright law, but the changes are similar in their reach.

By way of analogy, think of defaults in the area of network security. Assume that you have a default that “a security policy must be applied to all incoming traffic.” That is like the default application of copyright discussed earlier; every new work is measured against the copyright standard, just as every incoming packet is inspected as it comes into a secure network.

On the other hand, simply saying that “a security policy must be applied” does not tell you anything about which traffic will ultimately be allowed through the firewall. That depends on a completely different set of factors that must be analyzed independently.

When deciding which sort of traffic should be allowed through your firewall, there are two basic choices. You can have a default allow policy, which grants access unless there is a rule in place denying the connection, or you can have a default deny policy, which forbids access unless there is a rule allowing the connection. Default deny policies are considered safer, but they are more work to configure and maintain; any time some new application has to access the network, the firewall rules must be changed to allow the new connection.
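The two firewall defaults can be sketched in a few lines of Python. This is a minimal illustration, not a real firewall: the `Rule` type, the `decide` function, and the port numbers are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """A hypothetical firewall rule: an action for one destination port."""
    port: int
    allow: bool


def decide(port: int, rules: List[Rule], default_allow: bool) -> bool:
    """Return True if traffic to `port` is let through.

    The first matching rule wins; if no rule matches, the default applies.
    """
    for rule in rules:
        if rule.port == port:
            return rule.allow
    return default_allow


# Default allow: everything passes unless a rule denies it.
deny_telnet = [Rule(port=23, allow=False)]
print(decide(23, deny_telnet, default_allow=True))    # False: explicitly denied
print(decide(8080, deny_telnet, default_allow=True))  # True: no rule, so allowed

# Default deny: nothing passes unless a rule allows it.
allow_https = [Rule(port=443, allow=True)]
print(decide(443, allow_https, default_allow=False))   # True: explicitly allowed
print(decide(8080, allow_https, default_allow=False))  # False: no rule, so denied
```

Under the default deny policy, a new application on port 8080 stays blocked until someone adds a rule for it, which is the maintenance cost described above.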

Defaults in the scope of works eligible for copyright

The history of copyright in the United States can also be seen as a movement from a default deny to default allow policy for copyright. Under previous copyright acts, only specifically enumerated types of works were eligible for copyright protection. For example, the Copyright Act of 1790 (the first copyright act instituted in the United States) only allowed protection for books, maps, and charts. If you created something else, it was ineligible for copyright protection.

This was the default deny policy of copyright at work. Unless there was a specific provision in the law allowing copyright protection for your category of work, you had no protection at all.

The result of this policy was tremendous pressure on Congress to amend the Copyright Act to allow new types of protections and new types of works. For example, the Copyright Act was amended in 1802 to allow “historical and other prints.” Then it was amended to provide protection for paintings and musical compositions. It was amended again to provide protection for dramatic works, photographs, and sculptures. Each time a new medium came to the forefront of the copyright scene, the law had to be amended to allow protection.

The tipping point came in the early 1900s. You may be familiar with player pianos that read piano rolls—sheets of paper with perforations representing different notes. Although piano rolls allowed the reproduction of pieces of music, a 1908 court case called White-Smith v. Apollo decided that they were not in the allowed category of sheet music, and were thus not protectable. The court stated:

These perforated rolls are parts of a machine which, when duly applied and properly operated in connection with the mechanism to which they are adapted, produce musical tones in harmonious combination. But we cannot think that they are copies within the meaning of the copyright act.

Sheet music publishers were outraged. Partially as a result of this decision, Congress passed a revision to the Copyright Act the next year. The Copyright Act of 1909 gave the copyright owner of a musical work the exclusive right “to make any arrangement or record in which the thought of an author may be recorded and from which it may be read or reproduced.” Recording studios are still vigorously applying the controls this act granted them when they prosecute people for exchanging MP3 files.

In addition, the 1909 act took the first step toward a default allow policy for copyright. Rather than just setting up a new protected statutory category for piano rolls, Congress decided to try to handle this situation in a more flexible and permanent manner. Specifically, the 1909 Copyright Act was much more expansive in its language when it described what would be considered a “Copyrightable Work” (some individual classifications are omitted here; emphasis in original):

The works for which copyright may be secured under this Act shall include all the writings of an author.
[The] application for registration shall specify to which of the following classes the work in which copyright is claimed belongs: Books, ... Periodicals, ... Works of art, ... Photographs, ...
Provided nevertheless, That the above specifications shall not be held to limit the subject-matter of copyright as defined in section four of this Act, nor shall any error in classification invalidate or impair the copyright protection secured under this Act.

Under this new, more flexible language, all works were swept into one basic category—the “copyrighted work.” All copyrighted works received the same basic protection. There were a few classes of works, such as dramatic works, that received additional protection. However, the creation of an omnibus class of copyrighted works significantly simplified the administration of copyright under the 1909 act. When new forms of art were developed, such as films, they could be included under “all the writings of an author” and would be covered by copyright.

The Copyright Act of 1976 was the culmination of this evolution. Just as the 1976 act changed the defaults for the application of copyright to new works, it also completed the transition from a default deny to a default allow policy for the types of works eligible for copyright protection. The 1976 act declared that copyright protection could apply (and would apply) to all “original works of authorship.” This intentionally broad and inclusive language was designed to include any work that showed originality—the result of decisionmaking by a creative mind.

Copying and the History of Copyright

Many people don’t realize that for most of the history of copyright, it was legal for people to make as many copies of books as they wanted, as long as the copies were strictly for personal use. The restriction on personal copying is of relatively recent vintage, only dating back to 1915 or so.

The reason is that “copy” has multiple meanings: it is both a noun and a verb. As a verb, copy has the common meaning, “to reproduce or imitate.” As a noun, copy is “a collection of written material or a complete work.” The word is still used in the noun sense in the publishing industry: a “copy editor” is somebody who edits written material (the “copy”), not somebody who manages the reproduction of content.

In its original sense, copyright was a publishing right. Only somebody who had rights over the work as a whole (the “copy” as a noun) was able to publish and distribute the work. Individual use was not even addressed; if a person wanted to copy (the verb) an entire book, he or she was free to do so. Individual reproduction was not an economic threat to content publishing because it didn’t scale.

Section two of the Copyright Act of 1831 makes this clear: those granted a copyright had “the sole right and liberty of printing, reprinting, publishing, and vending” the work (in other words, all publishing-related rights). Obviously, reproduction (copy as a verb) was required for publication of the work (copy as a noun), but they were two different things. This small distinction is important to understanding the state of copyright today.

Copying (the verb) and copyright

When paintings and statues were added to the list of works that could be copyrighted, there was some concern as to how the copyright on something like a statue might be infringed. Statues couldn’t be mechanically reproduced and “published” like books.

The problem was that a second artist could get around the exclusive rights granted under the law by creating a new work that was for all intents and purposes a copy—a republication—of the existing copyrighted work. To prevent this sort of gaming of the system, Congress inserted the word “copy” (verb sense) into the Copyright Act of 1870 as a specific protection against the violation of the rights of artistic reproduction. Figure 4-1 is from the Restated Copyright Act of 1874.

Figure 4-1. Copyright eligibility in 1870

A contemporary reading of this passage would suggest that all copying would be prohibited. However, there were different penalties imposed for the infringement of books and the infringement of other artistic works. Significantly, copying was not listed as a trigger for the infringement of books. Therefore, the turn-of-the-century understanding of copyright law was that artistic works could not be copied, but there was no limitation on the private copying of books—only on publication.

More specifically, only copies of books that were sold in competition to the publisher were considered to infringe the copyright. Figure 4-2 shows an excerpt from an influential copyright treatise at the time:^[2]

Figure 4-2. 1902 excerpt on copyright infringement of books

In contrast, the rights reserved to authors and creators of artistic works were much broader. Figure 4-3 shows the difference.

Figure 4-3. 1902 copyright infringement of artistic works

The significant difference between these two passages is that all copying of artistic works was explicitly forbidden, but there was an implicit acceptance of private, personal-use copies of books and other literary materials.

This ignores state common law copyright, which applied to works immediately upon fixation (and sometimes even before). Nevertheless, publication terminated all common law rights, and after publication, the work became either public domain or federally copyrighted and governed by the law quoted above.

The Copyright Act of 1909

This changed with the Copyright Act of 1909. The 1909 act was the first step toward the “protectable by default” standard described earlier, and as such, it was much more expansive in its language when it described what would be considered a “copyrightable work.” Again from the act:

The works for which copyright may be secured under this Act shall include all the writings of an author.
[The] application for registration shall specify to which of the following classes the work in which copyright is claimed belongs: Books, ... Periodicals, ... Works of art, ... Photographs, ...

Notice that books, periodicals, works of art, and other sorts of works were all included under the same “copyrighted work” umbrella. Further, all copyrighted works received the same basic protection; they were subject to the copyright holder’s exclusive right “to print, reprint, publish, copy, and vend the copyrighted work” (Copyright Act of 1909, Section 1a, emphasis added).

In one stroke, the creation of a single base standard for copyrighted works reserved to copyright holders the right to restrict all copies of literary works, even those made exclusively for personal, unpublished use.

Stepping aside from the history for a moment, I noted before that this situation is not too different from designing a security policy or writing a regular expression. Having personally made the mistake of being overinclusive in those other contexts, it is my personal opinion that the 1909 prohibition on private copying of literary works was a mistake. Not necessarily a mistake in the sense that “they should not have done that,” but rather a mistake in the sense that it was an unintended extension of the law.

It is possible that the expansion of the prohibition on personal-use copying was an intended consequence of the 1909 act, but there is no discussion in the Congressional Record about that change. Instead, the discussions were focused on the simplification of the statute and the mechanical reproduction of music, specifically, on reversing the White-Smith decision about piano rolls.

Regardless of whether it was a mistake, however, the language of the statute made copying in all contexts subject to the restrictions of copyright. In the 1917 publication of A Treatise on the Law of Copyright and Literary Property (American Law Book Co.), William Benjamin Hale noted that, “Strictly, even a single copy made for private use is an infringement.” By the mid-1920s, restrictions on personal copying were regularly upheld by the courts.

Copying and software

The restriction on personal-use copying of books is essential to the copyright protection of software today. Software is copyrighted as a literary work, in the same category as books. There is no restriction, even today, on reading or using a copyrighted work.

In the computer world, the analogue to reading is executing a program. As a result, there is no restriction whatsoever in copyright law on executing a program written and copyrighted by someone else. However, to read or execute something on a computer, you must copy it. Copying, in fact, is one of the most fundamental operations of a computer.

For example, imagine you are using your web browser to read something on the Internet. The text you are reading had copyright applied to it when it hit the disk, or maybe even the RAM, on the author’s computer. Then a copy of that information was brought into memory and sent over the network to the web server. The web server put a copy in RAM and then another copy on disk. When you asked for a copy of the HTML file, the web server copied the information into RAM again, sent another copy over the network (creating intermediate copies in caching servers) until it got to your computer. Your computer made a copy in RAM, maybe cached a copy on the disk, and then sent another copy to the video memory, where it finally shows up for you to read.
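The chain of copies in this walk-through can be mimicked with a short Python sketch. It is only an illustration under simplifying assumptions: the `STAGES` list paraphrases the steps in the paragraph above (leaving out caching servers), the `copy_through` helper and the sample page are invented for this example, and real systems copy data in many more places.

```python
from typing import List, Tuple

# Stages paraphrased from the paragraph above; a simplification, not a trace
# of any real system.
STAGES = [
    "author's RAM (read from disk)",
    "network transfer to web server",
    "web server RAM",
    "web server disk",
    "web server RAM (on request)",
    "network transfer to reader",
    "reader's RAM",
    "reader's disk cache",
    "reader's video memory",
]


def copy_through(stages: List[str], original: bytes) -> List[Tuple[str, bytearray]]:
    """Make a fresh buffer holding the same bytes for each named stage."""
    copies = []
    current = original
    for name in stages:
        current = bytearray(current)  # a new buffer with identical contents
        copies.append((name, current))
    return copies


page = b"<html><body>Hello, reader.</body></html>"
copies = copy_through(STAGES, page)

same_content = all(bytes(buf) == page for _, buf in copies)
distinct_buffers = len({id(buf) for _, buf in copies})

print(len(copies), same_content, distinct_buffers)  # 9 True 9
```

Every buffer holds the same text, yet each one is a separate copy, which is why the act of reading on a computer cannot be separated from the act of copying.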

The copyright statute tries to deal with this issue by allowing “the owner of a copy of a computer program to make or authorize the making of another copy or adaptation of that computer program provided...that such a new copy or adaptation is created as an essential step in the utilization of the computer program in conjunction with a machine and that it is used in no other manner.” Nevertheless, some companies (and courts) have used the existence of these various copying mechanisms to apply copyright protections to the running of software.

For example, Vivendi Universal and Blizzard Entertainment have sued a company called MDY to prevent the distribution of a program that automates certain aspects of Blizzard’s World of Warcraft game. Blizzard argues that its license agreement prohibits the use of unapproved software connected to the game. Under this interpretation, any use of MDY’s program violates the license, because making a copy of the game residing in RAM is an infringement on Blizzard’s copyrights. This case is still in the courts and will probably be decided sometime in 2008.

The Terms of Copyright

With that historical context in place, it is time to start looking at the terms in the law to see how they are applied today. The basic rules for copyright are set forth in Title 17 of the U.S. Code, which is largely still based on the Copyright Act of 1976.

The Copyright Act gives protection to “original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.”

Works of authorship include:

Literary works
Musical works, including any accompanying words
Dramatic works, including any accompanying music
Pantomimes and choreographic works
Pictorial, graphic, and sculptural works
Motion pictures and other audiovisual works
Sound recordings
Architectural works

These categories are flexible. As noted above, software is considered a literary work. Unless you are sure that some type of expression is outside the bounds of copyright, you should assume that copyright applies.

Defining “Expression”

The first core principle of copyright is that it applies only to an expression. So, what is an expression?

In math, an expression is a combination of symbols (numbers, operators, and variables) that can be evaluated.
In a computer context, an expression can also refer to some representation of a value or something that can be evaluated to return a value.
In language, an expression is a communication in speech or writing.

The common thread between these (and other) definitions is that they all make reference to something concrete, usually a specific sequence of words or symbols. Different symbols are used to illustrate a particular thought or convey a particular idea.

Broadening this definition, an expression is any artifact used to convey an idea. The artifact may be ink on paper, code in a file, paint on a canvas, or words on a page; the important aspect is that a concrete physical representation communicates an idea from one person to another.

By way of contrast, a patent covers the idea behind an invention. The idea may be embodied in many different machines, processes, or products, each of which is a concrete expression of the patented concept. This is what makes patents so powerful—many different things can be covered by the description in a single patent.

On the other hand, 1+3, 3+1, and 2+2 are all different expressions that happen to evaluate to the same value. If these could be copyrighted (they can’t), each of the three expressions would have independent copyright status.
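The point can be checked mechanically. The following is a minimal sketch in Python (a language chosen here only for illustration; the helper names are hypothetical). It parses the three strings, confirms that they all evaluate to the same value, and shows that the written forms nevertheless remain three distinct expressions.

```python
import ast

expressions = ["1+3", "3+1", "2+2"]


def _eval_node(node):
    """Evaluate a parsed node made only of integer literals and '+'."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return _eval_node(node.left) + _eval_node(node.right)
    raise ValueError("unsupported expression")


def evaluate(text):
    """Parse the text as an expression and compute its value."""
    return _eval_node(ast.parse(text, mode="eval").body)


# One idea (the value), three expressions (the written forms).
distinct_values = {evaluate(text) for text in expressions}
distinct_texts = set(expressions)

print(distinct_values)      # the set of values the expressions produce
print(len(distinct_texts))  # the number of distinct written forms
```

The set of values has a single member, while the set of written forms has three.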

Ideas and Expressions

The distinction between concepts (say, “a series of numbers that sum to four”) and a particular expression (“2+2”) is called the idea/expression dichotomy. This principle is at the core of copyright law precisely because copyright covers personal expression and patents cover technical knowledge.

Judging technical expression

When we are judging technical knowledge, there are very basic criteria for deciding whether something has any worth. Does it work? Does it work better than existing solutions in some way? Is it feasible? Anyone with some knowledge of the field can try to answer these questions in a reasonable and objective way.

It is also possible to define technical concepts precisely enough that a suitably skilled person can look at a particular machine and understand whether it expresses a particular idea. That is the patent law in a nutshell; the patent document defines an idea with specificity (in theory), and patent suits reflect an objective mapping between the patent and a particular system (again, in theory). That is why the different types of machines described in a single patent are called embodiments—they each individually concretely express or “embody” some aspect of the inventive concept.

Judging personal expression

On the other hand, it is much more difficult to judge personal expressions. What is great art? What is a good book? What is a funny joke? As the saying goes, you can put these questions to three people and get four opinions. Complicating the situation further are the many examples of personal expression that were not considered noteworthy when they were made, but were later recognized as great art. There is no objective way to classify personal expressions into those that are good and bad, so there is no way to design a law that would allow for copyright protection only of good expression.

Further, there is no platonic ideal expression of a particular concept. Many people can have the same idea and express it in different ways. For example, which banana in Figure 4-4 is the true banana? Which is the best expression of the idea “banana?”

Figure 4-4. Going bananas

Ideas, expressions, and Turing machines

Although software has a functional, technical component, the idea/expression dichotomy also applies to code. Even when an idea is almost purely functional, there are different ways in which the code can be expressed.

To understand the idea/expression dichotomy in a software context, think of the Turing machine. A Turing machine is a simple machine with just two parts: a “tape” that can store symbols, and a “head” that moves back and forth across the tape, reading and writing symbols. Alan Turing, one of the pioneers of computer science, proved that this simple mechanical device is computationally equivalent to the computers we use each day. That means that although the Turing machine wouldn’t necessarily be very fast, it is possible to make it run all the same programs as the newest Intel microchip.
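To make the tape-and-head description concrete, here is a minimal sketch of a Turing machine simulator in Python. It is one possible expression of the idea, not a canonical one: the function name, the rule format, and the state names "start" and "halt" are assumptions made for this illustration.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the non-blank part of the tape.

    rules maps (state, symbol) to (symbol_to_write, move, next_state), where
    move is -1 (one cell left) or +1 (one cell right). The machine stops when
    it reaches the state "halt" or after max_steps steps, whichever is first.
    """
    cells = defaultdict(lambda: blank, enumerate(tape))  # the "tape"
    head = 0                                             # the "head"
    for _ in range(max_steps):
        if state == "halt":
            break
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += move
    used = [i for i, symbol in cells.items() if symbol != blank]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# A tiny program: add one to a number written in unary (a row of 1s).
increment = {
    ("start", "1"): ("1", +1, "start"),  # skip over the existing 1s
    ("start", "_"): ("1", +1, "halt"),   # write one more 1, then stop
}

result = run_turing_machine(increment, "111")
print(result)
```

Given the tape 111, the machine walks to the first blank cell, writes a 1, and halts, leaving 1111 on the tape.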

Many people have created Turing machines, both in software and with real heads and tapes, but the fundamental breakthrough was more than the physical machine—it was the idea of the Turing machine. This Universal Turing Machine, as the hypothetical machine is sometimes called, has an infinite tape and can take an infinite amount of time to run that tape. It exists only as a platonic ideal.

One way in which this Universal Turing Machine has affected computer science is the idea of Turing completeness. A language or device is Turing complete if it can express the same range of computations as a Universal Turing Machine.

If the Turing machine were patented, all computer languages (as well as a number of file formats and logic games) would be covered under the patent, as they are all Turing complete—an alternative expression of the fundamental idea of computing as expressed by the Universal Turing Machine.

Under the copyright law, however, the Universal Turing Machine as a concept would not be protectable. Instead, only your particular implementation of a Turing machine could be protected by the law. Another person could create his own implementation of the idea, as long as he didn’t just take yours.

The merger doctrine

One result of the idea/expression dichotomy is that there is no copyright protection in basic facts. This is known as the merger doctrine. Under the merger doctrine, courts will not protect a copyrighted work from infringement if the idea underlying the work can be expressed only in one or a limited number of ways. In such an instance, the idea and expression are said to merge; you cannot separate the idea from the way in which it is expressed.

The rationale for this rule is straightforward. It would be very difficult, for example, to have a discussion about special relativity without mentioning the equation E=mc². It doesn’t matter that E=mc² took a long time to develop, and it doesn’t matter that creative problem solving was required to discover it. Copyright is designed to protect original creative expression, and mere recitation of facts is not creative. Following the same principle, collections of facts are not generally copyrightable. A written list of national capitals is no more creative than a list of numbers.

Similarly, it is not possible to copyright expressions consisting of just a few words or a short phrase. For example, there is really only one way to say “Good morning,” even if you made an original, creative decision to speak those particular words. The courts have decided that it would be against public policy (not to mention foolish and unenforceable) to grant copyright control over simple phrases to just a few people.

Scenes à faire

A similar doctrine applies when examining common elements across a genre. A good example is the use of elves in fantasy literature. J.R.R. Tolkien set the archetype with the publication of The Hobbit and The Lord of the Rings, and elves have been a standard non-human character type ever since. Tolkien’s elves were tall, thin, and long-lived; these characteristics were carried forward in other books and in role-playing and computer games.

The French phrase scenes à faire (“the scenes to be made”) is used to describe these recurring story elements in copyright law. These story elements are considered part of the idea of a fantasy story and as such are not copyrightable. Individual descriptions of particular characters are protectable, but the existence of elves and their general characteristics are not.

This has become a particularly important topic with the rise of the World Wide Web. “FanFic” (fan fiction) stories take an existing world and characters (like the world of Harry Potter) and use the characters and circumstances of that world to create new stories. In the case of fan fiction, the boundary between copyrightable expression and scenes à faire is unclear, at least in part because it is unclear whether a “Harry Potter story” is a genre. Most courts have held that a character is independently copyrightable if the character is distinctly delineated. Generic or undeveloped characters are not protected. “Distinctly delineated,” though, is a term that still has to be more clearly defined.

Mostly functional expression

The merger doctrine also applies in cases where there is a very thin line between functional expression and creative expression. The copyright statute explicitly excludes from protection “any idea, procedure, process, system, method of operation, concept, principle, or discovery.” When there are both creative and functional aspects to a work, copyright “is limited to those aspects of the work—termed ‘expression’—that display the stamp of the author’s originality.”

This is the awkward middle ground inhabited by software. Anyone who has worked extensively with code knows that there are both expressive and functional aspects to any codebase, but it is frequently difficult to separate the two.

The law in this case is unsettled, at least in part because the expressiveness in code is hard for many judges to appreciate. For example, in Sega v. Accolade (1992), the court said:

Computer programs pose unique problems for the application of the “idea/expression distinction” that determines the extent of copyright protection. To the extent that there are many possible ways of accomplishing a given task or fulfilling a particular market demand, the programmer’s choice of program structure and design may be highly creative and idiosyncratic. However, computer programs are, in essence, utilitarian articles - articles that accomplish tasks. As such, they contain many logical, structural, and visual display elements that are dictated by external factors such as compatibility requirements and industry demands...
In some circumstances, even the exact set of commands used by the programmer is deemed functional rather than creative for the purposes of copyright. When specific instructions, even though previously copyrighted, are the only and essential means of accomplishing a given task, their later use by another will not amount to infringement.

In fact, some people argue that many pieces of code are not expressive at all, or if they are expressive, they are expressive in the sense that they capture a piece of functionality in a particularly elegant way. However, technical elegance is different from personal expression, and doesn’t trigger the application of copyright.

One particularly difficult issue concerns copyright protection of header files. An individual name or symbol in a header file cannot be copyrighted, but the particular selection of symbols may be. The selection of symbols to be exported is inherent to API design, which could be a copyrightable creative decision. If you have ever heard programmers talking about “beautiful” or “ugly” APIs, you have wandered into the tricky middle ground of barely copyrightable expression.

In my opinion, the most likely scenario is a case-by-case determination. A bunch of constants would probably not support copyright. The entirety of an exported API is more likely copyrightable, but the header files would probably only support a “thin” copyright, where even trivial changes would be enough to avoid infringement. No one really knows, though, because the law of copyright is changing day by day.

For header files in particular, the rationale for copyright protection is substantially weakened when there is a second compatible implementation of a library. In that case, the headers, as creative as they might be, would probably merge into the functional interface supported by many concrete implementations.

For example, take the C++ standard template library (STL). There are a number of different implementations of the STL, all copyrighted by different authors. The STL headers themselves, however, have become just functional descriptions of the underlying copyrighted implementations. All STL implementations must use identical function definitions in the header files, or they would not be source-compatible with each other.
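The relationship between one interface and several implementations can be sketched in a few lines. Python has no header files, so the sketch below uses an abstract base class as a rough analogue of a header; the class and function names are hypothetical and are not taken from the STL.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The 'header': names and signatures only, with no behavior."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...


class ListStack(Stack):
    """One implementation, backed by a built-in list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class LinkedStack(Stack):
    """A second, independently written implementation using linked pairs."""

    def __init__(self):
        self._top = None

    def push(self, item):
        self._top = (item, self._top)

    def pop(self):
        item, self._top = self._top
        return item


def reverse(stack, items):
    """Client code written only against the interface."""
    for item in items:
        stack.push(item)
    return [stack.pop() for _ in items]


print(reverse(ListStack(), [1, 2, 3]))
print(reverse(LinkedStack(), [1, 2, 3]))
```

The two implementations make different internal choices, but the client code cannot tell them apart. Both must expose exactly the same names and signatures, which is why the interface itself leaves so little room for individual expression.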

The issue of header files also arises in the context of GPL-licensed software. Chapter 12 discusses this issue in more detail.

Fixation

An idea must have a concrete expression to be protectable. The copyright statute is fairly literal in this regard, saying that copyright protects “original works of authorship fixed in a tangible medium of expression.” Fixation is the legal term used to refer to the necessity that an expression must be grounded in a physical object.

A work is fixed in a tangible medium of expression when it is written down, recorded, or otherwise made permanent enough that it can be perceived and reproduced by other people. It doesn’t matter where or in what medium the expression is fixed; as long as it is recorded somewhere, copyright protection applies.

For those wondering about the copyright notices at the end of ball games, game performances can be copyrighted because they are recorded at the studio at the same time they are transmitted over the air. If the performance were broadcast live and never recorded, no copyright protection would apply. Modulating the radio waves is not enough; the artifacts protected by copyright must be tangible and not transitory.

One open question has to do with the applicability of copyright law to copies in RAM. Some people argue that the contents of RAM are not permanent enough to allow for copyright protection to apply, but the current trend is toward allowing RAM copies to count as fixations for copyright purposes.

Because the fixation requirement applies to both the initial establishment of copyright and the later determination of when infringing “copies” are made, RAM copies have become increasingly important. As noted earlier, almost every operation in a computer results in a copy being made; these temporary RAM copies may be both copyright protected and infringing.

Originality

Because copyright is designed to provide incentives for the creation of new works, the law requires that the work be new. In patent law, this requirement is called novelty; in copyright law, it is called originality.

Therefore, originality/novelty is one principle common to both copyrights and patents, but its application in copyright serves only to underline the fundamental difference between these two branches of law.

In particular, patent law requires that an invention be absolutely novel. In theory, each new patent should describe a system or method that has never existed before in any form on Earth. Even though the patent system isn’t ironclad, and everybody admits that many non-novel patents make it through the patent office, the PTO at least expends effort to try to find any systems, methods, or machines that are essentially equivalent to your patent claims. It does not matter if the prior system was properly appreciated or understood by its creator; if an equivalent prior system existed, your invention cannot be patented.

On the other hand, copyright originality is all about the specific decisions that you made when you were fixing the work, i.e. moving the concept from an idea to an expression. The end result may be exactly the same as another work made by someone else, but if the work is the result of an original, creative decision process, then you have sufficient originality to claim copyright protection on the work.

Original copies

Although it does happen sometimes, it is usually difficult to come up with the exact same expression as another person. For that reason, duplicates or near-duplicates of someone else’s work are presumed to result from copying. However, if you can prove that you did not have access to the other work or that you went through a different decision process in creating your work, then your identical (or nearly identical) work will be considered original. This is essentially the strategy used in reverse engineering (discussed in Chapter 13).

Minimal originality

Although copyright requires that an original decision be made when expressing the work, it is hard to underestimate the amount of actual originality required before copyright kicks in. For example, I have a map of the world that was stitched together from different satellite passes. Deciding which satellite photos to use was the original decision that permitted the mapmaker to copyright the map.

Another example is the famous musical piece 4′33″. The composer, John Cage, made the single original decision not to play any music for the duration of the piece. This was enough to establish copyright in the piece.

An unusual epilogue to the story of 4′33″ came in July 2002. The descendants of John Cage sued composer Mike Batt for copyright infringement when he released the track “A Minute’s Silence” and credited “Batt/Cage.” Batt defended his originality, saying, “I certainly wasn’t quoting his silence. I claim my silence is original silence...it’s digital. [Cage’s silence] is only analog.” The case was settled a few months later; both parties refused to comment on whether Batt’s silence was original or not.

Compilations

Compilation copyrights are a special type of copyrighted work. They are created when a person collects or assembles “preexisting materials or...data...in such a way that the resulting work as a whole constitutes an original work of authorship.”

This is a tricky area because in some cases there is very little distinction between a collection of facts (not copyrightable) and a compilation (copyrightable). Compilations of facts, such as a phone book or a map, must contain evidence of some creative spark to qualify for copyright protection. For example, phone books are not considered original enough to merit copyright protection, even though they cost a lot of time and money to make. This is because they only contain collections of facts (not original in themselves) and their alphabetical organization is mechanical, not original. Originality requires creativity; alphabetization is not creative.
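One way to see why alphabetization counts as mechanical is to notice that it leaves the compiler no choices to make. In the minimal Python sketch below, two compilers start from the same facts in different orders and necessarily arrive at the same directory. The names and numbers are fictitious, and the function name is hypothetical.

```python
# Fictitious listings, used only for illustration.
listings = [
    ("Nguyen", "555-0134"),
    ("Abbott", "555-0101"),
    ("Morales", "555-0177"),
    ("Castillo", "555-0162"),
]


def compile_directory(entries):
    """Arrange the entries alphabetically by name, as a phone book would."""
    return sorted(entries)


# A second compiler receives the same facts in the opposite order.
reordered = list(reversed(listings))

first_directory = compile_directory(listings)
second_directory = compile_directory(reordered)

print(first_directory == second_directory)
```

Because the ordering rule fully determines the result, nothing in the finished arrangement reflects a decision made by either compiler.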

In practice, almost any work that is created by an author will meet the originality requirement. Unless the information is organized purely mechanically, such as by alphabetization, the selection of the individual facts is frequently enough to support a copyright claim.

Even when the arrangement of information is purely mechanical, there are things that can instill copyright. For example, map and phone book publishers insert a small amount of made-up information into the listings—non-existent addresses with fictitious names. This small amount of original creativity is enough to support a copyright on the whole work, even though the individual facts in the book are not by themselves protectable.

The compilation copyright is separate from any copyright protections that may apply to the individual works in the compilation. For example, a collection of the best of Shakespeare’s plays might be subject to copyright even though the individual plays would be old enough to have no protection; the selection and organization of the plays could constitute enough creativity to support a compilation copyright claim if the choosing was not purely mechanical.

This is the same principle that allows Linux distributors such as Red Hat to claim copyright protection on their installation disks. Even if Red Hat did not provide any of the packages on the disk directly, Red Hat would still have a copyright on the compilation of programs chosen for the installation disk.

Copyright protection of forms and databases

One current issue in copyright law is the protection of individual forms. The rule right now is that blank forms are not copyrightable. There may be individual bits of the form that could be protected, like a logo, but the basic form fields themselves are not eligible for copyright.

The “blank forms” rule is important to software developers because of the clear analogue between blank forms and all sorts of schemas. For example, a database schema is very similar to an electronic representation of a paper form. An XML schema is used similarly to describe the structure of other data. The ability to copyright bare schemas is an open issue that needs to be resolved.
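The analogy is easy to see in practice. The following minimal sketch uses Python’s built-in sqlite3 module to define a schema for a time card, one of the classic blank forms. The table and column names are hypothetical, invented for this illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema plays the role of the blank form: labeled fields, no entries.
conn.execute(
    """
    CREATE TABLE time_card (
        employee_name TEXT,
        work_date     TEXT,
        hours_worked  REAL
    )
    """
)

# The second column of each table_info row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(time_card)")]
row_count = conn.execute("SELECT COUNT(*) FROM time_card").fetchone()[0]

print(columns)
print(row_count)
```

The schema names the fields to be filled in but holds no rows. Like a blank time card, it is designed for recording information and does not in itself convey any.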

The blank forms doctrine comes from an old Supreme Court case called Baker v. Selden. As sometimes happens in these cases, the important aspect of that case was not the final determination of who won or lost, but a passing statement made by the court in the discussion of the case: “blank account-books are not the subject of copyright.” This statement was expanded over time into the current rule, “blank forms such as time cards, graph paper, account books, diaries, blank checks, scorecards, address books, report forms, order forms and the like, which are designed for recording information and do not in themselves convey information,” are not protectable by copyright.

However, the real-world application of this rule is much more complicated. Some forms have been denied copyright protection because of the blank forms rule. However, a number of court cases have found some protectable elements in forms, particularly in collections of forms. Instead of automatically denying copyright to blank forms, the court will determine whether the form is sufficiently original to qualify for a compilation copyright. If the court finds that the arrangement of headings and selection of sentences meets the originality requirement, the form will be copyrightable (and copyrighted).

Further muddling the issue, the Copyright Office also regularly grants copyright registrations for forms. As explained by the office (http://www.copyright.gov/), “there is no way to secure copyright protection for the idea or principle behind a blank form...[but] an original literary work...is subject to copyright registration even though it is published in conjunction with a blank form...not protected by copyright.” However, the copyright registration does not say, “the logo on this form is copyrighted.” Instead, it says that copyright applies to the form without specifically designating subparts that are protectable. As a result, the copyright status of any particular form is very unclear.

With regard to populated databases, copyright protection has been granted to collections that display original creative expression. For example, a database containing information about the best restaurants in a city can be protected by copyright because the particular selection of restaurants is guided by the subjective and original criterion of being the “best.” Therefore, the populated database can receive copyright protection even if all of the facts contained within the database (addresses, phone numbers, etc.) are individually unprotectable.

The Copyright Term

As with patents, the patchwork of laws affecting copyright has made it somewhat complicated to figure out the copyright status of some works. For this issue, there are three different laws to consider: the Copyright Act of 1909, the Copyright Act of 1976 (which went into force on January 1, 1978), and the Sonny Bono Copyright Term Extension Act of 1998. The duration of a copyright depends on when the copyrighted work was first created, when it was first published, and where the work was created. For our purposes, we will assume that all works were created in the United States.

Starting from the oldest works, anything that was registered or published before 1923 is in the public domain. These works can be used by anyone for any purpose; their use cannot be controlled, despite the fact that some publishers put copyright symbols on reproductions of public domain works.

If a work was published between 1923 and 1963, a two-term system was applied, with a renewal required 28 years after the initial registration to maintain the copyright. If the copyright owner did not apply for copyright renewal, the copyright expired and these works are now in the public domain. If the copyright owner did renew the copyright registration, these works had their terms automatically extended to 95 years. These works will enter into the public domain no sooner than 2018 (95 years from 1923).

If a work was published between 1964 and 1977, there is no renewal requirement. These works will automatically have a 95-year term and will enter the public domain no sooner than 2059.
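The rules for works published before 1978 reduce to a short calculation. The sketch below, in Python, encodes only the three cases just described and makes the same assumption as the text, namely that the work was created in the United States. The function name is hypothetical, and the sketch is a simplification rather than a complete statement of the law.

```python
from typing import Optional

TERM_YEARS = 95  # total term for pre-1978 works that are still protected


def term_end_year(publication_year: int, renewed: bool = False) -> Optional[int]:
    """Return the year the 95-year term runs out, or None if the work is
    already in the public domain under the rules described above."""
    if publication_year >= 1978:
        raise ValueError("works from 1978 onward follow different rules")
    if publication_year < 1923:
        return None                      # published before 1923
    if publication_year <= 1963 and not renewed:
        return None                      # 1923-1963, renewal never filed
    return publication_year + TERM_YEARS  # renewed, or 1964-1977 (automatic)


print(term_end_year(1923, renewed=True))
print(term_end_year(1964))
```

The two calls reproduce the years given in the text: 2018 for a renewed work from 1923, and 2059 for a work from 1964.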

The Copyright Act of 1976 established new, much longer copyright terms for all new works. For all works fixed on or after January 1, 1978, the copyright term extends for the author’s life plus an additional 70 years after the author’s death. If it is a joint work (a work made by more than one author), the term extends to 70 years after the last surviving author’s death. For works for hire (works created in the course of employment) or anonymous/pseudonymous works, the copyright extends for 95 years from publication or 120 years from creation, whichever is shorter.
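The terms for works fixed on or after January 1, 1978 can be sketched the same way. The function names below are hypothetical. The treatment of a work for hire that has never been published (only the 120-year creation-based limit applies) is an assumption of this sketch rather than something the text states.

```python
from typing import Optional


def authored_term_end(last_author_death_year: int) -> int:
    """Life of the author (or of the last surviving joint author) plus 70 years."""
    return last_author_death_year + 70


def work_for_hire_term_end(creation_year: int,
                           publication_year: Optional[int] = None) -> int:
    """95 years from publication or 120 years from creation, whichever is shorter.

    Also covers anonymous and pseudonymous works. Assumption: an unpublished
    work is limited only by the creation-based term.
    """
    by_creation = creation_year + 120
    if publication_year is None:
        return by_creation
    return min(publication_year + 95, by_creation)


print(authored_term_end(2000))
print(work_for_hire_term_end(1980, publication_year=1985))
print(work_for_hire_term_end(1980, publication_year=2010))
```

In the second call the publication-based limit (1985 + 95) is the shorter one. In the third, publication came so late that the creation-based limit (1980 + 120) controls.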

If a work was created before 1978, but not published, it was automatically given copyright protection under the Copyright Act of 1976. The copyright term for these works is computed based on the same life-plus-70-year or 95/120-year terms applied to works created after 1978.

The law further specifies that for unpublished works created before 1978, the copyright extended at least until December 31, 2002. For works that were formerly unpublished but were later published before January 1, 2003, the term of copyright will not expire before December 31, 2047.

Owning a Copyright

By default, authors own their copyrights. This may seem natural now, but it was actually a substantial change made in the 1976 law; under all previous iterations of the Copyright Act the copyright was owned by the publisher (remember, this was a publishing right). In most situations under the current law, you own the rights to your personal creative expression.

One important distinction is that owning a reproduction of someone’s expression (the result of the verb “copy”) doesn’t give any rights to the underlying creative expression (the noun “copy”). For example, if you own a CD of the Muppets’ Greatest Hits, your ownership is limited to the particular piece of plastic that you bought in the store. It so happens that the piece of plastic you own can be read in a certain way to reproduce the Muppets’ music, but your rights over the music are limited to reading the disc.

Unpublished Works

When the 1976 Copyright Act flipped the default from “works are not copyrighted unless they have been registered” to “works are copyrighted unless an exception applies,” it created a broad new category of copyrighted works: unpublished copyrighted works. Under previous Copyright Acts, there was no concept of copyright for an unpublished work, because copyright applied only to published works.

Unpublished copyrighted works are by far the most common category. If you doodle in school, you create an unpublished copyrighted work. If you take notes in a meeting, you create an unpublished copyrighted work. If you sing “Happy Birthday” for the video camera at a party, you create an unpublished copyrighted work. Just as the Turing machine proceeds through its computations leaving symbols on its tape, so too do we move through life leaving behind us a trail of minuscule copyrighted artifacts.

Joint Authorship

Although we have talked about personal expression so far, more than one person can be considered the author of a work. When several people work together to create a single work, the result may be a joint work. According to the Copyright Act, a joint work is “a work prepared by two or more authors with the intention that their contributions be merged into inseparable or interdependent parts of a unitary whole.” This is clearly applicable to many computer programs and other copyrightable works created by programmers.

The important aspect of a joint work is that all authors intend their contributions to be considered together. It doesn’t matter if one person contributed more to the work, or that the many authors contributed to different parts of the work. For example, musical groups usually create joint works when they work together on an album. It doesn’t matter that there may be more drum work in one song and more vocals in another. It also doesn’t matter if they are even in the studio at the same time. As long as all parties worked with the intention of creating the disc together, they are joint authors of the final work.

In a software context, the result of pair programming would usually be considered a joint work. This is in contrast to code written by one programmer and later patched by another. In that case, the original author did not have the intention to create a joint work, even though the second author did. The result of this second situation is called a derivative work, and will be discussed later in this chapter.

Joint authorship matters because joint authors get equal rights to the final work; each author can use or license the work without permission from any other author. This is in contrast to derivative works, where the original author has superior rights over the work and can control the distribution of the later work.

Works for Hire

One of the most commercially important classifications under copyright law is the category of works for hire. The majority of works in the commercial market contain at least some element that was created as a work for hire.

A work for hire is created when someone, either a person or a business, directs another person to create an original work in the course of employment or under the direction of the first person. For example, when you write code for your employer, you are creating a work for hire. In this situation, the author of the work is not the person who created the work, but instead the person who directed the creation of the work.

Think again about Monty Python and the Holy Grail. This movie is the result of many people working together, but it is not a joint work. Instead, it is owned by the producers of the film—the people who paid all the other workers to come together for the writing, filming, and editing that went into the final result. (As an aside, that is why the awards for “Best Picture” go to the producers of a film; they are the legal “authors” of the picture.)

The Copyright Act recognizes a work for hire in two specific situations:

The work is created by an employee acting within the scope of his or her employment.
The work is specially ordered or commissioned for use:
- as a contribution to a collective work
- as a part of a motion picture or other audiovisual work
- as a translation
- as a supplement to another work
- as a compilation
- as an instructional text
- as a test or as answer material for a test, or
- as an atlas

The first situation applies to ordinary employment; if you are an employee, anything you create in the ordinary course of your job is a work for hire. The only wrinkle is whether somebody is actually an “employee.” In general, if your boss determines your schedule, directs how you do your work, provides the equipment for your use, and pays taxes for employing you, you are an employee. Anything you create belongs to your employer.

If some or all of these don’t apply to you, however, you may be in the second situation. Here the law tries to protect authors by making it significantly more difficult to have something considered a work for hire: the work must be specially ordered or commissioned, and it must come within one of the categories listed above. Even then, a work will be a work for hire only if a written agreement says that the work will be a work for hire.
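The two-part test can be summarized as a short decision procedure. The Python sketch below follows the description above. The function name, parameter names, and category labels are hypothetical shorthand, and the real question of who counts as an employee is far less mechanical than a single boolean suggests.

```python
# Shorthand labels for the commissioned-work categories listed above.
COMMISSIONED_CATEGORIES = frozenset({
    "contribution to a collective work",
    "part of a motion picture or other audiovisual work",
    "translation",
    "supplement to another work",
    "compilation",
    "instructional text",
    "test or answer material for a test",
    "atlas",
})


def is_work_for_hire(employee_within_scope, specially_commissioned=False,
                     category=None, written_agreement=False):
    """Apply the two situations in order and report whether either is met."""
    # Situation 1: an employee acting within the scope of employment.
    if employee_within_scope:
        return True
    # Situation 2: all three conditions must hold together.
    return (specially_commissioned
            and category in COMMISSIONED_CATEGORIES
            and written_agreement)


print(is_work_for_hire(True))
print(is_work_for_hire(False, True, "translation", True))
print(is_work_for_hire(False, True, "translation", False))
print(is_work_for_hire(False, True, "sculpture", True))
```

Only the first two calls describe a work for hire. The third lacks a written agreement, and the fourth involves a kind of work that is not on the list, so the signed agreement alone does not help.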

Implied licenses and ownership of works for hire

Figuring out whether something is a work for hire is important because it determines who can use and control the work after it is done. When you hire a wedding photographer, for example, the default is that the photographer is the author of the work and retains the copyright even though you paid him (or her) to take the pictures. As a result, the photographer retains the film negatives (or the digital files, these days) and each copy of the pictures must be individually purchased.

This is an especially important issue for software companies. If a company hires non-employee independent contractors to create an essential piece of infrastructure, then, by default, those independent developers are the owners of the software. There is an implied license for the company to use a copy of the software, but the developers have the legal right to do what they like with any other copies, including selling them to a competitor. Further, the purchasing company may not have any rights to build on (or even fix bugs in) the software.

Further, the law doesn’t explicitly say that software can be created as a work for hire. Even if the software is specially ordered or commissioned, it doesn’t easily fit within any of the allowed categories. While it is unlikely that a court would decide that software couldn’t be considered a work for hire, especially if there was a written agreement signed by the independent developers, this issue has not been decided for certain.

Accordingly, most employee contracts and development agreements take a belt-and-suspenders approach: employees are required to sign agreements specifying that all their works are works for hire, and promising to transfer their copyrights to the employer if a court ever decides that their work product doesn’t count as a work for hire.

Copyright Formalities

Before the Copyright Act of 1976, publishers had to comply with certain requirements to have copyright protection applied to a work. They had to register new works, send copies of the work to the Library of Congress, and include the copyright symbol (©) and year of publication on each copy of the work. While it is still a good idea to include the copyright symbol and year on each copy, it is no longer necessary. It is necessary to formally register your copyrights only if you are going to sue someone for infringement.

The Rights Granted by Copyright

Copyright, like other forms of intellectual property, reserves to its owners certain exclusive rights. In the case of copyright, Section 106 of the Copyright Act grants the owners of copyrights the exclusive right to do (or allow someone else to do) the following:

Reproduce the copyrighted work
Prepare derivative works based upon the work
Distribute copies of the work to the public
Perform the copyrighted work publicly
Display the copyrighted work publicly

Reproducing a Work

The first and (now) most fundamental reserved right is the right to reproduce a copyrighted work. Some people believe that any private copying within a home is acceptable under copyright law. Under the strict terms of the statute, there is no provision for copying of any kind, private or not. This is why, for example, the RIAA (the Recording Industry Association of America) argues that putting MP3s on your iPod is a copyright violation. In their view, any copying must be explicitly authorized.

This argument does not hold up in many circumstances because of fair use (which we will discuss further below), but there is support for the RIAA’s position that any copying infringes its members’ copyrights.

You may occasionally see the terms “phonorecord” and “sound recording” applied to copyrighted works. A “phonorecord” is the physical object that embodies a work of authorship, like a CD, DVD, or hard drive. According to the Copyright Act, a “sound recording” is any work that results “from the fixation of a series of musical, spoken, or other sounds, but not including the sounds accompanying a motion picture or other audiovisual work.” Phonorecords usually (but not necessarily) contain sound recordings.

Preparing Derivative Works

One of the more important reserved rights under copyright law is making derivative works. According to the Copyright Act, a derivative work is:

...a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted.

The rule about derivative works applies when there is partial copying, transformation, or adaptation of a copyrighted source. For example, software development usually involves the creation of a long chain of derivative works; everything after the initial check-in of the code creates a new derivative work. Most parts of the codebase don’t change, but each new patch creates a new derivative work.

This copying doesn’t have to be literal. For example, the legal difficulties of fan fiction arise because it necessarily involves copying some of the names, characters, and situations from the original copyrighted works, even if the stories themselves are otherwise completely original.

Further muddying the waters are de minimis changes—changes that are so minimal that the changed work should be considered identical to the original work. De minimis changes can actually be quite extensive if they don’t involve any originality. For example, if a person republished a copy of a Harry Potter novel with all instances of “Harry Potter” replaced by “Parry Hotter,” the textual changes would be substantial, but the total effect would be a de minimis change.
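The mechanical nature of such a change is easy to see in code. The following is a minimal sketch in Python, using an invented sample passage (not text from any actual novel). A single search-and-replace alters every occurrence of the name, yet nothing original is contributed and the rest of the text is carried over verbatim.

```python
# Illustrative only: the sample passage is invented for this sketch.
original = (
    "Harry Potter opened the door. "
    "'Who is there?' asked Harry Potter. "
    "Nobody answered Harry Potter."
)

# One mechanical substitution rewrites every occurrence of the name.
changed = original.replace("Harry Potter", "Parry Hotter")

# Count how many places in the text were altered.
occurrences = original.count("Harry Potter")

# Split each version around the name to isolate everything else.
remainder_original = original.split("Harry Potter")
remainder_changed = changed.split("Parry Hotter")

print(f"{occurrences} names replaced")
print("Rest of text identical:", remainder_original == remainder_changed)
```

However many substitutions are made, the operation requires no creative choice, which is the sense in which the change is de minimis.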

The right to prepare derivative works should be understood as an anti-gaming provision of the copyright law. Before the law included this provision, a number of people tried to profit off other people’s works by selling derivative works that had just enough changes to evade the earlier, more literal copyright provisions. One good example of this was the creation of piano rolls from copyrighted musical works. Tired of this sort of gaming of the system, Congress included rights over derivative works in the 1976 Act.

Derivative works are difficult because they represent an amalgam of different and supposedly exclusive rights. A derivative work results from a transformation or adaptation of an original work, and that transformation is itself an original copyrightable (and copyrighted) contribution. Therefore, both the original author and the transformer have independent copyright interests in the work. The original author has an overriding right to veto the distribution of the derivative work, and any subsequent distributor must receive permission from both the original author and the transformative author.

The copyright complexity of open source software systems is in large part due to the rules surrounding derivative works. A large project like the Linux kernel has hundreds or thousands of authors. It is essentially impossible to figure out which patches were purely functional (and thus not copyrighted), which patches were de minimis (not affecting copyright status), and which patches were new and original (resulting in a derivative work). As a result, nobody really owns the Linux kernel; the best description of its status is that it is owned jointly by its developers.

One consequence of this fact is that it would be very difficult to move the kernel to a new license, even if Linus Torvalds or the bulk of developers decided they wanted to. This is not a problem with FSF software because the FSF has used a more disciplined approach to gathering copyrights. The centralization of copyrights is discussed further in Chapter 11 and Chapter 14.

Originality and derivative works

For a mixed work to qualify as a derivative work, the new portion must have enough originality to qualify as a copyrightable work itself; adding de minimis or purely functional expressions to a copyrighted work does not create a derivative work. Similarly, copying de minimis or purely functional expressions from an otherwise copyrighted work does not create a derivative work.

For example, copying most of a “Hello World” program into your own “Goodbye Cruel World” program would not make your “Goodbye” code a derivative work of the “Hello” code, even though there was substantial literal copying between the two programs. Especially in small codebases, there are only a few ways to express certain concepts, and the expressions may be constrained by the functional requirements of the program. Larger programs, of course, offer a wider variety of expression and so direct copying of code is more likely to create a derivative work.
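To make the “Hello”/“Goodbye” comparison concrete, here is a minimal sketch in Python. Both tiny programs are hypothetical and are stored as strings only so they can be compared line by line. Most lines are shared, but the shared lines are the ones the language and the task more or less dictate.

```python
# Two hypothetical programs, held as source text for comparison.
hello_source = """\
def main():
    print("Hello, World!")

main()
"""

goodbye_source = """\
def main():
    print("Goodbye, Cruel World!")

main()
"""

hello_lines = hello_source.splitlines()
goodbye_lines = goodbye_source.splitlines()

# Lines of the "Hello" program that also appear in the "Goodbye" program.
shared = [line for line in hello_lines if line in goodbye_lines]

print(f"{len(shared)} of {len(hello_lines)} lines are identical")
for line in shared:
    print(repr(line))
```

The identical lines are the function definition, a blank line, and the call to `main()`. These are functional scaffolding with few alternative ways of being written, which is why the overlap, though large in proportion, carries little weight.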

Non-literal copying is governed by the abstraction-filtration-comparison test. This test works by abstracting the structure of a program from the specific syntax used to express the program. Elements that cannot be copyrighted, such as purely functional or public domain algorithms, are filtered out, and the remaining original expressive elements are compared to the supposedly infringing work.

In theory, the abstraction-filtration-comparison test elegantly isolates copyrightable expression from the rest of the code. In practice, it usually provides highly idiosyncratic and unrepeatable results.
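As a very loose programming analogy for the three steps, consider the toy sketch below in Python. It is not how a court applies the test. The element names are invented, the “abstraction” step is assumed to have been done by hand, and deciding what counts as unprotectable is exactly the judgment call that makes real results unpredictable.

```python
# Abstraction (assumed done by hand): each program is described as a set
# of structural elements. All element names here are hypothetical.
original_elements = {
    "quicksort routine",
    "menu layout",
    "custom report format",
}
accused_elements = {
    "quicksort routine",
    "custom report format",
    "network sync module",
}

# Elements treated as unprotectable for this sketch, for example
# purely functional or public domain algorithms.
unprotectable = {"quicksort routine"}


def filter_elements(elements, unprotectable):
    """Filtration: drop elements that cannot be copyrighted."""
    return elements - unprotectable


def compare(protected, accused):
    """Comparison: find protected elements that also appear in the accused work."""
    return protected & accused


protected = filter_elements(original_elements, unprotectable)
overlap = compare(protected, accused_elements)

print("Protected elements:", sorted(protected))
print("Also found in accused work:", sorted(overlap))
```

In this sketch the shared quicksort routine drops out during filtration, so only the custom report format remains as a possible basis for infringement. Changing the contents of `unprotectable` changes the outcome, which mirrors how sensitive the real test is to the filtration step.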

Distributing a Work

One of the exclusive rights granted under the 1976 Act is the right to distribute (or control distribution of) a work. This was a broadening of the publishing rights associated with copyright. The 1909 Act granted the exclusive rights “to print, reprint, publish, copy, and vend the copyrighted work.” The 1976 Act went further by reserving to copyright holders the right to “distribute” the work in any fashion.

This subtle broadening of language has had a direct impact on the new world of peer-to-peer software. In an April 2008 decision, one court held that having a shared files folder on a computer, and thereby making files available for distribution, is sufficient to infringe the exclusive rights of distribution granted under copyright law. Thus, not only the transfer of a copy, but also an intent to transfer a copy or a displayed invitation to transfer one, can violate the exclusive right of distribution. The law is still unsettled, though. It remains to be seen whether other courts will decide similarly.

The first sale doctrine

One limitation on the exclusive right of distribution is that it only applies the first time a particular copy is sold. For example, think of a book. The copyright holder, probably the author, is able to dictate who gets to publish the book and how much the book costs to buy. Once a reader has bought a copy of the book, however, the copyright holder’s exclusive control over that copy is exhausted. The reader is then able to keep the book, give it away, resell the book, or bury it in his backyard. The first sale doctrine allows the development of used bookstores, libraries, and the sale of kitschy memorabilia on eBay.

Nevertheless, the first sale doctrine does not apply if there is anything less than a complete sale. Software companies usually do not sell copies of their software; they only license them (allow their use). These license terms can be (and usually are) more restrictive than the default rules of copyright in this regard. As a result, there isn’t a significant resale market in software like there is in books. For example, if you give a friend an install disk for Microsoft Windows, he may not be legally allowed to install it, even if you delete your copy and let him reregister with the same license number. Again, however, the law may be changing. A June 2008 decision stated that some shrinkwrapped software “licenses” are actually sales, subject to the first sale doctrine.

Performance or Display of a Work

The public performance and display rights allow a copyright holder to control when a work is performed publicly. A public performance occurs when the work is displayed or performed in a place open to the public or when the work is transmitted to multiple locations. For example, it would be a violation of this right to rent a movie and display it in a city park or stream the movie over the Internet without permission.

These rights generally apply to software because software is a literary work. For example, public performance of a video game without permission would probably violate this right, although the parameters of public performance of software have not been well established.

Fair Use

The primary limitation on copyright owners’ control of the use of copyrighted material is a principle called fair use. In general, fair use allows the copying, distribution, and use of copyrighted material, without permission, for transformative or important purposes. Courts created the doctrine of fair use in an effort to balance the rights of copyright holders with the rights of society at large. Courts recognized there was value in allowing some copying of copyrighted material, particularly for important functions such as teaching, scholarship, and political speech. Some of the principles around fair use were finally codified as part of the Copyright Act of 1976, which gave four principles for determining whether something was a fair use:

The purpose and character of the use, including whether such use is of commercial nature or is for non-profit educational purposes
The nature of the copyrighted work
The amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
The effect of the use upon the potential market for or value of the copyrighted work

Despite this seeming exactitude, it is difficult to say exactly what counts as a fair use. Both courts and Congress wanted the definition of fair use to be flexible enough to deal with new situations. The Copyright Act actually allows consideration of factors other than these, but these are the only four that are usually considered.

Most fair use is either commentary or parody. If you are commenting on a copyrighted work, for example, writing a book review, you can (in most cases) copy parts of the work so that those reading your review have the necessary context to understand it. Educational use—scholarship—is just a special case of commentary. Greater leeway is allowed in the use of copyrighted material when there is an educational or non-profit purpose.

Even in the case of a book review, however, there is no bright line test that indicates how much of a work you can copy. In one famous case, an excerpt of 200 words (out of 30,000) was not considered fair use because it revealed the essence of the entire book.

Courts have generally allowed much more substantial copying of copyrighted material in the case of parody. Parody is closely tied to the principle of free speech. We have traditions and laws that are designed to place a high value on the existence of many different kinds of expression. Courts have recognized that sometimes the most important discourse is the most cutting. As a result, they have sometimes allowed large amounts of copyrighted material to be incorporated into a parody…but not always.

The most important factor in fair use analysis is the fourth, the effect of the use upon the market (or potential market) for the original work. This factor is more important than all the others, and copyright holders can almost always make an argument that any particular use of copyrighted material can negatively affect the market, or again, a potential market, for the copyrighted work.

In one case, for example, a sculptor made a sculpture based upon a photograph from another artist. Even though the original photographer could not sculpt and therefore could not have created the sculpture, the court ruled that this was not a fair use because it negatively affected the market for authorized sculptures related to the photograph.

A Rule of Thumb

To make things simpler, the easiest way to reason about copyright is to assume that any use of a copyrightable work is legally reserved to the copyright owner. That is the power of defaults at work. The control granted by copyright isn’t quite that broad, but identifying specific uses as being outside of copyright can be difficult and tricky, and the law can change under you if your application pushes the boundaries of what is acceptable.

For example, you may be familiar with the Grokster case, MGM Studios, Inc. v. Grokster, Ltd. When the Grokster peer-to-peer network was created, the established rule in copyright law was that a technology sometimes used for copyright infringement would not be prohibited if it had substantial non-infringing uses as well (the “Sony” rule, named after Sony Corp. v. Universal City Studios). The owners of the Grokster network felt that they were safe, because the underlying peer-to-peer technology was used for legitimate content, swarm distribution of material, and dissident political expression—all substantial non-infringing uses.

In the Supreme Court’s decision in this case, the court created the new doctrine that inducing copyright infringement was prohibited under the same terms as copyright infringement itself. Because Grokster encouraged and derived revenue from the massive amounts of copyright infringement happening during use of its system, Grokster itself was liable and had to shut down.

...and a bit about legal interpretation

It is unfortunate that under the current copyright law, the most accurate predictions about prospective cases usually come from borrowing from the branch of academia known as legal realism. Legal realism is a cynical interpretive strategy that sees all law in terms of political power structures; the reasoning behind individual decisions is nothing more than window dressing for underlying political biases and power struggles.

Under a legal realist analysis, any use of copyrighted material that was objectionable or questionable would be struck down as infringing. Non-objectionable use of copyrighted material would be allowed only if the political and economic interests in support of the use were more powerful than the political and economic interests against the use. Unfortunately, this is, in my opinion, the best guide to the outcome of any future copyright case.

^[2]Evan James MacGillivray, A Treatise Upon the Law of Copyright: In the United Kingdom and the United States, J. Murray, 1902, at 287–288.
