Chapter 1. The Nuts and Bolts of MP3

In April of 1999, the term “MP3” surpassed “sex” as the most-searched-on term at some of the Internet’s top search engines—a phenomenal achievement for a complicated digital music encoding algorithm devised over the course of a decade by a few scientists and audiophiles in an obscure German laboratory.

What is it about MP3 that inspires such unprecedented levels of enthusiasm? For some, it’s the prospect of being able to store vast quantities of music on a computer’s hard drive, and to shuffle and rearrange tracks from that collection at a moment’s notice. For others, it’s the promise of an entirely new model for the music universe, one that allows creative artists to publish their own work without the assistance of the established industry. But for millions of users, the thrill of MP3 is simpler than that: it’s the possibility of getting their hands on piles of high-quality music, free of charge.

In this chapter, we’ll get a bird’s-eye view of the format and the MP3 phenomenon: what it is, how it works, how to download and create MP3 files, and how to listen to them. Then we’ll take a look at some of the many issues surrounding MP3, including piracy, politics, digital rights, and the recording industry’s stance on the matter. Finally, we’ll examine the correlation between the MP3 and open source software movements, and find out why file-based digital music distribution is here to stay.

MP3 Basics

If you’re new to the MP3 game, you’ll want to know exactly what MP3 files are, where to get them, how they work, and how to make the most of a growing MP3 collection. As you read through this brief overview, keep in mind that these topics are covered in much greater detail elsewhere in this book.

What Is MP3?

Simply put, MP3 is an audio compression technique. Raw audio files, such as those extracted from an audio CD, are very large, consuming around 10 MB of storage space per minute. But MP3 files representing the same audio material may consume only 1 MB of space per minute while still retaining an acceptable level of quality. This drastic reduction in file size has made it feasible for music lovers to transfer songs over the Internet, to build enormous digital music collections on their hard drives, to play them back in any order at any time, and to move them between different types of playback hardware. These possibilities have far-reaching ramifications not just for music lovers, but for artists and the recording industry as well. We’ll explore the political and philosophical issues raised by MP3 in the second part of this chapter.
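The 10 MB figure follows directly from the parameters of CD audio. Here is a quick check, assuming the standard CD format of 44,100 samples per second, 16 bits per sample, and two channels (values this section does not state):

```python
# Rough check of the "10 MB per minute" figure for raw CD audio.
# Assumed CD parameters (not given in the text): 44.1 kHz, 16-bit, stereo.
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2

def raw_bytes_per_minute(rate=SAMPLE_RATE_HZ, width=BYTES_PER_SAMPLE,
                         channels=CHANNELS):
    """Bytes of uncompressed PCM audio produced in one minute."""
    return rate * width * channels * 60

raw_mb = raw_bytes_per_minute() / 1_000_000
mp3_mb = 1.0  # the text's figure of about 1 MB per minute for MP3
print(f"raw: {raw_mb:.1f} MB/min, ratio to MP3: about {raw_mb / mp3_mb:.0f}:1")
```

Under those assumptions, the raw data rate lands just above 10 MB per minute, in line with the rule of thumb quoted above.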

Why the term “MP3”?

“MP3” is the quick way of referring to an encoding algorithm called “MPEG-1, Layer III,” developed primarily by a German technology group called Fraunhofer and Thomson and now officially codified by the International Organization for Standardization, or ISO. The name, of course, corresponds to the extension found on MP3 files: After_the_Goldrush.mp3, for example. More on Fraunhofer and Co. can be found in Section 1.1.3 later in this chapter.

Small is beautiful: How MP3 works

Raw audio does not compress well via traditional techniques: if you try to zip up a WAV file, for instance, you’ll find that the resulting archive is only marginally smaller than the uncompressed original.
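You can see this effect without a real recording. The sketch below is only an approximation: it builds a synthetic signal (a 440 Hz tone plus random noise, chosen here as a crude stand-in for music) and runs it through Python’s zlib module, which uses the same deflate method as zip. The signal parameters are assumptions made for illustration.

```python
import math
import random
import struct
import zlib

def synthetic_pcm(seconds=2, rate=44_100, seed=1):
    """16-bit mono PCM: a 440 Hz tone plus noise, a crude stand-in for music."""
    rng = random.Random(seed)
    samples = []
    for n in range(seconds * rate):
        tone = 10_000 * math.sin(2 * math.pi * 440 * n / rate)
        noise = rng.gauss(0, 500)
        # Clamp to the valid 16-bit range before packing.
        samples.append(int(max(-32768, min(32767, tone + noise))))
    return struct.pack(f"<{len(samples)}h", *samples)

pcm = synthetic_pcm()
packed = zlib.compress(pcm, 9)          # lossless, zip-style compression
ratio = len(packed) / len(pcm)
print(f"{len(pcm)} bytes -> {len(packed)} bytes ({ratio:.0%} of original)")
```

Because the low-order bits of each sample behave like noise, a redundancy-based compressor finds little to remove, and the output stays close to the input size.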

MP3 takes a different tack on the compression problem. Rather than just seeking out redundancies like zip does, MP3 provides a means of analyzing patterns in an audio stream and comparing them to models of human hearing and perception. Also unlike zip compression, MP3 actually discards huge amounts of information, preserving only the data absolutely necessary to reproduce an intelligible signal. The amount of data preserved is configurable by the person doing the compressing, so an optimal balance between file size and quality can be achieved. The tool or software used to achieve the compression is called an “encoder,” while the playback software is called a “decoder” or, more simply, an “MP3 player.”

By running uncompressed audio files through an MP3 encoder, files can shrink to around one-tenth of their original size, while still retaining most of their quality. By compressing a little less (to around one-eighth of the original size), MP3 quality can be virtually indistinguishable from that of the original source material. As a result, a three-minute song can be transformed into a 3 MB file, which is something most people can find room for on their hard drives, and that most web surfers can download in a reasonable time frame. By the same arithmetic, a 640 MB compact disc stuffed full of MP3 files rather than uncompressed audio can store around 10 to 11 hours’ worth of music. And since DVDs store around eight times as much as compact discs, a recordable DVD could hold roughly three and a half days’ worth of continuous music on a single 5” platter.
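The capacity figures can be checked with the chapter’s own rule of thumb of about 1 MB of MP3 data per minute. The constants below come from the text; the function name is just an illustrative choice.

```python
MB_PER_MINUTE_MP3 = 1.0   # the text's approximate figure
CD_CAPACITY_MB = 640
DVD_TO_CD_RATIO = 8       # "around eight times", per the text

def hours_of_music(capacity_mb, mb_per_minute=MB_PER_MINUTE_MP3):
    """Hours of MP3 audio that fit in a given capacity."""
    return capacity_mb / mb_per_minute / 60

cd_hours = hours_of_music(CD_CAPACITY_MB)
dvd_hours = hours_of_music(CD_CAPACITY_MB * DVD_TO_CD_RATIO)
print(f"CD:  {cd_hours:.1f} hours")
print(f"DVD: {dvd_hours:.1f} hours ({dvd_hours / 24:.1f} days)")
```

These are estimates only: real capacity depends on the bitrate chosen at encoding time and on the exact size of the disc.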

The mechanics of the MP3 codec and perceptual encoding principles can be found in Chapter 2.

Working with MP3 Files

If you know how to download files from the Internet, have a grasp of basic file management concepts, and aren’t afraid to experiment with new applications, you can probably get started on your own MP3 collection without much coaching. However, there are a lot of options and considerations to take into account, including the quality and efficiency of MP3 encoders and players, advanced features and functions, techniques used for organizing and customizing large MP3 collections, and so on. We’ve dedicated all of Chapter 3 and Chapter 4 to these topics. For now, here’s a brief tour of the basics.

Downloading MP3s

In order to start playing MP3 files, you’ll need to get your hands on some, of course. There are two ways to do this: You can either download MP3s that other people have created, or you can create them from the music you already own.

Warning

Before you start downloading MP3s, you should know that the vast majority of files available out there are distributed illegally. Many people encode music they legally own, and then make it available on the Internet to people who do not own that music, which is illegal (see Chapter 7 for more information). Whether you choose to download pirated music is a moral choice that only you can make. The wide availability of pirated music, however, should not stop you from seeking out legal MP3s. While there are far fewer of these available, you’ll be surprised by the quality of the gems you’ll find hiding out in the haystacks. A great place to find legal MP3s is MP3.com, though that site is certainly not the only source of legitimate files. If you use a commercial MP3 tool like RealJukebox (Chapter 3), you’ll probably find a button or link in the interface that will take you directly to an MP3 download site.

Finding MP3 files

While most users start out by simply typing “MP3” into their favorite search engine, that probably isn’t the most efficient way of going about things. You might want to start instead at a major site dedicated to indexing or distributing MP3 files, such as http://www.mp3.lycos.com, http://www.listen.com, http://www.scour.net, or http://www.rioport.com. Search engines can, however, be very useful for finding smaller sites run by individuals, but be prepared to encounter lots of broken links and unresponsive sites. Because many user-run sites are quickly shut down by Internet Service Providers (ISPs) under pressure from record labels, search engines often index links to sites that no longer exist.

The Web isn’t the only way to find MP3 files—you’ll also find plenty of files on FTP servers, in binary Usenet groups, and in IRC channels. Details on using these venues for MP3 downloading can be found in Chapter 3 and Chapter 4.

Note

Users looking to swap MP3 files easily with music fans all over the world may want to check out Napster (http://www.napster.com), which is a sort of combined IRC, FTP, and search client with a twist. Rather than searching the Web, you’ll be searching the hard drives of other Napster users for songs you like. Since you’ll only see files on the systems of people currently using the service, you won’t have to worry about broken links and downed servers. Log in to the Napster server, register your collection with a specific genre, and you’ll be able to search for files on other people’s systems by song name or artist. Find a song or songs you like and transfer them to your hard drive, while other people do the same with your music collection. Meanwhile, you can chat with other music lovers in the background as your transfer proceeds. Great idea, but the potential for copyright abuse inherent in this product is extreme, and none of the music we found during testing was legitimate. Nevertheless, Napster has single-handedly ushered in a whole new era of user-to-user file sharing, and has the music industry more worried than ever.

Creating your own MP3 files

Creating your own MP3s is only slightly more difficult than downloading them, but the payoff is worth it. You know for a fact that the music in your collection is the music you like, you can personally control the quality of the encodings, and you don’t have to worry about whether any of your tracks are illegal.

Encoding tracks from your CD collection is a two-step process. First, bits from an audio CD must be transferred to your system as uncompressed audio, typically as a WAV file. This extraction process is known as ripping. The uncompressed audio is then run through an MP3 encoder to create an MP3 file. However, there are dozens of tools available that take care of all the hard work behind the scenes, ripping and encoding transparently in a single step. You’ll meet a handful of ripper/encoder combination tools in Chapter 5.
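The second step can be scripted. The sketch below assumes that ripping has already produced a WAV file (track01.wav is a hypothetical filename) and that the open source LAME command-line encoder is installed; LAME is not named in this section and is used here only as one example of an encoder. Its -b option sets the bitrate in kilobits per second.

```python
import shutil
import subprocess

def lame_command(wav_path, mp3_path, kbps=128):
    """Build a command line for the LAME encoder (bitrate given in kbps)."""
    return ["lame", "-b", str(kbps), wav_path, mp3_path]

def encode(wav_path, mp3_path, kbps=128):
    """Run the encoder if it is installed; return True when encoding ran."""
    if shutil.which("lame") is None:
        return False  # encoder not available on this system
    subprocess.run(lame_command(wav_path, mp3_path, kbps), check=True)
    return True

# Show the command that would be run for a hypothetical ripped track.
print(" ".join(lame_command("track01.wav", "track01.mp3")))
```

Combination tools do essentially this behind the scenes, looping over every track on the disc.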

Playback basics

Think of an MP3 file like any other document you might store on your computer and open in an application. You can open a document by using an application’s File Open menu, by double-clicking a document icon, or by dragging a document onto the application’s icon. MP3 files are no different, and can typically be played in any of these ways. There are hundreds of MP3 players available for virtually all operating systems, and all of them are capable of playing all MP3 files. As a user, you have tons of options when it comes to picking your tools. In Chapter 3, you’ll meet some of the most popular MP3 players available for Windows, MacOS, Linux, and BeOS, and be introduced to the fundamental principles of MP3 playback.

Playlists

One of the most liberating aspects of working with file-based music (as opposed to music stored on media such as CDs, tapes, or LPs) is the fact that you suddenly gain the ability to organize, randomize, and mix the tunes in your music collection in an infinitude of ways. If you’ve ever created custom mixed-music cassette tapes, you know how fun—and how time consuming—this can be. MP3 playlists let you enjoy the fun part while skipping the time-consuming part.

The vast majority of MP3 players include a “playlist” window or editor, into which you can drag any random collection of tracks. Any playlist can be saved for posterity, to be played again at a later date. A playlist can be as short as a single song or as long as your entire collection (some people have playlists referencing months of nonrepeating music). A playlist can reference all the music in a folder or an entire directory structure, or can be composed by querying your system for all songs matching certain criteria. For example, you can create playlists of all country music written prior to 1965, or all of your acid jazz tracks, or all of your schmaltzy disco. Playlist creation and manipulation are covered in detail in Chapter 4.

Note

Playlists are simple text files listing references to the actual locations of MP3 files on your system or on a network. As such, they consume almost no disk space. Because playlists reference songs on your system, it is usually not useful to trade them with other users. There are, however, playlists composed only of URLs to MP3 files on the Internet, and these will, of course, work on anyone’s system.
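To make this concrete, here is a minimal sketch of writing and reading such a file. The .m3u extension and the convention of ignoring lines that start with “#” are borrowed from the common M3U playlist format, which this note does not name; the paths and URL are hypothetical.

```python
from pathlib import Path
import tempfile

def write_playlist(path, entries):
    """Write one file path or URL per line."""
    Path(path).write_text("\n".join(entries) + "\n", encoding="utf-8")

def read_playlist(path):
    """Return entries, skipping blank lines and '#' comment lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("#")]

tracks = [
    "/music/neil_young/After_the_Goldrush.mp3",  # local path: only valid on this system
    "http://example.com/streams/demo.mp3",       # URL: usable from any system
]
with tempfile.TemporaryDirectory() as tmp:
    playlist = Path(tmp) / "favorites.m3u"
    write_playlist(playlist, tracks)
    size = playlist.stat().st_size
    loaded = read_playlist(playlist)
print(f"{len(loaded)} entries, {size} bytes on disk")
```

The playlist holds only references, so its size is a few dozen bytes per track no matter how large the songs themselves are.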

ID3 tags

MP3 files are capable of storing a certain amount of “meta-data”—extra information about each file—inside the file itself. Data on track title, artist, album, year, genre, and your personal comments on the track can all be stored in an MP3 file’s ID3 tags. These tags will be inserted automatically by most tools as you rip and encode, or can be added or edited later on, often directly through your MP3 player’s interface. ID3 tags become more important as your collection grows, especially when you start using database-oriented MP3 organizers, as described in Chapter 4.
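As an illustration of how such a tag can live inside the file, the sketch below assumes the original ID3v1 layout: a fixed 128-byte block at the very end of the file, beginning with the letters TAG. Newer ID3v2 tags are stored differently and are not handled here. The “file” is a stand-in built in memory, not real MP3 data.

```python
ID3V1_SIZE = 128

def make_id3v1(title, artist, album, year, comment="", genre=255):
    """Build a 128-byte ID3v1 tag; text fields are padded with NUL bytes."""
    def field(text, width):
        return text.encode("latin-1")[:width].ljust(width, b"\x00")
    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))

def read_id3v1(data):
    """Parse the ID3v1 tag from the last 128 bytes of a file, or return None."""
    tag = data[-ID3V1_SIZE:]
    if len(tag) < ID3V1_SIZE or not tag.startswith(b"TAG"):
        return None
    def text(start, end):
        return tag[start:end].split(b"\x00")[0].decode("latin-1").strip()
    return {
        "title": text(3, 33), "artist": text(33, 63), "album": text(63, 93),
        "year": text(93, 97), "comment": text(97, 127), "genre": tag[127],
    }

# Stand-in for an MP3 file: placeholder audio bytes followed by a tag.
fake_mp3 = b"\x00" * 1000 + make_id3v1("After the Gold Rush", "Neil Young",
                                       "After the Gold Rush", "1970")
print(read_id3v1(fake_mp3))
```

Because the fields have fixed widths, long titles are truncated at 30 characters, which is one reason later tag formats were introduced.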

Internet radio

Some people have neither the time nor the inclination to create and manage a huge MP3 collection. Fortunately, they don’t have to. Thanks to the rise of outfits like SHOUTcast (http://www.shoutcast.com) and icecast (http://www.icecast.org), thousands of users are streaming MP3 audio from their computers to the Internet at large, running live broadcasts much like a radio station. There are several key differences between MP3 downloads and MP3 streaming:

  • MP3 broadcasts aren’t saved to the listener’s hard disk, unlike MP3 downloads. When you tune in to a broadcast, the only thing that’s saved to disk is a tiny text file containing some meta-data about the broadcast in question, including the server’s address and a playlist. This file is passed to the MP3 player, which in turn receives and handles (buffers) the ongoing broadcast.

  • Broadcasts are synchronous, while downloads are asynchronous. In other words, when you tune in to a broadcast, you hear exactly what’s being played from a given server at that moment in time, just like the radio. When you download a file, you get to listen to it any time you want.

  • Because of bandwidth constraints on most listeners, broadcasts are typically of a lower fidelity than MP3 downloads. MP3 broadcast servers usually send out MP3s that have either been down-sampled to a lower frequency, encoded at a lower-than-normal bitrate, or sent as a mono rather than stereo stream.
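The bandwidth constraint in the last point is simple arithmetic: a stream plays smoothly only if its bitrate fits within the listener’s connection speed. The sketch below assumes a 56k dial-up modem and an 80 percent safety margin; both numbers are illustrative assumptions, not figures from this chapter.

```python
def fits_connection(stream_kbps, link_kbps, headroom=0.8):
    """True if the stream's bitrate fits within a fraction of the link speed.

    headroom is an assumed safety margin for protocol overhead and jitter.
    """
    return stream_kbps <= link_kbps * headroom

MODEM_KBPS = 56  # assumed dial-up connection, for illustration
for kbps in (128, 56, 32, 24):
    verdict = "ok" if fits_connection(kbps, MODEM_KBPS) else "too fast"
    print(f"{kbps:>3} kbps stream over a {MODEM_KBPS}k modem: {verdict}")
```

This is why a broadcaster aiming at modem users would pick a low bitrate, a reduced sampling rate, or a mono stream.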

Full details on tuning in to MP3 broadcasts can be found in Chapter 4. The process of running your own Internet radio station is described in Chapter 8.

Beyond the computer

While you’ll almost certainly create all of your MP3 files on your computer, and will most likely begin your MP3 explorations by playing them back through your computer as well, part of the magic of file-based digital audio is the flexibility. There’s no reason an MP3 file can’t be transferred to any device that includes a storage and playback mechanism. And sure enough, a whole new class of devices has arisen to meet this need: portable units similar to the classic Sony Walkman but geared for MP3 playback, rather than tape or CD, are becoming hugely popular. Meanwhile, we’re beginning to see the emergence of a whole new range of home stereo MP3 components, capable of storing gigabytes of digital audio and being operated just like any other home stereo component. Of course, the technology is being applied to car stereos as well. Even hand-held computers such as the Handspring Visor are gaining MP3 playback capabilities.

Users with some technical know-how and a soldering iron are hacking out techniques for building MP3 playback hardware of their own, free from SDMI and other security mechanisms (see Chapter 7 for more about MP3 security and legal issues) that ultimately limit the functionality of commercial MP3 hardware. Chapter 6 includes a comparative analysis of MP3 portables, takes an early look at a few MP3 home stereo components, and introduces the concepts of building your own MP3 hardware from scratch.

About the Codec

So, what exactly is MPEG audio compression, and MP3 specifically? Technically, that’s a bit of a long story, so we’ll go into great detail on that in Chapter 2. You don’t need to know how MP3 works in order to start playing with it, but to shed a little light on the subject now, MPEG audio compression is a “psychoacoustic” technique that exploits various limitations in both the human ear and the mind’s ability to process certain kinds of sounds at very high resolutions. MPEG encoders store “maps” of human auditory perception in a table, and compare an incoming bitstream to those maps. The person doing the encoding gets to specify how many bits per second will be allocated to storing the final product. Taking note of that restriction, the encoder does its best to strip away as much data as possible (within the specified data storage limitation, or “bitrate”) while still retaining the maximum possible audio quality. The more bits per second the user allows, the better described the final output will be, and the larger the resulting file. With fewer bits per second, the user will get a smaller file (better compression), and a corresponding decrease in audio quality. Again, we’ll go into the process in greater detail in Chapter 2.
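The relationship between bitrate and file size is straightforward for a constant-bitrate encoding: size is bits per second times duration, divided by eight. The bitrates listed below are common choices picked for illustration, not values taken from this section.

```python
def mp3_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bitrate MP3, ignoring tags and headers."""
    return bitrate_kbps * 1000 * seconds // 8

SONG_SECONDS = 180  # a three-minute song
for kbps in (64, 128, 192, 320):
    size_mb = mp3_size_bytes(kbps, SONG_SECONDS) / 1_000_000
    print(f"{kbps:>3} kbps -> {size_mb:.2f} MB")
```

At 128 kbps the result is just under 3 MB, which matches the three-minute, 3 MB rule of thumb used earlier in the chapter.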

The MPEG family

MPEG is not a single standard, but rather a “family” of standards defined by the Moving Picture Experts Group, which was formed in 1988 to arrive at a single compression format for digital audio and avoid a standards war between various competing technologies. All of the MPEG standards are used for the coding of audio-visual data into compressed formats.

Note

Coding in this sense of the word refers to the process of running a stream of bits through an algorithm, or set of rules. Encoding is the process of taking an uncompressed bitstream and running it through the algorithm to generate a compressed bitstream or file. Decoding is, naturally, the opposite—taking a compressed bitstream and turning it into an uncompressed file or an audible signal. The term codec is short for compressor/decompressor,[1] and refers to any algorithm capable of performing this bidirectional function.
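The encode/decode pairing can be shown with a deliberately crude toy codec. To be clear, this is not how MP3 works: it simply throws away the low 8 bits of every 16-bit sample, with no model of human hearing at all. It is included only to show the two halves of a codec and the fact that a lossy round trip does not return the original data.

```python
import struct

def encode(pcm16):
    """Toy lossy encoder: keep only the top 8 bits of each 16-bit sample."""
    samples = struct.unpack(f"<{len(pcm16) // 2}h", pcm16)
    return struct.pack(f"{len(samples)}b", *(s >> 8 for s in samples))

def decode(coded):
    """Toy decoder: scale each 8-bit value back up to the 16-bit range."""
    values = struct.unpack(f"{len(coded)}b", coded)
    return struct.pack(f"<{len(values)}h", *(v << 8 for v in values))

original = struct.pack("<4h", 1000, -1000, 32767, -32768)
coded = encode(original)
restored = struct.unpack("<4h", decode(coded))
print(len(original), "bytes ->", len(coded), "bytes; restored:", restored)
# prints: 8 bytes -> 4 bytes; restored: (768, -1024, 32512, -32768)
```

The compressed stream is half the size, and the restored samples are close to, but not equal to, the originals. A perceptual codec aims to make a far larger reduction while keeping the differences inaudible.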

The MPEG family is broken down into major classes (MPEG-1, MPEG-2, MPEG-4), which are further broken down into sub-classifications called layers. Each major class and layer is optimized for specific real-world applications, such as compressed movie soundtracks, broadcast, or file-based musical coding. Each successive layer is more complex than the preceding layer. For example, a layer III decoder will be 2.5 times more complex than a layer I decoder. The MPEG “layers” are described in sub-documents of each class, with audio coding schemes described in a document labeled “ISO/IEC 11172-3.” The MPEG coding technique that interests us in this book is MPEG-1/MPEG-2 Layer III, referred to throughout this book simply as “MP3.”

Note

Technically, MPEG-1 Layer III and MPEG-2 Layer III are both referred to as MP3, as are the rather obscure MPEG 2.5 extensions. MPEG-1 Layer III is used for 32, 44.1, and 48 kHz sampling rates, while MPEG-2 Layer III is for 16, 22.05, and 24 kHz sampling rates. The MPEG 2.5 extensions allow for 8, 11.025, and 12 kHz. MP3 players can play any of these, and the specs are very similar.[2] The vast majority of files you’ll encounter in the wild are simple MPEG-1 Layer III.

Do not confuse MPEG-1 Layer III (MP3) with MPEG-3—there is no such animal. There was once an MPEG-3 classification in development, which was intended to address high-quality video. However, MPEG-2 was shown to deliver sufficiently high quality, so MPEG-3 was conjoined with the existing MPEG-2 specification. The spec now skips from MPEG-2 to MPEG-4.
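The split of sampling rates across the Layer III variants described in this note can be expressed as a small lookup table. The MPEG-1 and MPEG-2 rows follow the note; the MPEG 2.5 row uses the rates commonly documented for that extension (8, 11.025, and 12 kHz), which is an assumption that goes slightly beyond the text. The function name is hypothetical.

```python
# Sampling rates (Hz) per Layer III variant. The 2.5 row is an assumption
# based on commonly documented values for that unofficial extension.
LAYER3_RATES = {
    "MPEG-1": (32_000, 44_100, 48_000),
    "MPEG-2": (16_000, 22_050, 24_000),
    "MPEG-2.5": (8_000, 11_025, 12_000),
}

def mpeg_version_for(rate_hz):
    """Return the Layer III variant that covers a sampling rate, or None."""
    for version, rates in LAYER3_RATES.items():
        if rate_hz in rates:
            return version
    return None

for rate in (44_100, 22_050, 11_025, 96_000):
    print(rate, "->", mpeg_version_for(rate))
```

A file ripped from CD at 44.1 kHz therefore falls under MPEG-1 Layer III, which is consistent with that variant being the one most often found in the wild.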

The MP3 patent

The fact that the MP3 spec is maintained by the MPEG Working Group doesn’t mean they invented the technology. The working group merely codifies standards to guarantee interoperability between various applications, operating systems, and implementations. One of the very first tasks of the working group was to circumscribe the conditions of the ownership of intellectual property under the umbrella of international standards. Their conclusion was that patented technologies are allowed to be codified as standards, but that those patents must be fairly and equitably licensable to all comers, so that no single company could gain a monopoly on a specific audio/video compression technology.

The MP3 codec itself was devised by the Fraunhofer Institute of Germany and Thomson Multimedia SA of France (referred to throughout this book simply as “Fraunhofer”), who originally published the standard in 1993.[3] Fraunhofer and Co. own the intellectual property rights to any technology capable of creating “an MP3-compliant bitstream.” While Fraunhofer publishes low-grade sample code that can be used as a basis for more sophisticated MP3 coding tools, Fraunhofer still requires developers of MP3 encoders to pay hefty licensing fees (full details on that can be found in Chapter 5).

To learn more about the MPEG working group and MPEG specifications in general, there is no better starting point than http://www.mpeg.org. To learn more about Fraunhofer and MP3 licensing issues, see http://www.iis.fhg.de. The official web site of the MPEG Consortium is http://drogo.cselt.stet.it/mpeg/.



[1] In some circles, the term stands for enCOder/DECoder, though this interpretation has lost favor to compressor/decompressor.

[2] MPEG-2 also allows for multichannel extensions of up to five channels, though few people have ever actually seen this in action. Multichannel efforts are concentrated on MPEG-4, covered in Chapter 9.

[3] Fraunhofer did not work alone; other companies and organizations (notably AT&T) contributed to the development of the encoder as well.
