Not too long ago, the Computer Game Developers Conference (GDC) was a shockingly small affair. Audio vendors like Roland, Mediavision, and Creative Labs were using fold-up tables with a banner. In 2008, videogames are a $19 billion industry, and last month's show took up the entire North and West halls of San Francisco's Moscone Center. More than 18,000 game designers, graphic artists, business people, and — most interesting for us — composers streamed in to learn about the latest techniques for creating interactive entertainment. And more than a dozen companies were showing off new audio technologies.
As hardware has advanced, developers and players alike are taking game audio more seriously. The sawtooth melodies and white-noise percussion of the past have been replaced by symphonic scores recorded by large orchestras — or sometimes by individual composer/performers with "laptop symphonies." Let's hear what the experts at GDC 2008 had to say about the state of audio in games, and then dive into some of the technology advancements that are changing the interactive audio landscape.
In my role as chairman of the Interactive Audio Special Interest Group (IASIG), I'm fortunate to have access to some of the best and brightest in game audio. IASIG started at GDC 13 years ago to solve problems that face game-audio professionals, and each year we return to collect more ideas and then launch working groups to implement them. The reverb design in the Xbox, for example, was the result of an IASIG working group's efforts.
But first, what is "interactive" audio? Put simply, unlike most music we listen to, interactive audio almost never sounds the same twice. The soundtrack — music and effects — changes based on the player's behavior in the game. That can go a long way toward keeping the experience fresh and exciting over the 40 hours or so a player might spend with a typical game.
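To make the idea concrete, here is a minimal sketch of one common approach, in which a score is split into layers that fade in as the action heats up. Everything in it is an assumption made for illustration: the layer names, the thresholds, and the single 0-to-1 "intensity" value that the game would supply. It does not describe how any particular game or tool mentioned in this article works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str             # hypothetical stem name
    fade_in_start: float  # intensity at which the layer becomes audible
    fade_in_end: float    # intensity at which the layer reaches full volume


# Hypothetical three-layer score; the thresholds are illustrative only.
LAYERS = [
    Layer("ambient_pad", 0.0, 0.0),  # always playing
    Layer("percussion", 0.3, 0.6),   # joins as the action builds
    Layer("brass", 0.7, 1.0),        # reserved for the biggest moments
]


def layer_gains(intensity: float) -> dict[str, float]:
    """Map a gameplay intensity in [0, 1] to a linear gain per layer."""
    intensity = min(1.0, max(0.0, intensity))  # clamp out-of-range input
    gains = {}
    for layer in LAYERS:
        span = layer.fade_in_end - layer.fade_in_start
        if span <= 0:
            # No fade range: the layer is simply on once its threshold is met.
            gain = 1.0 if intensity >= layer.fade_in_start else 0.0
        else:
            # Linear ramp from silence to full volume across the fade range.
            gain = min(1.0, max(0.0, (intensity - layer.fade_in_start) / span))
        gains[layer.name] = round(gain, 3)
    return gains
```

Calling `layer_gains` every time the game's intensity value changes would give each stem its own volume, so two play sessions that unfold differently would also sound different.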
Composer DS Wallace told me he sees game scores becoming much more artistic. "One positive change taking place is that in the past, programmers were the ones who controlled the music and audio," he explained. "It's not really their fault, but games can suffer without an artistic hand managing the music. Today, games like BioShock are breaking this cycle by really doing things outside of the box and having the music interact more with the player's actions."
Another well-known audio professional, Scott Gershin of Soundelux, spoke about raising the bar in sound design. He emphasized how important it is to understand timing, and how the rhythm and pace of scenes and the story affect the experience in its entirety. He said that a sound designer needs to provide contrast in sound just as one would do in picture by mixing high and low frequencies at the right time. "When a train goes by, I want to hear the subwoofers fire," he insisted.
Marty O'Donnell, who composed the music for the Halo games, is about as well known as it gets in this industry. During a GDC session called "Composer Tips and Tricks for Creating Interactive Music," O'Donnell urged composers to think like a composer and write the music first, and then think like a music editor while integrating the music into the game. He also offered a helpful tip for those wanting to analyze what he did in the Halo 3 soundtrack. By using the Xbox's ability to capture a "film" of a gameplay session, you can replay the same scene several times and compare how the music changes. In the live demo he played, the music sounded totally natural, stretching and flowing to enhance the character's actions.
It's worth noting how closely the views of these professionals (and others) at GDC align. They all point to the importance of having, or developing, both a micro and a macro sense of what's happening in the game, and of adjusting the audio to meet those needs. But while aesthetics are more important than ever, it's the steady advance in technology that enables composers to produce the lavish, modern soundtracks in today's games. Here are some of the audio advancements I saw this year at GDC.
The most striking realization for me was how far we've come. The first Sound Blaster with surround sound took off around 1998; then A3D and EAX added environmental effects; and ten years later, the big news is simply the continual refinement in quality rather than flashy new features. I did see some real innovations, but they were more like icing on the cake for the well-established, surround-capable games we enjoy today.
For those unfamiliar with the terminology, here's a quick roundup:
EAX — Technology developed by Creative Labs and used across the industry for creating acoustic environments using reverb. With EAX, sound designers have control over the reflective properties of sounds generated in game. For example, when you walk into a room and a monster makes a noise in the next room over, EAX properties can handle the occlusion and reflections of sound as it travels through virtual doorways, halls, and sewer pipes.
3D Positional Audio — The ability to define the player as a "listener," and all of the sounds in a game as "emitters" that revolve around the listener as the point of view changes. This is what makes surround sound interactive in games.
VOIP — Voice Over Internet Protocol systems give the player the ability to speak into a headset during gameplay and have other players hear him. This allows for a great level of cooperation among players that isn't possible without voice technology.
API — An API (Application Programming Interface) allows programmers to take advantage of special hardware features such as a vibration motor in a game controller. Each console manufacturer, including Sony, Nintendo, and Microsoft, has its own proprietary API for audio.
Middleware — An API that resides between the game and the console manufacturer's proprietary API. Writing to middleware allows sound designers, musicians, and programmers to create their audio once and have it work everywhere instead of writing for three or four different game platforms in parallel. In short, middleware can save a lot of work.
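As a rough illustration of the listener/emitter idea behind 3D positional audio, the sketch below works out how loud an emitter should be and where it should sit between the left and right speakers, given the listener's position and facing. It is a simplified two-dimensional model built on assumptions of my own (a basic inverse-distance rolloff and a single pan value); the function and parameter names are hypothetical and are not taken from EAX, OpenAL, or any console API.

```python
import math


def emitter_mix(listener_pos, listener_facing_deg, emitter_pos, ref_distance=1.0):
    """Return (gain, pan) for one emitter in a flat 2D world.

    gain is 1.0 at or inside ref_distance and falls off as 1/distance.
    pan runs from -1.0 (hard left) to +1.0 (hard right).
    Facing is measured in degrees counterclockwise from the +x axis.
    """
    dx = emitter_pos[0] - listener_pos[0]
    dy = emitter_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)

    # Simple inverse-distance rolloff, clamped so nearby sounds never exceed 1.0.
    gain = ref_distance / max(distance, ref_distance)
    if distance == 0:
        return gain, 0.0  # emitter is on top of the listener: keep it centered

    # Unit vector pointing to the listener's right, given the facing angle.
    facing = math.radians(listener_facing_deg)
    right = (math.sin(facing), -math.cos(facing))

    # Project the direction to the emitter onto the "right" vector.
    pan = (dx * right[0] + dy * right[1]) / distance
    return gain, pan
```

With the listener at the origin and an emitter two units to the east, facing north puts the sound hard right at half volume; turning to face east centers it, and turning south moves it hard left. The emitter never moved, only the point of view did.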
One of the biggest names in audio-development tools is Firelight Technologies' FMOD (www.fmod.org). FMOD's big announcement at GDC was that its API now has the ability to do adaptive (i.e., interactive) audio and music. This is something we've already seen from Microsoft in DirectMusic, and in several proprietary APIs within companies like EA and LucasArts. However, this is probably the easiest implementation I have seen, and it helps make FMOD more robust.
Creative Labs (www.creative.com) remains a giant in game audio, but unlike the days when a trip through the Creative booth could take half an hour or more, this year one could be in and out in less than five minutes. Creative continues to push OpenAL, which seems to be the dominant open-source audio API. Creative's technology sounds as good as it gets, and remains my favorite sound hardware for gaming. Unfortunately, nothing Creative showed at GDC compared with its launch of EAX and subsequent full backing of OpenAL.
Vivox (www.vivox.com) showed a new voice technology called Voice Fonts, with an amazing drop-down menu for selecting different modulations of one's voice. For example, if you want other players to hear your voice as an Orc's, simply choose that option from the menu, and everything you say sounds like you just hatched from a mud pit in Mordor. VOIP systems have been around for a while and in-game chat is nothing new, but being able to alter your voice to sound like a character in the game adds an incredible level of realism. The Vivox VisionStudio API is what makes all of this possible.
Founder and VP Monty Sharma gave me a detailed demonstration, turning my voice into that of an elf, and then into something akin to a pro wrestler, with the flick of a mouse. Luckily for me, Vivox hasn't integrated this marvelous voice technology into World of Warcraft yet. It is so much fun that I'd spend most of my waking hours rediscovering that game with an all-new level of audio interaction.
Audiokinetic (www.audiokinetic.com) showed that convergence is underway between home theater and gaming by introducing Wwise Motion. This middleware tool connects games made with the company's Wwise development platform to D-Box home-theater motion seating. The D-Box is a chair mounted on pistons that move it in sync with the action on screen. You may have experienced the vibrating seats in IMAX theaters, powered by Buttkicker sonic transducers. Well, D-Box kicks the Buttkicker's butt. Movie studios are now producing Blu-ray discs with D-Box motion codes, and with Audiokinetic's new tools, games can kick your D-Box chair in the butt as well. That certainly gives people one more reason to consider buying one, and advances the impact of audio in games another notch.
THX (www.THX.com) showcased its new partnership with Neural Audio, bringing full-blown, THX-certified, 7.1-channel surround sound to gaming. Although it sounded fantastic, the number of gamers, let alone home theaters, with 7.1 surround is still very small. If you've got the coin, THX continues to deliver the goods. Of course, you'll also need a Neural Audio-equipped AV receiver to decode the 7.1 surround, so those with 5.1 systems right now really have to want those extra two speakers.
Finally, the announcement I'm proudest of: the IASIG released the Interactive Extensible Music Format (iXMF) specification for public review, and Sony said it plans to adopt the new format in its forthcoming PlayStation development tools. This new format, which combines audio, MIDI, and scripting in a single file, was the work of dozens of industry professionals over five years. It's designed to address a number of goals that interactive composers and sound designers have.
As the world's first open game audio specification, iXMF should have a good shot at uniting the field and moving it forward. I encourage you to read the draft document (247KB PDF) and add your thoughts.
A few years ago at GDC, Xbox audio mastermind Brian Schmidt spoke before a packed room and announced that game audio was "basically done." Technologies have brought us to the point of diminishing returns on many aspects of sound generation, he said; all necessary basics have been covered. Although Brian's goal was to shake up the audience, in some senses he was right. We have discrete and virtual surround-sound options, multiple reverbs, full-range fidelity, digital pipelines, and more. And as you can see by the new arrivals at GDC 2008, we also have more subtle enhancements that collectively offer dramatic improvements to audio in games.
But as Brian concluded, there is plenty yet to be done.
In the meantime, if all of my favorite games end up using Vivox VisionStudio, OpenAL, THX/Neural 7.1, iXMF, adaptive music, and a D-Box chair, I will have some serious audio upgrades to do.