Chapter 1. HTML5 Audio and Video Elements: By-Default

The media elements, as the HTML5 audio and video elements are generically termed, are a way of embedding playable media files directly into a web page without having to use Flash or a plug-in. The elements can be styled with CSS, integrated with SVG and Canvas, and controlled with JavaScript.

Browsers and other user agents that implement the HTML5 media elements also provide default controls and behavior for each. In this chapter, I cover how to add HTML5 video and audio elements to your web page, and explore some of the implementation differences among the browsers. I also cover the more widely supported media file codecs and containers, and browser support for each.

Support for the media elements is relatively broad, though not all features of the media elements are supported in all browsers. Table 1-1 provides a listing of popular browsers and mobile environments, and the version of each that provides at least a minimum of support for the media elements.

Table 1-1. Support for HTML5 audio and video, by popular browser and mobile OS

| User Agent | Version |
| --- | --- |
| Internet Explorer | 9+ |
| Google Chrome | 3+ |
| Firefox | 3.5+ |
| Opera | 10.5+ |
| Opera Mini | 11+ |
| Safari | 3.1+ |
| iOS | 3.0+ |
| Android OS | 2.0+ |

Adding a Media Element to a Web Page

The HTML5 media elements share a common syntax and subgroup of attributes. The only difference between the two elements is the content they manage, and a small group of additional attributes for the video element.

Minimal Element Syntax

The simplest syntax to add a media element to the web page is demonstrated in Example 1-1. In the HTML, an audio element is used to embed an audio file encoded as Ogg Vorbis into the web page. The URL for the audio file is given in the audio element’s src attribute. The element’s style and behavior will be the default defined in the HTML5 specification and implemented by the browser.

Example 1-1. HTML5 web page with embedded audio file using an audio element
<!DOCTYPE html>
<head>
   <title>Audio</title>
   <meta charset="utf-8" />
</head>
<body>
   <audio src="audiofile.ogg">
   </audio>
</body>

The page validates as proper HTML5, and Firefox, Chrome, and Opera all support the file type. When you load the page in these browsers, you don’t get an error. However, when you look at the page, you won’t see anything.

Compare Example 1-1 with the following:

<!DOCTYPE html>
<head>
   <title>Video</title>
   <meta charset="utf-8" />
</head>
<body>
   <video src="videofile.ogv">
   </video>
</body>

Unlike the audio element, the video element has a play area that should show as long as there’s no error loading the video, and the video element isn’t deliberately hidden. If you want to actually see the audio file in the page, you need to add the controls attribute. Since controls is a boolean attribute, all you need do is add the attribute word:

<audio src="meadow.ogg" controls>
</audio>

Note

A boolean attribute is one where a value doesn’t need to be assigned to the attribute: its very presence implies a true value, while the lack of the attribute implies a default false value. However, boolean attributes must be assigned a value if you’re serving your page up as XHTML, or you’ll get a page error. The standard approach for XHTML5 is to assign the attribute a value equal to the attribute name, contained within quotes and without any extraneous white space (controls="controls").
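The two ways of writing a boolean attribute can be sketched in JavaScript. The helper names below (`booleanAttribute` and `audioTag`) are hypothetical and exist only for illustration; they build the attribute text for either HTML or XHTML output.

```javascript
// Hypothetical helper: serialize a boolean attribute for HTML or XHTML.
// In HTML the bare name is enough; in XHTML the value repeats the name.
function booleanAttribute(name, enabled, xhtml) {
  if (!enabled) {
    return ''; // leaving the attribute off means false
  }
  return xhtml ? name + '="' + name + '"' : name;
}

// Build an opening audio tag with the controls attribute in either syntax.
function audioTag(src, xhtml) {
  var attrs = ['src="' + src + '"', booleanAttribute('controls', true, xhtml)];
  return '<audio ' + attrs.filter(Boolean).join(' ') + '>';
}

console.log(audioTag('meadow.ogg', false)); // <audio src="meadow.ogg" controls>
console.log(audioTag('meadow.ogg', true));  // <audio src="meadow.ogg" controls="controls">
```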

Figure 1-1 shows the audio element in Firefox after the controls attribute has been added. The control is rather plain, but it does the job. You now know an audio file has been added to the page, and you can start and stop the audio file, change the volume, and watch its progress as it plays.

Figure 1-1. Audio element with default control in Firefox 4

Disabled Scripting and the Magically Appearing Controls UI

Both the video and audio elements support the controls attribute for adding a default control UI (user interface) for the media resource. If you don’t want the default control UI, leave the attribute off. Note, however, that something interesting happens with the control UI when scripting is disabled: in at least one browser, the control UI is added to the media element, whether you want it or not.

Web developers wanting to provide custom controls remove the controls attribute so that the default control doesn’t conflict with the custom control. The developer typically adds the controls attribute to the video or audio element, and then removes it using script as soon as the media element is loaded. This form of progressive enhancement ensures that if scripting is disabled, the user can still play the media resource.
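A minimal sketch of that progressive enhancement follows. It assumes the media element carries the id `meadow`, and `showCustomControls` is a hypothetical callback standing in for whatever builds your own buttons.

```javascript
// Swap the browser's default control UI for a custom one once scripting runs.
// `media` is an audio or video element; `showCustomControls` is a
// hypothetical callback that builds the custom buttons.
function enhanceControls(media, showCustomControls) {
  if (media.hasAttribute('controls')) {
    // The default UI stays in place only if this script never runs.
    media.removeAttribute('controls');
  }
  showCustomControls(media);
}

// In a browser, run the enhancement once the page has loaded.
if (typeof document !== 'undefined') {
  window.addEventListener('load', function () {
    var video = document.getElementById('meadow'); // assumed id
    if (video) {
      enhanceControls(video, function (el) {
        // Build custom play and pause buttons for `el` here.
      });
    }
  });
}
```

Because the controls attribute is written in the markup and only removed by script, a reader with scripting disabled keeps the default control UI.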

However, sometimes people deliberately leave the controls attribute off the media element because they’re using the media element as part of a web page presentation and want the media to play as soon as the page loads—regardless of whether scripting is enabled or not. They’ll remove the controls attribute, and add autoplay and possibly the loop attribute (covered later in the chapter). If scripting is enabled, the default media control isn’t added to the page—but if scripting is disabled in the user’s browser, according to the HTML5 specification, the browser is then supposed to add the control, by default.

This is an unusual event without precedent in web development. It’s comparable to the browser overriding CSS to display hidden or collapsed fields if scripting is disabled, regardless of what the developer or author wants.

Currently, Opera is the only browser that actually provides a visible control if scripting is disabled. The other browsers are technically in violation of the HTML5 specification, though I couldn’t find bugs for any of the browsers asking for this behavior. There are, however, bugs filed against the HTML5 specification to remove this unusual fallback feature. Since we don’t know if the bugs will result in a change to the specification or not, you’ll want to test your use of the HTML media elements with scripting enabled and disabled, regardless of whether you use scripting in your page or not.

Warning

Another browser foible: if scripting is disabled, Firefox doesn’t currently display a control UI (User Interface) even if you do provide the controls attribute. You’ll need to use the right mouse button context menu to control the media. More on this in Chapter .

Support for Multiple Media File Types

Figure 1-1 showed Example 1-1 in Firefox, using the default control UI that Firefox provides. You’re probably curious to see what the default styling is for the audio element control in other browsers, such as Internet Explorer 9.x. If you open the page in IE 9, though, all you’ll get is a black box with a small red x signaling broken content.

You get the broken-content indicator because the audio element offers only one type of audio content: an audio file encoded as Ogg Vorbis. Microsoft does not support Ogg Vorbis in Internet Explorer.

Note

You can play Ogg Vorbis files in IE 9 if you install supporting software. I’ll cover this in more detail in the next section.

Testing the page with all our target browsers, we find that the audio file works with Chrome, Opera, and Firefox, but not with Internet Explorer or Safari. In IE, the element appears broken, while in Safari the control appears but nothing happens when you hit the play button.

We’ll get into the various audio and video codecs and browser support in the next section, but for now, let’s see what we can do to ensure media files work with all of our target browsers. This time, though, we’ll add a video element to the page.

In Example 1-2, the web page contains a video element, but rather than provide the location of the video file in the element’s src attribute, three different video files are defined in three different source child elements. The location for each of the video files is given in the source element’s src attribute.

Example 1-2. HTML5 web page with embedded video element with three separate video types
<!DOCTYPE html>
<head>
   <title>Video</title>
   <meta charset="utf-8" />
</head>
<body>
   <video id="meadow" controls>
      <source src="videofile.mp4" />
      <source src="videofile.ogv" />
      <source src="videofile.webm" />
   </video>
</body>

Both the video and audio elements can contain zero or more source elements. These child elements define a way to specify more than one audio or video file in different formats. If a browser doesn’t support one format, hopefully it will find a format it supports in another source element.

Warning

If you use the src attribute on the audio or video element, any contained source elements are ignored. Using both also generates an HTML5 validator conformance error. Use one or the other, but not both.

What happens if the browser or user agent does not find a video or audio file it supports? Both of the media elements do allow other HTML within their opening and closing tags, so can this other HTML be used as fallback content?

Unfortunately, the answer is “no”. You can include other content in the media elements, but that content is only for browsers and other user agents that don’t support either the audio or video elements. For instance, if you open a web page containing the HTML shown in Example 1-3 in an older browser, such as IE 8 (or IE 9 running in Compatibility View), the YouTube video is shown rather than the embedded video.

Example 1-3. HTML5 web page with embedded video element with three separate video types and fallback content for user agents that don’t support HTML5 video
<!DOCTYPE html>
<head>
   <title>Big Buck Bunny Movie</title>
   <meta charset="utf-8" />
</head>
<body>
   <video controls>
      <source src="videofile.mp4" />
      <source src="videofile.ogv"  />
      <source src="videofile.webm" />
      <iframe width="640" height="390"
         src="http://www.youtube.com/embed/YE7VzlLtp-4">
      </iframe>
   </video>
</body>

Older browsers, such as IE 7 and IE 8, get the YouTube video, which ensures that web page readers using these older browsers have access to the material. However, if you remove the Ogg Theora and WebM source elements and open the page in Firefox, all you’ll get is a square gray box with a lighter gray X because Firefox can’t find a video source it can play. You won’t get the YouTube video.

The only way to ensure that a video plays in all of the target browsers and other user agents is to provide all the appropriate video types.
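Script can still react when a browser supports the video element but none of its sources. The sketch below relies on the browser firing an error event at each source element it cannot use; because sources are tried in order, an error on the last one means every earlier one has failed too. The function name `installSourceFallback` and the download link are illustrative assumptions, and the listener must be attached before the browser has finished trying the sources.

```javascript
// When the last <source> reports an error, every earlier source has already
// failed, so swap the media element for replacement content.
// `buildFallback` is a hypothetical function returning a replacement node.
function installSourceFallback(media, buildFallback) {
  var sources = media.getElementsByTagName('source');
  if (sources.length === 0) {
    return false; // nothing to watch
  }
  var last = sources[sources.length - 1];
  last.addEventListener('error', function () {
    media.parentNode.replaceChild(buildFallback(), media);
  });
  return true;
}

// Browser-only usage: replace the first video with a plain download link.
if (typeof document !== 'undefined') {
  var firstVideo = document.getElementsByTagName('video')[0];
  if (firstVideo) {
    installSourceFallback(firstVideo, function () {
      var link = document.createElement('a');
      link.href = 'videofile.mp4';
      link.textContent = 'Download the video';
      return link;
    });
  }
}
```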

Note

Before getting into the codecs, it’s important to know that you can use video files with an audio element, and audio files with a video element. The only difference between the two is the video element provides a playing area. All browsers support video files with the audio element, but only Opera and Firefox currently support audio files playing in the video element. I strongly recommend using the appropriate element.

The Audio and Video File Babble and the Source Element in Detail

When talking about media file types, we’re really talking about two separate components: the software used to encode and decode the audio or video stream (the codec, which is short for compressor-decompressor or coder-decoder), and the container, which is a wrapper format that contains the media streams and information about how the data and metadata co-exist. An example of a container is the open source Ogg (from Xiph.Org), while an example of a codec is VP8, a lossy video compression format from On2 (and Google). Technically, a codec could be used with many different containers, and containers could wrap many different codecs, but we tend to think of pairs of container/codecs when talking about browser support.

Audio files are containers wrapping one type of media data, the audio stream, but video files typically wrap two different media streams: the video and the audio data streams. In addition, containers can also support subtitles and captions, as well as the information to keep all data tracks in sync.

Note

Though you can embed subtitles directly into the file with some containers, HTML5 video provides a means of incorporating external subtitle files. More on incorporating subtitles and other accessible features in Chapter .

HTML5 Audio Codecs/Containers and Lossless versus Lossy Compression

As with image formats such as JPEG and PNG, audio and video codecs can be either lossless or lossy. A lossless video or audio codec preserves all of the original media file’s data when it’s compressed. Lossy compression techniques, however, lose data each time the data is encoded.

Though most of us have the bandwidth to download lossless images such as PNGs, lossless video is beyond even the most generous of broadband capacity, so the only codecs supported for HTML5 video are lossy codecs. Audio, however, is different. The audio element supports uncompressed audio files, as well as audio files with both lossless and lossy codecs.

WAV Audio Format

One of the older and more familiar audio file formats is the Waveform Audio File Format (WAVE), commonly known as WAV for the extension the audio files are given (.wav). Though WAV files can support compression, most WAV files contain audio in an uncompressed Pulse-Code Modulation (PCM) representation, which means the files tend to be quite large.

Safari, Chrome, Firefox, and Opera support uncompressed WAV files. However, the size of WAV files precludes their being a popular HTML5 audio file format.

MP3

Another well-known and common audio file format is MPEG-1 Audio Layer 3, commonly known as MP3 because of the extension given MP3 files (.mp3). It is neither a container nor a codec, as we know these things. Instead, it’s an all-in-one lossy compressed audio file with metadata strategically inserted.

At this time, the only audio format that Microsoft supports in IE9 and up, by default, is MP3. In addition, the format is also supported by Safari and Chrome. However, Firefox and Opera refused to support MP3s right from the start, because of patent issues and royalty requirements.

MP3 is supported in most operating system environments, and MP3 files are a popular fallback when linked into the page. Though the file won’t play natively in the browser, clicking the link will trigger some media player in most environments:

   <audio id="background" autoplay loop>
      <source src="audiofile.mp3" type="audio/mpeg" />
      <source src="audiofile.ogg" type="audio/ogg" />
      <p><a href="audiofile.mp3">Your audio file fallback</a></p>
   </audio>

Note

Safari requires the installation of QuickTime and supports whatever media types QuickTime natively supports in the system. Since QuickTime supports MP3 and WAV, Safari supports MP3 and WAV.

Ogg Vorbis

When the media elements were first added to HTML5, the specification included a requirement that all user agents support the Ogg open source container. The Ogg container was developed by the Xiph.Org foundation, which also developed an associated audio codec, called Vorbis. The Vorbis codec is a lossy compression technique that is free for everyone to use and is, according to the folks at Xiph.Org, free of patents (to the best of their determination). The hope at the time the media elements were first defined was that this tower of babble that we have for audio and video could be avoided by ensuring support for one container and one codec, neither of which is encumbered by patents or royalty requirements.

Note

Find out more about the Ogg Vorbis container/codec at the official support site at http://www.vorbis.com/.

Apple and other companies, though, objected to the Ogg Vorbis requirement because of lack of hardware support, their belief that the Vorbis codec was inferior to other codecs, and concerns of potentially hidden patents (known as submarine patents) related to the codec.

Though the Xiph.Org foundation has done their best to search among patents to ensure Vorbis is patent free, there’s no way to guarantee that unless it is challenged in a court of law. It’s a catch-22 situation without any viable solution, so the section in the specification that required support for Ogg Vorbis was removed.

Note

For an interesting historical perspective, the email from Ian Hickson, HTML5 editor, about dropping support for both Ogg Vorbis and Ogg Theora can be found online at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html.

Though Ogg Vorbis is no longer a requirement, several browsers do support it. Firefox, Opera, and Chrome support Ogg Vorbis, while Safari and IE do not.

The AAC Codec

The Advanced Audio Coding (AAC) lossy compression codec was originally considered to be a successor to MP3, though it didn’t get broad acceptance. It languished, little known, until Apple picked it as the format for the files in its iTunes store. The container it’s most used with is the MPEG-4 Part 14 container, known as MP4 for the .mp4 file extension. Though most of us assume that MP4 files are video, they can be audio only. In fact, another common file extension used with MP4 audio files is .m4a, again primarily because of Apple’s influence. Safari, Chrome, and IE support MPEG-4 AAC.

WebM Audio

WebM is a container based on the profile for the Matroska Multimedia Container. WebM was designed from the beginning to be patent and royalty free. Google was instrumental in forming the organization behind WebM, but has given up any and all patent claims to the container.

Codec support in WebM is quite simple: WebM only supports Vorbis for audio, and VP8 for video (which I’ll cover in the next section). The reasons for such simple codec support are given in a FAQ at the WebM web site:

We decided to define WebM files in this way because we wanted to do what’s best for users. Users just want video to work, they don’t want to worry about supported codecs, file formats, and so on. After much discussion with browser makers, tool developers and others, we reached a consensus that a narrowly defined format would cause the least confusion for users. If a user has a .webm file, he or she can be confident that it will play in any browser or media player that supports WebM.

WebM is supported by Chrome, Firefox, and Opera. It is not currently supported by IE and Safari. People can ensure that WebM files work in their IE9 browser by installing the WebM plug-in for IE9 (found at http://tools.google.com/dlpage/webmmf). However, since we as page authors, designers, and developers can’t be sure that the WebM plug-in is installed, we have to provide support for browsers that currently don’t support WebM.

Note

People typically think that WebM is solely a video file format. However, you can create a WebM file that consists of only one Vorbis data stream, and it works in an audio element. The source element’s type setting is audio/webm. Find out more about WebM at the project website, at http://www.webmproject.org/.

I’ve covered the popular audio and video file types, but how do browsers know if an MP4 is an audio file, or a video file? Or if this file is an Ogg, and that is a WebM? Well, they can open the file and see for themselves. Or we can provide the information directly in the media element.

Providing Codec information in the type attribute

Earlier, I stated that an MP4 file can be audio or video. So how does the browser or other application know which type of file it is? Or what codec is being used in the MP4 container? In fact, what codec is used in any of the containers?

One approach is to use a popular and unique file extension, such as .m4a, and then add a MIME type to your web server for the extension. You can add the MIME type directly to the mime.types file for an Apache server, or you can add a MIME type to the directory’s .htaccess file (assuming you’re running Linux):

AddType audio/mp4 m4a
AddType audio/ogg ogg oga
AddType video/webm webm

You should also use the type attribute in the source element. The type attribute provides information to the browsers and other user agents about the container and codec, as well as type of file listed in the src attribute. The syntax for the type attribute is the type of file (audio or video), followed by a slash and the type of container. In Example 1-4, the container and media file type for each audio file is added to each of three source elements.

Example 1-4. HTML5 web page with embedded audio element with three separate audio types, each with their specific MIME type provided in the source element’s type attribute
<!DOCTYPE html>
<head>
<title>Audio</title>
<meta charset="utf-8" />

</head>
<body>
<audio controls>
   <source src="audiofile.mp3" type="audio/mpeg" />
   <source src="audiofile.ogg" type="audio/ogg" />
   <source src="audiofile.wav" type="audio/wav" />
</audio>
</body>

Of course the last file, the WAV, never gets played, at least not with our target browsers. IE, Chrome, and Safari pick up the MP3, while Firefox and Opera pick up the Ogg file. Each browser traverses the source elements until it finds a file it can play, and then stops. Minimally, you can provide an MP3 and an Ogg or WebM audio file, which covers all five target browsers, in addition to iOS and Android.
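The traversal can be simulated with a short sketch. The support map below is transcribed from the chapter's Table 1-2 and the source list from Example 1-4; the function name `pickSource` is hypothetical, and a real browser makes this decision internally rather than through a lookup table.

```javascript
// Formats each browser plays natively, taken from the chapter's Table 1-2.
var SUPPORT = {
  IE:      ['audio/mpeg', 'audio/mp4'],
  Firefox: ['audio/wav', 'audio/ogg', 'audio/webm'],
  Safari:  ['audio/wav', 'audio/mpeg', 'audio/mp4'],
  Chrome:  ['audio/wav', 'audio/mpeg', 'audio/ogg', 'audio/mp4', 'audio/webm'],
  Opera:   ['audio/wav', 'audio/ogg', 'audio/webm']
};

// Walk the sources in document order and return the first playable one.
function pickSource(sources, browser) {
  var playable = SUPPORT[browser] || [];
  for (var i = 0; i < sources.length; i++) {
    if (playable.indexOf(sources[i].type) !== -1) {
      return sources[i].src; // stop at the first match
    }
  }
  return null; // no playable source found
}

// The three sources from Example 1-4, in the same order.
var exampleSources = [
  { src: 'audiofile.mp3', type: 'audio/mpeg' },
  { src: 'audiofile.ogg', type: 'audio/ogg' },
  { src: 'audiofile.wav', type: 'audio/wav' }
];

Object.keys(SUPPORT).forEach(function (browser) {
  console.log(browser + ': ' + pickSource(exampleSources, browser));
});
```

Since every browser that plays WAV also plays one of the two files listed before it, the WAV source is never reached.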

Table 1-2 contains a summary of the different audio codecs and containers covered, as well as modern browser support, common file extension(s), and type setting.

Table 1-2. Audio container/codec support across popular modern browser versions

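| Container/Codec | Type | Extension(s) | IE9+ | Firefox | Safari 5+ | Chrome | Opera 11+ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| WAV | audio/wav or audio/wave | .wav | No | Yes | Yes | Yes | Yes |
| MP3 | audio/mpeg | .mp3 | Yes | No | Yes | Yes | No |
| Ogg Vorbis | audio/ogg | .ogg, .oga | No | Yes | No | Yes | Yes |
| MPEG-4 AAC | audio/mp4 | .m4a | Yes | No | Yes | Yes | No |
| WebM Vorbis | audio/webm | .webm | No | Yes | No | Yes | Yes |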
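The type values in Table 1-2 name only the file type and container. The type attribute can also carry a codecs parameter that names the codecs inside the container. The sketch below builds such strings; `mediaType` is a hypothetical helper, and the codec names shown are the ones covered in this chapter.

```javascript
// Build a value for the source element's type attribute.
// `container` is a MIME type such as "video/webm"; `codecs` is an optional
// array of codec names to place in the codecs parameter.
function mediaType(container, codecs) {
  if (!codecs || codecs.length === 0) {
    return container;
  }
  return container + '; codecs="' + codecs.join(', ') + '"';
}

console.log(mediaType('audio/ogg', ['vorbis']));         // audio/ogg; codecs="vorbis"
console.log(mediaType('video/webm', ['vp8', 'vorbis'])); // video/webm; codecs="vp8, vorbis"
console.log(mediaType('audio/mpeg'));                    // audio/mpeg

// In a browser, canPlayType() answers "", "maybe", or "probably".
if (typeof document !== 'undefined') {
  var probe = document.createElement('audio');
  console.log(probe.canPlayType(mediaType('audio/ogg', ['vorbis'])));
}
```

Because the codecs parameter contains double quotes, the attribute itself is written with single quotes in the markup, as in type='audio/ogg; codecs="vorbis"'.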

HTML5 Video Element Codecs/Containers

As I mentioned in the last section, video files are far too large to serve up in anything other than a lossy compressed format. As with audio codecs, no one video codec works in all browsers.

H.264

One of the most popular lossy video codecs is MPEG-4 Part 10, commonly known as H.264. H.264 is a high quality, popular format that’s common on the Internet and supported in YouTube and iTunes. It is also one of the three mandatory codecs supported by Blu-Ray players. It’s a mature codec, first standardized by the MPEG group in 2003. H.264 is also a controversial choice because of the patents held on the codec by the organization, MPEG-LA. Though video files encoded in H.264 that are distributed without cost aren’t subject to royalties, tools that encode or decode H.264 do have to pay royalties. The cost for these royalties is usually passed on to the tool buyer.

The H.264 video codec is combined with either the AAC or MP3 audio codec in an MPEG-4 container. This combination is typically known as MP4, and files are usually given an .mp4 extension. You’ll also see files with .m4v extensions for H.264. Apple iTunes uses the .m4v extension with its videos, but they’re also encumbered by DRM and won’t play in HTML5 video elements.

The H.264 codec is the only video codec that Microsoft supports for IE. It’s also supported by Safari. Chrome has dropped support for H.264, and Firefox and Opera have never supported it because of the patent issues.

Ogg Theora

Firefox, Opera, and Chrome do support another codec, Theora, from the same organization (Xiph.Org) that provided the Ogg container and Vorbis audio codec described in the last section. The Ogg Theora container/codec was originally the mandatory codec and container for video elements in HTML5 until Apple and other companies objected to the restriction. Neither IE nor Safari supports Ogg Theora, though there are plug-ins that can be installed to provide support in both browsers.

Note

Xiph.Org provides a plug-in that enables support for the Theora and Vorbis codecs in QuickTime, which indirectly enables support for Safari. Access the plug-in at http://www.xiph.org/quicktime/about.html. Another plug-in, OpenCodecs, provides more generalized support for Ogg Vorbis, Ogg Theora, WebM, and various other Ogg container/codec pairings, and can be accessed at http://xiph.org/dshow/downloads/.

WebM

The last video container I’ll cover is WebM, which I introduced in the section on audio codecs. Unlike many of the other containers, WebM supports only one audio codec, Vorbis, and one video codec, VP8. VP8 was created by a company named On2, which was later bought by Google—who promptly open-sourced the VP8 codec.

WebM is supported by Chrome, Firefox, and Opera. There is no built-in support for WebM in Safari and IE, but, as mentioned earlier, there is plug-in support for WebM for both browsers.

Since Chrome, Firefox, and Opera support both Ogg Theora and WebM, which should you use? The answer is: it depends.

Both should continue to be supported for the foreseeable future. The Open Source community, including Wikipedia, still primarily supports Ogg Theora, but since Google open-sourced VP8, this may change in the future. VP8 is generally considered a better codec than Theora, but I’ve never seen much difference in quality when it comes to videos sized and optimized for the web. But then, I’m not a picky videophile, either.

Ensuring Complete Video Codec Support

You can’t assume your web page readers have plug-ins installed to play Ogg or WebM videos in Safari or IE. In order to ensure that a video is accessible by all of the target browsers, you’ll need to provide, at minimum, two different source elements for your video element. Example 1-5 shows an HTML5 web page with a video element containing two different video sources: one in H.264 (necessary for IE and Safari), and one in WebM (for Firefox, Opera, and Chrome). In addition, if you want to ensure that non-HTML5 compliant browsers have access to the video, you’ll also need to provide some form of fallback. In the example below, the fallback is a YouTube video. Another choice can be Flash or another plug-in.

Example 1-5. HTML5 web page with video that works in all target browsers
<!DOCTYPE html>
<head>
   <title>Big Buck Bunny Video</title>
   <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
</head>
<body>
   <video controls>
      <source src="videofile.mp4" type="video/mp4" />
      <source src="videofile.webm" type="video/webm" />
      <iframe width="640" height="390"
         src="http://www.youtube.com/embed/YE7VzlLtp-4">
      </iframe>
   </video>
</body>

Figure 1-2 shows the video playing in Chrome, which supports the embedded HTML5 video. Figure 1-3 shows a YouTube video playing in IE9 with compatibility mode turned on, emulating an older version of IE that doesn’t support HTML5 video.

Note

If you’re unsure how to structure the fallback content in the video element to ensure access to the video in all user agents and browsers, I recommend reading Kroc Camen’s article, “Video for Everybody”, at http://camendesign.com/code/video_for_everybody.

Table 1-3 provides a summary of coverage of the three video codecs I covered in this section.

Table 1-3. Video container/codec support across popular modern browser versions

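| Container/Codec | Type | Extension(s) | IE9+ | Firefox | Safari 5+ | Chrome | Opera 11+ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MP4+H.264+AAC | video/mp4 | .mp4, .m4v | Yes | No | Yes | No | No |
| Ogg+Theora+Vorbis | video/ogg | .ogg, .ogv | No | Yes | No | Yes | Yes |
| WebM+VP8+Vorbis | video/webm | .webm | No | Yes | No | Yes | Yes |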

Figure 1-2. Video playing in Chrome with support for HTML5 video
Figure 1-3. Video playing in IE9 in compatibility mode, triggering the YouTube fallback

Before I return to the media elements, there is one more attribute supported on the source element, though it is rarely used: the media attribute. The media attribute provides the intended media source for the element. The default value is all, which means the media file is intended for all media sources. There are several other allowable values, but the two that make most sense (other than all) for media elements, especially video, are handheld and screen. In combination with media queries in CSS, one can have a web page serve both desktop and handheld devices.

However, I’m not overly fond of trying to get one page to work in two completely different environments. Many sites have a mobile-only version of the content, usually designated with an m subdomain (such as http://m.burningbird.net). In the end, it may be simpler just to provide the separately formatted sites, especially with Content Management Systems (CMS) that serve all of the contents from a database.

Note

Drupal, the CMS I use, has a custom module named Domain Access (available at http://drupal.org/project/domain) that allows us to designate a different theme, page, and content structure for pages served up with the m subdomain. Most other CMS tools offer something comparable.

The Media Elements in More Detail

After that refreshingly simple jaunt through containers and codecs, we’ll return to looking at the media elements in more detail.

Media Elements and Global Attributes

The audio and video elements both support the same set of global attributes:

accesskey

A unique, ordered, and space separated (as well as case sensitive) set of tokens that enables specifically named keyboard key access to the media element.

class

A set of space separated tokens specifying the various classes the element belongs to.

contenteditable

If true, content can be edited; if false, content cannot be edited.

contextmenu

The id of the context menu associated with the element.

dir

The directionality of the element’s text.

draggable

Whether the media element can be dragged. If true, the element can be dragged; if false, the element cannot be dragged.

dropzone

What happens when an item is dropped on the element.

hidden

A boolean attribute that determines if the element is “relevant”. Elements that are hidden are not rendered.

id

A unique identifier for the element.

lang

Specifies the primary language of the element’s contents.

spellcheck

Set to true for enabling spell and grammar checking on the element’s contents.

style

Inline CSS styling.

tabindex

Determines whether the media element is focusable, and the element’s order in the tabbing sequence.

title

Advisory information, such as a tooltip.

data-*

Custom data type, such as data-myownuse or data-thisappsuse. Used to read and write custom data values for use in your own applications.

Of course, not all of the global attributes seem relevant with both of the media elements. For instance, I can’t see how it is possible to spell or grammar check the contents of a video. Others, though, are very useful.

If you need to access a specific audio or video element using JavaScript, you’ll need to set its id attribute. You can capture an entire set of media elements in a page, and pinpoint the specific one you want by its page position, but it’s easier just to use id.
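Both lookups can be sketched as follows. The function names are hypothetical, `doc` stands for the page's document object, and the id `meadow` is the one used on the video element in Example 1-2.

```javascript
// Look up one media element directly by its id attribute.
function mediaById(doc, id) {
  return doc.getElementById(id);
}

// Look up a media element by its position among all elements of one kind.
// `tagName` is "audio" or "video"; `index` counts from zero in page order.
function mediaByPosition(doc, tagName, index) {
  var all = doc.getElementsByTagName(tagName);
  return index < all.length ? all[index] : null;
}

// Browser-only usage.
if (typeof document !== 'undefined') {
  var meadow = mediaById(document, 'meadow');
  var firstVideo = mediaByPosition(document, 'video', 0);
  console.log(meadow === firstVideo); // true only when "meadow" is the first video
}
```

The positional lookup breaks as soon as another video is added earlier in the page, which is why the id is the easier choice.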

If you have more than one media element on possibly different web pages, and you want to provide the same CSS styling for each, you’ll need to assign a class for each, and then use the class name in your CSS stylesheet.

Being able to drag and drop media elements is a viable web action, so the draggable and dropzone attributes are useful. So are the accesskey and tabindex attributes if you want finer control over the element’s keyboard access.

The hidden attribute may not seem as viable at first. However, you could use it to remove an audio or video element from rendering, while still ensuring access to the contents for purposes that don’t depend on immediate reader access.

Media-Specific Attributes

In addition to the global attributes, there are also several media-specific attributes that are shared by both the audio and video elements. We’ve seen the src and controls attributes used in previous examples. The rest are provided in the following list:

preload

The preload attribute provides hints to the user agent about preloading the media content. These are only hints: the user agent may or may not follow the directive. The acceptable values are none, which hints that the user agent should hold off preloading the media until the user presses the play button (or otherwise wants the media to load); metadata, which hints to load only the media’s metadata; and auto, the default state, which hints that the user agent can go ahead and download the resource.

autoplay

The autoplay attribute is a boolean attribute whose presence signals the user agent to begin playing the media file as soon as it has loaded enough of the file to play through without stopping. If autoplay is added to a media element, it overrides the preload setting, whatever its value.

loop

The loop attribute resets the media file back to the beginning when it finishes, and continues playing.

muted

If the muted attribute is present, the media plays, but without sound. The user can turn on the sound via the media controls, if they wish.

mediagroup

The mediagroup attribute provides a way to group more than one media file together.

At the time this was written, the new mediagroup attribute had not been implemented by any browser. According to the specifications, if the attribute is provided for two or more media elements, they’ll all be managed by the same implicitly created media controller. We can assume from the documentation that if one of the media files is played, the others are kept in sync. This behavior could be very helpful in situations such as having a video of a speech in one element, and a sign language interpretation of the speech in another element, or for emulating picture-in-picture with two videos.

The muted attribute is also extremely new, and had not been implemented—as an attribute—in any browser when this was written.

The combination of loop and autoplay can be used to create a background sound for when a page is loaded. You’ll want to use this functionality sparingly, but it could be useful if you’re creating a more presentation-like website where sound is tolerated, even expected, by your web page readers. Example 1-6 demonstrates how to use these attributes with an audio element that doesn’t have a controls attribute, and is also hidden using CSS, just in case the user’s browser has scripting disabled. The sound will play as soon as the page and media are loaded, and continue to play until the user leaves the page.

Example 1-6. A repeating auto started audio file in two different formats to ensure browser coverage
<!DOCTYPE html>
<head>
   <title>Repeating Audio</title>
   <meta charset="utf-8" />
   <style>
      #background
      {
          display: none;
      }
   </style>
</head>
<body>
   <audio id="background" autoplay loop>
      <source src="audiofile.mp3" type="audio/mpeg" />
      <source src="audiofile.ogg" type="audio/ogg" />
   </audio>
</body>

The example works in IE, Opera, Chrome, and Safari. It only partially worked in Firefox at the time this was written because Firefox (5, 6, or 7) doesn’t currently support the loop attribute.
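For browsers that ignore the loop attribute, one possible workaround is to restart the media from script whenever it finishes. The following is a minimal sketch, not a tested cross-browser solution. It assumes that a browser without loop support doesn’t expose a boolean loop property on the element, and that it fires the ended event when playback finishes. The function name enableLoopFallback is my own.

```javascript
// Make a media element repeat even when the browser ignores the
// loop attribute. Returns true if the scripted fallback was used.
function enableLoopFallback(media) {
  if (typeof media.loop === "boolean") {
    // The browser understands loop, so just switch it on.
    media.loop = true;
    return false;
  }
  // Assumption: without loop support, the element still fires
  // "ended", so rewind and restart each time playback finishes.
  media.addEventListener("ended", function () {
    media.currentTime = 0;
    media.play();
  });
  return true;
}

// In a page containing Example 1-6, you might call:
// enableLoopFallback(document.getElementById("background"));
```

Because this depends on scripting, it does nothing for readers who have JavaScript disabled; they hear the audio once.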

You’ll want to use display: none for the CSS style setting of the audio element, to ensure that the element doesn’t take up page space. You might be tempted to use the hidden attribute, but doing so just to hide the element is an inappropriate use of the attribute. The hidden attribute is meant to be used with material that isn’t relevant immediately, but may be at some later time.

You can use loop and autoplay with video files, but unless the video file is quite small, or encoded to load progressively, you’re not going to get the same instant effect that you get with audio files.

Video-Only Attributes and Video Resolutions

There are a couple of attributes that are specific to the video element.

poster

The poster attribute is a way of providing a static image to display in the video element until the web page reader plays the video.

width, height

The width and height attributes set the width and height of the video element. The control will resize to fit the video when it’s played, but if the video is larger than the control, it pushes content out of the way and can be quite distracting to web page readers. If the video is smaller than the control, it’s centered within the space.

The actual width and height of a video are directly related to the resolution of the video. If you have a Standard Definition (SD) video, you have a video that’s 480 pixels in height (480 lines). If you have an HD video, you have a video that’s 720 lines (pixels) tall, or taller. You can find the exact frame dimensions using a tool such as Handbrake (covered later in the chapter).

The poster and the width and height attributes imply that you know the size of the video. You’ll want to provide the same size poster image as the video, and you’ll want to size the control the same as a frame in the video. Providing all three attributes ensures that your video presentation is smooth and polished, rather than other page content abruptly being pushed down as the video element automatically expands.
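If the video frame is wider than the space you have in the page, you’ll need width and height values that keep the frame’s proportions. The following sketch shows the arithmetic. The function name fitToWidth and the 640-pixel limit in the usage lines are assumptions made for illustration.

```javascript
// Scale a video frame down to fit a maximum width while keeping
// its aspect ratio. Frames that already fit are left unchanged.
function fitToWidth(frameWidth, frameHeight, maxWidth) {
  if (frameWidth <= maxWidth) {
    return { width: frameWidth, height: frameHeight };
  }
  const scale = maxWidth / frameWidth;
  // Round, because the width and height attributes take whole pixels.
  return { width: maxWidth, height: Math.round(frameHeight * scale) };
}

// An 854 x 480 frame squeezed into a 640-pixel-wide column:
const size = fitToWidth(854, 480, 640);
console.log(size.width + " x " + size.height); // prints "640 x 360"
```

The resulting values are the ones to use for the video element’s width and height attributes, and for the dimensions of the poster image.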

Example 1-7 shows a web page with a video element and two source elements that has the width, height, and poster attributes set.

Example 1-7. Video with the width and height set, as well as a poster image to display
<!DOCTYPE html>
<head>
   <title>Birdcage</title>
   <meta charset="utf-8" />
</head>
<body>
   <video controls width="640" height="480" poster="birdcageposter.jpg">
      <source src="birdcage.mp4" type="video/mp4" />
      <source src="birdcage.webm" type="video/webm" />
   </video>
</body>

The video controls are placed over the content, including the poster image, so place text in the poster image accordingly. In addition, Safari and IE seem to hide the poster image once the video has been fully cached, but Firefox, Opera, and Chrome will redisplay the poster image when the page is refreshed, even with the video cached, as shown in Chrome in Figure 1-4.

Figure 1-4. Video playing in Chrome with a poster image and width and height set, after video is fully cached

Regardless of how browsers handle the width, height, and poster attributes, using them gives the video a more polished presentation.

Audio and Video in Mobile Devices and Media Profiles

Support for HTML5 audio and video, especially video, in mobile devices is varied and can be challenging for web page authors and designers.

Challenges of a Mobile Environment

There are known quirks about the use of the HTML5 media elements in mobile devices. For instance, Apple has been a big fan of HTML5 from the beginning, deciding against support for Flash on iOS devices in favor of HTML5 video. However, some things that work on the desktop don’t in an Apple mobile environment. As an example, using the poster attribute caused the video element to fail in iOS 3, though this problem has been fixed in iOS 4. Another interesting little quirk was iPad’s only checking the first source element, so you needed to place the MP4 video first in the list (again, since corrected).

In addition, the iOS environment has its own native application for playback control, so it ignores the controls attribute.

Then there are the issues of how to test your HTML5 media applications. Most of us can’t afford to buy half a dozen devices (some of us can’t afford to buy any) and emulators don’t really work when it comes to testing out hardware and resource limitations.

Note

A good article on the issues of mobile testing is “Testing Apps For SmartPhones and Mobile Devices (Without Buying Out the Store)” at http://www.softwarequalityconnection.com/2011/03/testing-apps-for-smartphones-and-mobile-devices-without-buying-out-the-store/.

Most importantly, the video capability itself is limited in mobile environments. There is the resolution/size issue, of course, but there are also issues with containers and codecs. Mobile devices don’t have the processing power our computers have, which means that the file sizes are larger (because of simpler compression techniques). At the same time, mobile devices have data access limitations as well as issues with storage, so larger files aren’t mobile-friendly.

There’s also the challenge associated with the sheer number of mobile operating systems, mobile browsers, and devices—especially devices.

At this time, the iOS supports H.264, and the Android OS supports H.264 and WebM (though without hardware acceleration). Since Google is making a move away from H.264, we can assume the Android OS will, eventually, drop support for H.264. Maybe. In addition, the upcoming release of Windows Phone 7 from Microsoft, codenamed “Mango”, supposedly includes support for HTML5 video. Since Windows Phone 7 is Microsoft, we have to assume it will have H.264 support. Nokia is transitioning to Windows Phone 7, but is not offering HTML5 video and audio in its next release of its built-in Symbian operating system. However, you can run Opera Mobile on Symbian/S60, and get HTML5 video and audio support. Opera supports only Ogg and WebM. Blackberry supports H.264 video, but not the HTML5 video element—you’ll have to use a link.

What we can take away from all of this is that to support mobile devices, you’ll need to provide appropriately sized video files, as well as include support for both WebM/Ogg Theora and H.264. But not just any H.264. You need to provide videos encoded with the right profile.

Media Profiles and Codec Parameters

Since H.264 was designed to meet the needs of everything from large television sets to small mobile phones, it incorporates a concept known as a profile. Each profile defines a set of optional features, balanced against the file size. The more the video relies on the hardware, the smaller the file size. H.264 supports 17 profiles, but the ones we’re interested in are baseline, main, extended, and high. As you would expect, the hardware requirements increase from baseline to high.

Different devices support different profiles. Microsoft supports all H.264 profiles, but Safari only supports the main profile, because that’s all QuickTime supports by default. Mobile devices, such as those running iOS and the Android OS, run the baseline profile. If your site needs to provide both mobile and larger videos, you may want to encode several versions of the video with different H.264 profiles. It’s actually simple to ensure the right encoding, because most conversion tools provide device profile presets (more on this later in the chapter).

Note

The WHATWG Wiki provides a page listing the codec parameters for several different media types, at http://wiki.whatwg.org/wiki/Video_type_parameters.

In order to ensure that each device knows which video works best for it (without having to load the video’s metadata and extract the information), you can provide the information directly in the source element’s type attribute. An example of the syntax to use is the following, for an Ogg Theora video file:

<source src='videofile.ogg' type='video/ogg; codecs="theora, vorbis"' />

The syntax is container first, then the codecs in video, audio order.
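If you generate your source elements from script or from a template, the type value can be assembled from its parts. This is only a sketch of the string format just described; the function name buildSourceType is my own.

```javascript
// Build the value of a source element's type attribute:
// container (MIME type) first, then the codecs in video, audio order.
function buildSourceType(container, videoCodec, audioCodec) {
  return container + '; codecs="' + videoCodec + ", " + audioCodec + '"';
}

console.log(buildSourceType("video/ogg", "theora", "vorbis"));
// prints: video/ogg; codecs="theora, vorbis"
```

Because the value itself contains double quotes, the attribute in the HTML has to be wrapped in single quotes.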

The codec specification for WebM is as simple as Ogg, but the same cannot be said for H.264 because of all of the profile possibilities. The audio codec is low-complexity AAC (mp4a.40.2), but the video codec is profile and level based. According to the WHATWG Wiki page that collects the type parameters, the video codec for an MP4 file can be any one of the following five codecs:

  • H.264 Baseline: avc1.42E0xx, where xx is the AVC level

  • H.264 Main: avc1.4D40xx, where xx is the AVC level

  • H.264 High: avc1.6400xx, where xx is the AVC level

  • MPEG-4 Visual Simple Profile Level 0: mp4v.20.9

  • MPEG-4 Visual Advanced Simple Profile Level 0: mp4v.20.240

The profile part is easy, because when you use conversion tools, most have presets predefined for each of the profiles. However, the AVC level isn’t as simple to discover. According to a paper on H.264 (a PDF is available at http://www.fastvdo.com/spie04/spie04-h264OverviewPaper.pdf), the AVC level is based on picture size and frame rate, with added constraints on the number of reference pictures and the compression rate.
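Once you know the profile and the level, the codec string can be put together mechanically. The sketch below assumes the usual convention that the xx in the strings listed earlier is the AVC level multiplied by ten and written as two hexadecimal digits, so level 3.0 becomes 1E. The function and table names are my own, and special cases such as level 1b are not handled.

```javascript
// Prefixes for the three H.264 profiles listed above.
const AVC_PREFIX = {
  baseline: "avc1.42E0",
  main: "avc1.4D40",
  high: "avc1.6400"
};

// Build an H.264 codec string from a profile name and an AVC level.
// Assumption: the trailing two hex digits are level * 10.
function avcCodecString(profile, level) {
  const prefix = AVC_PREFIX[profile];
  if (!prefix) {
    throw new Error("Unknown H.264 profile: " + profile);
  }
  // Math.round guards against floating point noise (3.1 * 10).
  const levelHex = Math.round(level * 10)
    .toString(16)
    .toUpperCase()
    .padStart(2, "0");
  return prefix + levelHex;
}

console.log(avcCodecString("baseline", 3.0)); // prints "avc1.42E01E"
```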

In Example 1-8, several different video files are listed in individual source elements, with both the codec and container information in the type attribute. The two H.264 videos are a desktop-capable version encoded with the main profile, and a mobile version encoded with the baseline profile. The user agent in each environment traverses the list of source elements, stopping when it reaches a container/codec and profile it supports.

Example 1-8. Several video sources, each with different container/codec strings in the source type attribute
<!DOCTYPE html>
<head>
   <title>Video</title>
   <meta charset="utf-8" />
</head>
<body>
   <video controls>
      <source src="videofile.mp4"
                 type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"' />
      <source src="videofilemobile.mp4"
                 type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"' />
      <source src="videofile.webm"
                 type='video/webm; codecs="vp8, vorbis"' />
      <source src="videofile.ogv"
                 type='video/ogg; codecs="theora, vorbis"' />
   </video>
</body>
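The traversal of the source list can be approximated in script. The following sketch walks a list of sources and returns the first one whose type the browser claims it can play. It relies on the media element’s canPlayType method, which returns an empty string for a type the browser cannot play, and "maybe" or "probably" otherwise. The function name pickSource and the shape of the list are my own, and a browser’s real selection process does more than this, since it also has to load the file successfully.

```javascript
// Return the src of the first source whose type is playable, or
// null if none is. `canPlayType` is passed in as a function so the
// logic can be tried outside a browser.
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    // An empty string means "cannot play this type".
    if (canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null;
}

// In a browser, you might call it like this:
// const video = document.createElement("video");
// const chosen = pickSource(list, function (type) {
//   return video.canPlayType(type);
// });
```

Order matters: a browser that can play more than one of the listed types gets whichever comes first.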

Note

The Android Developer SDK documentation contains a listing of supported media types and recommended encodings at http://developer.android.com/guide/appendix/media-formats.html. The iOS Developer Library has a “Getting Started” section for audio and video at http://developer.apple.com/library/ios/#referencelibrary/GettingStarted/GS_AudioVideo_iPhone/_index.html. The announcement of an integrated IE9 into Windows Phone 7, including HTML5 media support, can be found at http://blogs.msdn.com/b/ie/archive/2011/02/14/ie9-on-windows-phone.aspx.

Converting Audio and Video Content

I have a little Flip video camera that I use to take videos. It’s a cute little thing, and easy to use. Unfortunately, it’s been discontinued because so many smart phones have built-in video capability that meets or exceeds the Flip’s capability.

My Flip, and most video phones and other cameras, take video in the MP4 format. In addition, most of our devices now support HD video, which means large video files that may or may not be useful for web access. Once you have a video, whether your own or a CC or public domain video you found online, you need to provide conversions of the video for all of your target browsers and environments. This means, on average, creating smaller or edited versions of the video, and converting the resulting video into either Ogg or WebM format (or H.264 if the video is a WMV or other video format).

In addition, you may have a WMA (Windows Media Audio) file that doesn’t play on the web, which you need to convert into a web-friendly format.

The number of tools to edit both audio and video files can fill a book, so I’ll leave that for another book. Instead I’m going to introduce you to some useful tools you can use to create video and audio conversions for various browsers and environments.

The Free Mp3/Wma/Ogg Converter

The tool I used for most of the audio conversions for this book is the Free Mp3/Wma/Ogg Converter, by Cyberpower. This tool is extremely easy to use, and can convert one audio file or do batch conversions.

Note

Download the Free Mp3/Wma/Ogg Converter from http://www.freemp3wmaconverter.com/.

When you start the tool, you’re presented with a blank workspace, and buttons to the right for adding source audio files. The tool can work with Ogg Vorbis, WMA, and MP3 source files, and you can add more than one source file, as shown in Figure 1-5.

Clicking the Next button at the bottom of the window leads to the next page, where you can select from several audio output formats. For instance, you can pick from a list of Ogg Vorbis quality conversion choices, as shown in Figure 1-6.

Figure 1-5. Adding files for conversion to the Free Mp3/Wma/Ogg audio converter
Figure 1-6. Selecting Ogg container and Vorbis codec, as well as quality in Free Mp3/Wma/Ogg converter

Video Conversion with Miro Video Converter and Handbrake

For video, I use two different tools: Miro Video Converter, and Handbrake.

The Miro Video Converter is even simpler to use than Free Mp3/Wma/Ogg. When you open the application, you have a space where you can either drag a file for conversion, or open the utility to find the file, as shown in Figure 1-7.

Figure 1-7. The Miro Video Converter, when first opened

After loading a source file, you then pick which container and profile you want the video file to be converted to. In Figure 1-8, I’ve selected the Android Droid preset.

It can take a little time to do the conversion, depending on how much juice your machine has. However, the simplicity of the conversion process makes it an ideal tool for those just getting started.

Figure 1-8. After selecting the Android Droid preset in Miro Video Converter

If you want a little more sophistication with H.264 files, I recommend Handbrake. It doesn’t do any conversions to WebM or Ogg, but it does give you finer control over your H.264 conversion, especially for web content.

Note

The Miro Video Converter can be found at http://www.mirovideoconverter.com/. Downloads and documentation for Handbrake can be found at http://handbrake.fr/.

Once you start up Handbrake, you’ll need to provide a source file. This can be a DVD (unless protected), or it can be a video file or folder, as shown in Figure 1-9. In the figure, the source file is the Ogg Theora 854 × 480 video of Big Buck Bunny.

Figure 1-9. Loading an Ogg Theora source video file into Handbrake

At this point you can choose a preset in the right column, or you can manually choose your settings from several different tab pages. For instance, you can set the video resolution (width and height) and aspect ratio in the first tab page, and in the tab page labeled “Video” you can pick the framerate and bitrate, as well as selecting the 2 pass encoding option. You can adjust the audio in the tab page labeled “Audio”, and add subtitles via the tab page labeled “Subtitles”. If you want to provide closed captioning within the file, this is where you’ll add the subtitles. You can even choose to “burn in” the subtitles if you want them to be available for everyone.

For the sample, I picked the iPhone & iPod Touch preset. Doing so removed some options, such as the 2 pass Encoding, which I would normally pick if the preset allowed it (or if I were manually setting all of the encoding values). I also checked the Web Optimized option, which means that the video loads progressively (the video can start playing before it’s completely loaded). Always, always, pick the Web Optimized option if the file is meant for HTML5 video access.

Figure 1-10 shows the front page of Handbrake during the encoding process.

Figure 1-10. During the conversion of the Ogg Theora file to MP4 using Handbrake

Using a Frame Grabber

One last tool I used for the book is a frame grabber. Frame grabbers allow you to traverse through your video file while the video is running, or frame by frame, or both. You can then grab a static copy of whatever frame interests you. I used the freely available Avidemux as a frame grabber (though it does more than just grab frames), to get a static image for the poster attribute example earlier in the chapter.

Note

You can download Avidemux and access the Wiki documents at http://avidemux.sourceforge.net/.

Once Avidemux is opened, select and open your video file. Unless the frame is close to the beginning of the file, click the play button to run the video to the approximate place in the video for the frame you’re interested in. Stop the video, and then use the frame buttons to move forwards or backwards, a frame at a time, to find the frame you want, as shown in Figure 1-11.

Once you have the frame you want in the viewfinder, select the File menu option, then the Save and Save JPEG menu options, and give the tool a location and file name for the grabbed frame in the dialog that opens. Avidemux only saves frames as BMP or JPEG, but the JPEG should be sufficient for a frame that you want to use as a poster image for your video.

Figure 1-11. Finding and grabbing a frame to use as a poster image with Avidemux

Once you have your static frame copy, you can then edit it in your favorite image editor, and add text or other effects. Even though JPEG is a lossy compression format, opening the file once, adding effects, and saving a new copy won’t degrade the image enough to be noticeable.

There are dozens of tools for every environment for creating, editing, and converting audio, video, or both. My recommendation is to try out several to see which ones work for you. Once you have your files, and your basic HTML5 (including all of the containers/codecs you want to support), check out what you can do with HTML5 audio and video out of the box, in Chapter .

Note

Mark Pilgrim provides excellent coverage of the HTML5 media elements, the codecs (and their related issues), as well as tools—including some open source command line tools good for batch processing—either in his book, HTML5: Up and Running (O’Reilly), or freely available at http://diveintohtml5.org/video.html.
