Introduction

Before jumping right into the best practices, let’s take a brief moment to answer the question: what exactly is an EPUB?

If you’re already familiar with the inner workings of the format, whether from creating EPUB 2 content or experimenting with EPUB 3, you can safely skip ahead to Chapter 1. For everyone else, this introduction offers a quick tour of the format (at the macro level, rather than the micro level to come) to show how the pieces fit together.

Since you’re reading a book about EPUB, you are presumably familiar with the term, but you may have seen or heard it used incorrectly as a synonym for ebook (shorthand for electronic book). Although the two terms are related in electronic book production, they aren’t interchangeable. EPUB is a format for representing documents in electronic form. Ebook, on the other hand, is an abstract term that encompasses any electronic representation of a book, including formats such as PDF, HTML, ASCII text, Word, and a host of others, in addition to EPUB.

EPUB is designed to be a general-purpose document format, and it can be used to represent many kinds of publications other than books: magazines, newspapers, journals, office documents, policies, and beyond. Just about any document type you want to distribute electronically can be represented as an EPUB. Likewise, this book is not just about how to create books in electronic form, but about how to make the best use of the EPUB format for any content production. A natural bias toward book production will be evident at times, but recommendations should be read as publication-agnostic.

On a practical level, EPUB defines both the format for your content and how reading systems go about discovering it and rendering it to readers (we’ll avoid the word display for what a reading system does with content, because EPUBs aren’t only for the sighted and don’t contain only visual content).

But perhaps the best way to understand what goes into an EPUB is to quickly break down the creation process:

  1. The first step in making an EPUB is to create your content document(s). These must be either XHTML5 documents, SVG images, or a mixture of the two. Chapter 3 begins looking at the issues involved in creating these documents.
  2. Once you’ve crafted your content, the next step is to create the package document, a special document used by reading systems to glean information about your publication (to order it in your bookshelf, to render the content, and the like). The first step in creating this file is to list all of the resources you assembled in the content creation step in the manifest section of the package document. Reading systems need this list to determine whether a publication is complete and to discover which remote files will have to be retrieved. All your publication metadata (title, author, etc.) also goes in this file, consolidating it in a single, common location so that it can be easily extracted and used in distribution channels and by reading systems. You also have to include the default reading order in the spine section (a sequential list of your content files, from the first one to render to the last). Understanding metadata and packaging is key to understanding the EPUB format, as you might imagine, and that’s why this book begins by exploring these issues in Metadata.
  3. The last step is to zip up your content documents, associated resources, and the package document into a single file for distribution. This process isn’t quite as simple as a standard zipping, however: a special mimetype file has to be added first to indicate that your ZIP file contains an EPUB and not something else, and a file called container.xml has to go in a directory named META-INF to tell reading systems where to find your package document.

This manual process is not one you will typically carry out in full, because there are programs that allow you to focus on creating your content while taking care of the export and packaging for you. It’s invaluable to get clear in your head, though, because content and the package document are interrelated in many ways that will be explored throughout this book.
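
For readers who want to see the steps once, here is a minimal Python sketch of the packaging process described in the numbered list above. The file names (EPUB/package.opf, chapter1.xhtml), the identifier, the title, the date, and the build_epub function are all placeholder assumptions, not values taken from this book. The sketch also omits pieces a real EPUB 3 publication requires, such as the navigation document, so treat it as an illustration of the structure rather than a file that would pass validation.

```python
import io
import zipfile

# Step 1: a content document (placeholder XHTML).
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>Hello, EPUB.</p></body>
</html>
"""

# Step 2: the package document, with metadata, manifest, and spine.
# All values here are placeholders chosen for illustration.
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Publication</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2012-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="c1"/>
  </spine>
</package>
"""

# Step 3 helper: container.xml tells reading systems where the
# package document lives inside the ZIP.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0"
           xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub() -> bytes:
    """Zip the pieces into an in-memory archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype file goes in first. Storing it uncompressed is a
        # requirement of the EPUB container specification, not something
        # stated in the text above.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("EPUB/package.opf", PACKAGE_OPF,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("EPUB/chapter1.xhtml", CHAPTER_XHTML,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Calling build_epub() returns the bytes of the archive; writing them to a file lets you open the result with any ZIP tool and inspect the layout.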

Note

If you read the previous numbered list in reverse, you’ll also understand how reading systems work: they examine your ZIP container, determine it’s an EPUB, find the package document, and from there discover how to render the resources to readers.
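
The reverse walk described in this note can be sketched in Python roughly as follows. The reading_order function name is hypothetical, and the sketch assumes a well-formed container with a single rootfile entry; a real reading system performs far more validation and error handling than this.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from typing import List

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def reading_order(epub_bytes: bytes) -> List[str]:
    """Return the ZIP paths of the spine items, in default reading order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        # 1. Examine the ZIP container and determine that it is an EPUB.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError("not an EPUB: unexpected mimetype %r" % mimetype)

        # 2. Use META-INF/container.xml to find the package document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        package_path = rootfile.get("full-path")

        # 3. Read the manifest (id -> href) and resolve the spine against it.
        package = ET.fromstring(archive.read(package_path))
        manifest = {
            item.get("id"): item.get("href")
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }
        # Manifest hrefs are relative to the package document's directory.
        base = posixpath.dirname(package_path)
        return [
            posixpath.join(base, manifest[ref.get("idref")])
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
        ]
```

The function stops at discovering the reading order; actually rendering each content document is where the web technologies discussed next come in.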

The other aspect of EPUB to understand before getting started is that it draws many of its capabilities and its versatility from web technologies, but the Web alone doesn’t tell the whole story of EPUB. Without the complementary technologies the EPUB format brings under its common umbrella, the ability to create distributable publications would be much more complex.

Some of the technologies used in EPUBs have been specially developed by the International Digital Publishing Forum (IDPF), but most of the standards that have been leveraged are internationally recognized. The key ones you’ll find in EPUB 3 publications include:

XHTML5
For representing text and multimedia content, which now includes native support for MathML equations, ruby pronunciation markup, and embedded SVG images
SVG 1.1
For representing graphical works (for example, manga and comics)
CSS 2.1 and 3
To facilitate visual display and rendering of content
JavaScript
For interactivity and automation
TrueType and WOFF
To provide font support beyond the minimal base set that reading systems typically have available
SSML/PLS/CSS3 Speech
For improved text-to-speech rendering
SMIL3
For synchronizing text and audio playback
RDF vocabularies
For embedding semantic information about the publication and content
XML
A number of specialized grammars facilitate the discovery and processing aspects of EPUBs
ZIP
To wrap all the resources up into a single file

You’ll learn more about how to use all of these technologies as you progress through the chapters.

The EPUB format is specifically designed to be free and open for anyone to use without having to sift through a litany of patent encumbrances and restrictions. EPUB’s widespread adoption has been due in no small part to the fact that basic text editing tools can be used to create publications, and the EPUB 3 revision of the specification has not deviated from this core tenet.

But that’s really all there is to an EPUB file under the hood. If you feel comfortable with the concept of an EPUB as a predictable, discoverable container of your content, you’re ready to begin tackling the best practices.
