Chapter 2. Building a Better EPUB: Fundamental Accessibility

Semantic Inflection

I’m not going to rehash the reasons for semantic markup, but I have intentionally held off on the specifics of how semantics are added in EPUB 3 until now, so as not to confuse the need with the technical details.

Adding semantic information to elements is actually quite simple to do; EPUB 3 includes the epub:type attribute for this purpose. You can attach this attribute to any HTML5 element so long as you declare the epub namespace. The following example uses the attribute to indicate that a dl element represents a glossary:

<html … xmlns:epub="http://www.idpf.org/2007/ops">
    …
    <dl epub:type="glossary">
        <dt><dfn>Brimstone</dfn></dt>
        <dd>Sulphur; See <a href="#def-sulphur">Sulphur</a>.</dd>
    </dl>
    …
</html>

Whenever you use unprefixed values in the attribute (i.e., without a colon in the name), they must be defined in the EPUB 3 Structural Semantics Vocabulary. All other values require a defined prefix and are typically expected to be drawn from industry-standard vocabularies. In other words, you cannot add random values to this attribute, like you can with the class attribute.

You can, however, create your own prefix and use it to devise any semantics you want. Just don’t create custom semantics with the expectation that they will affect the accessibility or usability of your ebook: reading systems ignore any semantics they don’t understand and have no built-in processing for. If you can’t locate the semantics you need, it would be better to work with the IDPF or other interested groups to create a vocabulary that meets your needs, as you’re more likely to get reading system support that way.
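To make the prefix rules concrete, here is a minimal sketch of the kind of check a validation tool might run on an epub:type value. It is not how any particular tool works. The KNOWN_TERMS set is only a small illustrative subset of the vocabulary, the ex prefix and its example.com URI are hypothetical, and the sketch assumes prefixes are declared as whitespace-separated "name: URI" pairs, as in the epub:prefix attribute.

```python
# Illustrative subset only; the real vocabulary defines many more terms.
KNOWN_TERMS = {"glossary", "toc", "backmatter", "frontmatter",
               "bodymatter", "chapter", "footnote"}


def parse_prefixes(prefix_attr):
    """Turn 'ex: http://example.com/vocab#' into {'ex': 'http://example.com/vocab#'}."""
    parts = prefix_attr.split()
    # Declarations are assumed to come in (name:, URI) pairs.
    return {name.rstrip(":"): uri for name, uri in zip(parts[0::2], parts[1::2])}


def check_types(type_attr, declared_prefixes):
    """Return a list of problems found in a space-separated epub:type value."""
    problems = []
    for token in type_attr.split():
        if ":" in token:
            # Prefixed value: the prefix must have been declared.
            prefix = token.split(":", 1)[0]
            if prefix not in declared_prefixes:
                problems.append(f"undeclared prefix in '{token}'")
        elif token not in KNOWN_TERMS:
            # Unprefixed value: must come from the vocabulary.
            problems.append(f"'{token}' is not in the vocabulary subset")
    return problems


prefixes = parse_prefixes("ex: http://example.com/vocab#")  # hypothetical prefix
print(check_types("glossary ex:recipe", prefixes))
print(check_types("sparkly foo:bar", prefixes))
```

Passing this kind of check only means a value is well formed. As noted above, a reading system is still free to ignore a custom semantic such as ex:recipe.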

The attribute is not limited to defining a single semantic, either. You can include a space-separated list of all the applicable semantics in the attribute.

A section, for example, often may have more than one semantic associated with it:

<section epub:type="toc backmatter">
    …
</section>

The order in which you add semantics to the attribute does not imply importance or affect accessibility, so the values in the above example could just as meaningfully have been reversed.
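One way to picture this order-independence is to treat the attribute value as a set of tokens rather than a sequence. The following is a minimal sketch of that idea. The semantics helper is a hypothetical name, not part of any EPUB toolkit.

```python
def semantics(type_attr):
    """Split an epub:type value on whitespace and discard the ordering."""
    return set(type_attr.split())


# Both orderings yield the same set of semantics.
print(semantics("toc backmatter") == semantics("backmatter toc"))
```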

You should also be aware that this attribute is only available to augment structures; it is not intended for semantic enrichment of your content. For example, it is not yet possible to mark up the personal information about an individual in a book so that a complete picture of their life can be built by querying metadata. The metadata landscape was considered too unstable to pick a method for enriching data, but look for a future revision to include this ability, whether via RDFa, microdata, or another method.

And in case it needs repeating, semantics are not just an exercise in labeling elements. As I discussed in the introduction to this section, these semantics are what enable intelligent reading experiences. If you had 25 definition lists in an ebook, each with a particular use, how would a reading system determine which one represents the glossary if you didn’t have a semantic applied as in the first example? A reading system that knows which list is the glossary can provide fast term lookups. The easier you make it for machines to analyze and process your data, the more valuable it becomes.
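As a rough illustration of that last point, the sketch below shows how a program might pick the glossary out of several definition lists and build a term lookup from it. It assumes well-formed XHTML and uses Python's standard xml.etree.ElementTree module. The sample document and the find_glossaries and build_lookup helpers are invented for this example and do not describe how any actual reading system is implemented.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"
EPUB_TYPE = f"{{{EPUB_NS}}}type"  # ElementTree spells epub:type as {namespace}type

# Hypothetical content: two definition lists, only one of them a glossary.
DOC = f"""<html xmlns="{XHTML_NS}" xmlns:epub="{EPUB_NS}">
  <body>
    <dl>
      <dt>Author</dt>
      <dd>Unknown</dd>
    </dl>
    <dl epub:type="glossary">
      <dt><dfn>Brimstone</dfn></dt>
      <dd>Sulphur</dd>
    </dl>
  </body>
</html>"""


def find_glossaries(root):
    """Return every dl whose epub:type list includes 'glossary'."""
    return [dl for dl in root.iter(f"{{{XHTML_NS}}}dl")
            if "glossary" in dl.get(EPUB_TYPE, "").split()]


def build_lookup(dl):
    """Map each lowercased term to the text of the definition that follows it."""
    lookup = {}
    term = None
    for child in dl:
        if child.tag == f"{{{XHTML_NS}}}dt":
            term = "".join(child.itertext()).strip()
        elif child.tag == f"{{{XHTML_NS}}}dd" and term is not None:
            lookup[term.lower()] = "".join(child.itertext()).strip()
    return lookup


root = ET.fromstring(DOC)
glossaries = find_glossaries(root)
lookup = build_lookup(glossaries[0])
print(lookup.get("brimstone"))
```

Without the epub:type value, the two lists would be indistinguishable to this code, and it would have to guess which one holds the glossary.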