Sunday, January 29, 2012

How To Create Fixed-Layout iBooks, Part 6

We now come to the final file we will need to make our fixed-layout iBook complete... at least in terms of functionality, that is. But once you build this file you can load your ebook onto the iOS device of your choice and see it in action. And that's when the real work begins. For now we'll focus on the final puzzle piece required, the content.opf.

As its name suggests, the content file is a descriptive listing of the ebook's physical contents, that is, the actual component parts included in the archive, not its literary or artistic content. OPF stands for Open Packaging Format, the specification that defines the structure and semantics of the package. There are four essential elements that make up the content.opf file, each of which we'll take in turn.

First, however, you will notice that a new declaration element has replaced the previous !doctype and html namespace references:
<package xmlns="http://www.idpf.org/2007/opf"
       unique-identifier="book-id" version="2.0">
This provides a reference to the official opf spec at the International Digital Publishing Forum website. Those are the good folks who have been piecing this thing together over the years and working diligently to keep it up to date (not an easy task at the rate technology is changing).

The second attribute in this declaration is the unique-identifier, which you should recall we referenced in the toc.ncx metadata section. The dtb:uid entry there will show up again in a moment, linking it with the reference here. The "book-id" value is sometimes written as "BookId" or some such instead. It's just a reference, so it can really be anything you like, so long as it matches the id of the identifier we enter below.
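To make that link concrete, here is a minimal sketch of how the two ends connect. The identifier value is an assumed placeholder, and everything else that belongs in the package (the rest of the metadata, the manifest, and the spine) is left out for brevity:

```xml
<package xmlns="http://www.idpf.org/2007/opf"
       unique-identifier="book-id" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:opf="http://www.idpf.org/2007/opf">
    <!-- id="book-id" matches unique-identifier on the package element -->
    <dc:identifier id="book-id">placeholder-unique-string</dc:identifier>
  </metadata>
</package>
```

If you rename one "book-id" you must rename the other, since the reference only works when the two strings are identical.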

1. Metadata

The first section is where the metadata for your ebook content lives - and this time I do mean artistic and literary content. This section can be just a few short lines giving the bare essentials of title, language, and identifier (the only ones technically required), or it can include a host of other information that can be useful for identifying and cataloging your title. In this, more is always better, as individual systems can ignore non-relevant portions, but cannot make them up if not provided.

Before we get to the metadata proper, however, we must declare our reference systems, of which we'll be using two:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
       xmlns:opf="http://www.idpf.org/2007/opf">
Dublin Core (dc) is the primary set of metadata elements in use in ebooks, but the opf spec adds some specifics that are useful for fine-tuning our statements, so we declare both here, even though we've already referenced the opf spec above. The general practice is to state the dc element followed by an opf specifier. So, for example, you might include this line of metadata:
<dc:identifier opf:scheme="ISBN">ISBN#</dc:identifier>
where the generic Dublin Core identifier is defined as an ISBN via the opf:scheme. You would, of course, replace the temp ISBN# entry with your own actual data, assuming you have one. An ISBN is required in order to upload directly to the iBookstore, but many aggregators will supply them for you for a fee.

<identifiers>

Other identifiers can be used instead of, or in addition to, an ISBN, and can be almost any unique string of data, such as a website URL or UUID. I've included the "book-id" identifier in the sample template, but here is where you would insert the UUID mentioned previously in the toc.ncx lesson. The idea is just to include some string of data which is unique to this specific incarnation of your work, and ideally includes some type of version number, since ebooks don't often have specific editions as print books do. Each time you update the ebook's content, however trivial, you should alter this data string, preferably in a logical and expressive way that will allow users (i.e. collectors of your awesome body of brilliant work) to identify the particular version they are holding. The UUID is particularly useful in that a time-based (version 1) UUID can be decoded to discover the exact moment and machine of creation, but looks like utter gibberish otherwise.
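As a rough sketch of how this might look in practice, the lines below pair a UUID used as the book's unique identifier with a secondary ISBN. The UUID value is an arbitrary sample rather than one taken from the template, and ISBN# remains a placeholder as before:

```xml
<!-- Primary identifier: its id matches unique-identifier on the package
     element, and its value should match the dtb:uid entry in toc.ncx -->
<dc:identifier id="book-id" opf:scheme="UUID">urn:uuid:123e4567-e89b-12d3-a456-426614174000</dc:identifier>
<!-- Optional secondary identifier -->
<dc:identifier opf:scheme="ISBN">ISBN#</dc:identifier>
```

Only one identifier carries the id that the package element points to; any others are simply extra information for catalogs.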

<title> <creator> <publisher>

You are required to include a title, for obvious reasons, but not an author, as works can be anonymous. The dc:creator element has a wide range of specifiers, including such functions as author, illustrator, editor, etc., and there can be multiples of each. These can be added using either the id tag or an opf specifier, or both. There's also a publisher element for that entity, and you can add generic contributor elements as well for any others whose roles remain unspecified. I won't go into all the possibilities, as you can visit the referenced websites in the declaration above for complete listings of your options.

I will mention here, however, the opf:file-as tag, as this allows you to specify how you want your work to be listed in catalogs, which is generally last name first. If you leave this out your book will be listed under your first name in iBooks, which is totally lame and cries out amateur.
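For illustration only, a set of entries along these lines might look like the following. All names and the title are made up, and the role codes shown (aut for author, ill for illustrator) are the usual three-letter relator codes; check the referenced specs for the full list before relying on any others:

```xml
<dc:title>My Picture Book</dc:title>
<!-- opf:role says what the person did; opf:file-as gives the sort name -->
<dc:creator opf:role="aut" opf:file-as="Doe, Jane">Jane Doe</dc:creator>
<dc:creator opf:role="ill" opf:file-as="Roe, Richard">Richard Roe</dc:creator>
<!-- A generic contributor with no role specified -->
<dc:contributor>Alex Smith</dc:contributor>
<dc:publisher>Example Press</dc:publisher>
```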

<date>

There are several opf events you can use for your date entry, including date of creation, copyright, and publication, of which you can include one, all, or none. Your choice. You can also just include the year and leave off the month and day if you like, but if you include the month and day you must use the year-month-day format of YYYY-MM-DD.
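A short sketch, with invented dates, showing both a year-only entry and a full year-month-day entry:

```xml
<!-- Year only is acceptable -->
<dc:date opf:event="creation">2011</dc:date>
<!-- A full date must be written year-month-day -->
<dc:date opf:event="publication">2012-01-29</dc:date>
```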

<rights>

A statement of your rights is allowed, and typically just states All Rights Reserved (or creative commons, public domain, or whatnot). You can be as specific or general as you want here. In all of these entries, by the way, any of the id="en_whatever" tags can be included or removed. I have included these to show where you would add specific entries for different territories in which you plan to distribute your work, for example if you're doing translations into different languages or reserving/selling specific rights in different countries. In most cases, if you're getting this involved you'll want to consult a literary agent or legal representative who knows publishing law.

<language>

The third and final element you are required to include is a language tag, and this should employ the standard RFC 3066 language identifiers, using either the base two-letter code, or that plus a secondary subtag. So, for example, English can be either plain en, or en-US for United States dialects, or en-GB for Great Britain strains, or any of a host of others. Generally just the base language is all that anyone needs, but you never know. You can find the codes online.

<type> <subject>

Using the type element you can enter category data such as whether the work is fiction, non-fiction, poetry, etc., and/or a specific genre or classification, such as historical fiction or art history. Subject, on the other hand, is where you would add Library of Congress headings, or other subject info, such as those used by Amazon to categorize their entries. One reason you're adding all this extra information is precisely so that retailers can add it to their product pages. Adding it here facilitates the quick and accurate transfer of metadata concerning your work, and this is your chance to make sure it's right.
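By way of example, and with values that are purely illustrative rather than drawn from any official list, such entries might read:

```xml
<!-- Broad category of the work -->
<dc:type>Fiction</dc:type>
<!-- One subject element per heading; repeat as needed -->
<dc:subject>Historical fiction</dc:subject>
<dc:subject>Picture books for children</dc:subject>
```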

<description>

This is where you would place the back jacket blurb or other descriptive content that tells the reader what the book is about. Anything you might desire a potential customer to know could go here, including reviews, extracts, or a general description such as what you would read on any book page. Give it some thought as it will show up all over the Internet on every ebook retailer, and once it's there it's there for good.

<format>

One other tag you might include is format (iBooks in this case, but ePub or Kindle or whatever elsewhere). This might seem redundant since you've got the ebook right here in front of you, but not everyone reading this data will, and it's one more way this specific iteration of your work can be identified. For example, a library may be looking at a metadata listing in search of a particular format to include in their catalog, and other general ebook retailers will want to identify the format for their customers before selling it to them via whatever systems are put in place down the line as ebooks become more common.

2. Manifest

Just like a shipping invoice, the manifest lists all the items included in the package. Every file in the OEBPS folder must be listed here, with the exception of the content.opf itself. In addition, any files in the META-INF folder are excluded, since in order to get to the manifest the system will already have employed those files.

Each item gets its own entry, starting with an item id, which is a descriptive name of your choice. This is followed by a media-type and an href which gives the file's location. Either can come first, but both must be included.

The media-type tells the system what kind of file the item is, and must be correct for the item to function (the file extension isn't enough in itself, apparently). I have provided the main media types you might use, although for images you can also have image/png or image/tiff files. Note that all html files are listed as application/xhtml+xml, regardless of what extension you use for the actual file itself (.xhtml, .html, .xml, etc.). You can also use OpenType and SVG fonts in addition to TrueType.
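Here is a small sketch of what a manifest for a two-page book could look like. The file names, folder layout, and item ids are all assumptions made for the example, so substitute whatever your own OEBPS folder actually contains:

```xml
<manifest>
  <!-- The navigation file built in the toc.ncx lesson -->
  <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  <item id="css" href="css/stylesheet.css" media-type="text/css"/>
  <!-- Every html page is application/xhtml+xml, whatever its extension -->
  <item id="page01" href="page01.xhtml" media-type="application/xhtml+xml"/>
  <item id="page02" href="page02.xhtml" media-type="application/xhtml+xml"/>
  <!-- Images take the media type that matches their real format -->
  <item id="img01" href="images/page01.jpg" media-type="image/jpeg"/>
  <item id="img02" href="images/page02.png" media-type="image/png"/>
</manifest>
```

Each id must be unique within the file, and each href is written relative to the location of the content.opf.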

3. Spine

The spine is a linear listing of the ebook contents in the order they will be presented, just as the pages in a print book are attached in specific order to the spine (hence, the name). Here you enter each html page you create using the item id you specified in the manifest above, as such:
<itemref idref="item1"/>
where the idref is equal to the item id in the manifest. Only the html pages themselves need be entered, and not their component parts (i.e. css, images, etc.). Just list them all in the order you want the reader to see them.
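Put together, a spine for a short book might look something like this. The ids cover, page01, and page02 are hypothetical and stand for whatever item ids you gave your html pages in the manifest. The toc attribute is not discussed in this post; in the EPUB 2 package format it names the manifest id of the toc.ncx file, assumed here to be "ncx":

```xml
<spine toc="ncx">
  <!-- Reading order, first page to last -->
  <itemref idref="cover"/>
  <itemref idref="page01"/>
  <itemref idref="page02"/>
</spine>
```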

4. Guide

The iBooks asset guide states that this section is required, but I've seen iBooks work just fine without it. For fixed-layout ebooks it's altogether an irrelevant set of data. In standard ebooks this is where the drop down menu items are entered, but in fixed-layout iBooks that function is replaced by a set of thumbnails along the bottom and a drop down menu that leads you to a thumbnail grid. However, for the sake of completeness, I'll describe it anyway, as it might come into play in subsequent updates down the line (the Kindle KF8 format still employs it, for example).

Whereas the spine is a logical listing of every page in the ebook, the guide is a list of major waypoints along the way, such as chapters, appendices, table of contents, etc. As mentioned earlier, this is equivalent to the NavMap in the toc.ncx, and should essentially mirror it, although the NavMap can contain much greater detail since it allows page anchors. The guide, by contrast, can only handle major page divisions, and if included should have at minimum an entry for the cover (type="cover"), table of contents (type="toc"), and the first page of text (type="text"), as these three type tags have some built in functionality in ebooks: text, for example, shows up in the drop down menu in Kindle ebooks as "Go To Beginning", and the toc tag on either platform takes you to the html contents page. Additional entries for an index or appendices prove useful where this works, so if for some reason you decide to make a standard reflowable ebook, now you know what this section's for and how to use it. However, for fixed-layout you can just as easily leave it out.
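A minimal guide carrying just those three entries could be sketched as follows. The href values and titles are hypothetical and should point at your own cover, contents, and first text pages:

```xml
<guide>
  <reference type="cover" title="Cover" href="cover.xhtml"/>
  <reference type="toc" title="Table of Contents" href="contents.xhtml"/>
  <reference type="text" title="Beginning" href="page01.xhtml"/>
</guide>
```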

TEST YOUR EBOOK!

You now have enough content and supporting files to load your ebook into iBooks via iTunes and give it a test run. If anything goes haywire you'll generally get an error message on the relevant page giving you a line and column reference to the offending element. Of course, if it doesn't load at all you'll have to backtrack and work out your error. I've tried to make this as easy as possible by providing a working template into which you can simply add your own content.

Of course, all you have so far is a book of pictures, which is fine if you're making a photo album, but you'll likely want to add some text even to that, so in the next installment I'll discuss embedding fonts and adding active text to your growing book.

14 comments:

  1. Thank you for this information!

  2. Thanks for sharing your work. This is great. I am creating my ebook with full bleed images and text, and so far I am on the tenth page. Now I am wondering how can I take this to the next level by adding audio. I'll try to add a code inside each page to trigger an mp3 file.

  3. Glad to hear the info has been useful. You can embed m4a or mp3 audio into your HTML pages with the tag:

    <audio src="audio/filename.mp3" controls="controls"></audio>

    This will insert a mini audio player into your page at the insertion point. Be sure to include your audio files in the manifest with the media type specified, such as:

    <item id="audio1" href="audio/filename.mp3" media-type="audio/mpeg"/>

    You can also create a background soundtrack that plays automatically by inserting the code:

    epub:type="ibooks:soundtrack"

    into your audio tag between the words "audio" and "src". This will add a switch to the menu that allows the user to turn the feature on or off. I'll try to do a detailed post about this at some point, as there are a lot of variable options, but for now you can find out more by getting Liz Castro's "Read Aloud" miniguide, which covers the subject and comes with a really nice sample ebook file.

    Best of luck, and keep me posted on your project.

    Replies
    1. Dear Scot!

      I use the soundtrack feature in my epub. The downside of this is that when I want to play another sound on the page, the soundtrack stops. After the other sound has finished playing, the soundtrack won't resume until the page is flipped. I can't reach the soundtrack with document.getElementById as it pops out of the DOM, so I couldn't find a way to resume playing. I couldn't figure out how to play two audio files simultaneously, either. If one starts the other stops.

      Do you know if it is actually possible or not?

    2. B,
      It is possible to play both simultaneously, as I've seen it done, although I haven't actually done it myself. I'll do some tests and see what I can figure out. I've been meaning to add some audio to my project, so now might be a good time to try it out.

    3. Dear Scot!

      It turned out that when there is an SMIL file, iBooks can play the soundtrack, SMIL data, and a simple audio tag simultaneously. Hope this will save you hours of experimenting.

      Vili (I was that B before...)

    4. Ah! That is good to know. Thanks for sharing your discovery. It will indeed save me a lot of time.

  4. Dear Scott
    Many thanks for sharing this information.
    I am a mere author of children's books. I originally wrote them in MS Word. These books are illustrated, with an image on every other page.
    I have also saved them in pdf format.
    I tried converting my docs into epubs using epubmaker but the results were disastrous. Apple's ibookstore won't accept any of those epubs.

    I don't want to start from scratch with each of my children's books. Is there a quick and dirty way (but successful) of converting my docs to epubs, without hiring a conversion service?
    I can open my books in MS Word and would like my ebooks to look like the MS word counterpart.
    Just FYI, i have about 180 children's books to convert in 9 languages, and I am trying to find a workable solution for me.
    Any help you can give me will be appreciated.
    Thanks
    bkdesynr at gmail

    Replies
    1. Your best bet for doing bulk conversions is to use Calibre (http://calibre-ebook.com). It doesn't accept .doc files directly, but you can save your Word file to HTML (Filtered), which is what EPUB is based on. From there you can import the files into Calibre and convert them individually or in bulk into a dozen different formats.

      The conversion isn't perfect, and depends a lot on how clean your Word formatting is (using consistent Styles for headers and sections, for example). Look at the Smashwords Style Guide for a thorough introduction into how to create really clean Word files.

      Even so, you will probably need to massage your epub files manually a bit in order to fix any anomalies that creep in. For this, you can use either a line-numbered text editor such as Notepad++, a webpage program like Dreamweaver, or an ebook editor such as Sigil (http://code.google.com/p/sigil/).

      If your input file is good you shouldn't have to do much for a children's book, depending on what you're doing with your images. If the html you output from Word looks good then you should be fine. Good luck!

  5. Hello Scott,

    Thank you for publishing your posts on Fixed Layouts; they have been really, really useful. The step-by-step detail is a godsend for newbies like me.

    After following your posts, I managed to get my first epub file validated and working. However now (after making some content changes that really shouldn't have had any impact on the broader file) I get an error informing me that my container.xml file is missing (it isn't missing; it's where it should be, and hasn't changed).

    I'm searching through your blog trying to find if you've covered the actual zip to epub conversion part? Best I can tell there seem to be issues surrounding the order in which files are added to the zip file. While I have just bought the latest (2.5 / HD) iPad, I'm using a PC for zipping, and nothing that I've found when Googling for a solution has worked (for me).

    Have you covered this area? If not, any pointer(s) no matter how small would be very much appreciated. If this is a bit off-topic for you, then no problem ...as you've already been a massive help.

    Best wishes - David

    Replies
    1. Hello David. I cover zip compression to some degree in part 2, but not extensively. It has been my experience that once you get the basic file structure in place and functioning it's best to work inside the archive. Using 7-Zip you can make changes to the files within the zip, and add new files with drag and drop, rather than unzipping and zipping the whole archive again, which I understand can sometimes cause problems. I've always worked inside the archive, or simply dragged new files into it, and have never had any problems. Since your file structure and names haven't changed, my guess is you've got a compression error and you'll need to backtrack to the working file and add your revised content to that without rezipping it.

    2. Also I should mention that if you're replacing a file inside an archive with a revised version be sure to delete the old one first before adding the new one, as overwriting compressed files can glitch the data.

  6. Hello Scot (just the one t this time)...

    Thanks for such a quick reply. Yes it's there in part two (thank you). I followed your instructions when I was trying to get my earlier fixed layout file to work (which it did). I started using 7-zip today, but that may be after the horse has bolted.

    Thank you for the pointer. I'll start back-tracking, which probably means re-doing the content alts. I know this probably seems easy to a seasoned ePubber like you, and a million coders out there ...but for something that should be reasonably simple, it's pretty contrived and pernickety. I wish the likes of Adobe would get their act together with InDesign, and deliver a professional tool.

    Thanks - David

  7. Ahh... no, you see I didn't do that either Scot. Guess I'd better take stock, and try to think like a bit of capricious code.
