EPUB: NEW OPEN STANDARD IN E-PUBLISHING

STUDENT NAME:

Blazej Blazejewski

STUDENT NUMBER:

04-215-125

COURSE NAME:

E-Business

DEPARTMENT:

DIUF

SUPERVISORS:

Andreas Meier / Luis Terán

DATE OF SUBMISSION:

17 05 2011

ABSTRACT

The aim of this paper is to analyze the EPUB file format used to produce and distribute e-books. The paper is divided into two parts. The first part is devoted to the technical aspects of EPUB files. After a brief historical introduction, the current 2.0.1 EPUB specification is discussed thoroughly, and a fictional, simplified EPUB file is described in detail to provide a practical example. The future 3.0 EPUB specification is then briefly presented, and the first part closes with a critical evaluation of the EPUB file format. The second part places the EPUB file format in the broader economic perspective of the e-publishing value chain. It therefore starts with a description of e-publishing and the e-publishing value chain. This introduction is followed by a presentation of the advantages of EPUB files over traditional printed books, with a special focus on production, distribution and pricing. Finally, the second part compares the EPUB file format with other popular file formats used in e-publishing.

Keywords: EPUB, PDF, e-books, publishing, e-publishing


CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
FOREWORD
INTRODUCTION
  PROBLEM STATEMENT
  OBJECTIVES
  OUTLINE
CHAPTER ONE
  EPUB HISTORY
  CURRENT 2.0.1 EPUB SPECIFICATION
    Preliminary notes
    General structure
    MIME type
    META-INF directory
    Container file
    OPF package
    NCX file
[...]

Figure III. The content of the container.xml file.

OPF package

The OPF package file is simply an XML file constructed according to the OPF specification. By convention, however, it is given the .opf file extension in order to distinguish it easily from the other files in an EPUB file. The content.opf file serves four specific purposes. First of all, the OPF package file contains meta[...]

[Listing of content.opf; the markup was lost in extraction. Surviving fragments: the attributes unique-identifier="bookid", xmlns="http://www.idpf.org/2007/opf" and xmlns:dc="http://purl.org/dc/elements/1.1/", and the value 123myuniqueidentifier321.]

Figure IV. The content of the content.opf file.
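Since the listing itself did not survive, the following Python sketch shows how a minimal package file of this kind could be generated. It is an illustration under stated assumptions, not the author's original file: the function name build_opf, the manifest ids and the language code "en" are hypothetical, whereas the identifier value, the bookid attribute, the two namespaces and the file names (title.html, chapter1.html, stylesheet.css, toc.ncx) are taken from this chapter. The metadata, manifest and spine elements follow the OPF 2.0 specification.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_opf(book_id, title, language, items, spine_ids, ncx_id="ncx"):
    """Return a minimal OPF 2.0 package document as a string.

    items: (id, href, media-type) triples listed in the manifest.
    spine_ids: manifest ids of the content documents, in reading order.
    """
    # Serialize OPF elements without a prefix and Dublin Core elements as dc:*.
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)

    package = ET.Element(
        f"{{{OPF_NS}}}package",
        {"version": "2.0", "unique-identifier": "bookid"},
    )

    # Metadata: identifier, title and language of the publication.
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")
    identifier = ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "bookid"})
    identifier.text = book_id
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    ET.SubElement(metadata, f"{{{DC_NS}}}language").text = language

    # Manifest: every file that belongs to the e-book.
    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    for item_id, href, media_type in items:
        ET.SubElement(
            manifest,
            f"{{{OPF_NS}}}item",
            {"id": item_id, "href": href, "media-type": media_type},
        )

    # Spine: linear reading order; the toc attribute points to the NCX item.
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine", {"toc": ncx_id})
    for item_id in spine_ids:
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})

    return ET.tostring(package, encoding="unicode")


opf = build_opf(
    book_id="123myuniqueidentifier321",
    title="Hello World",
    language="en",  # assumption: the language of the example is not stated
    items=[
        ("ncx", "toc.ncx", "application/x-dtbncx+xml"),
        ("title", "title.html", "application/xhtml+xml"),
        ("chapter1", "chapter1.html", "application/xhtml+xml"),
        ("css", "stylesheet.css", "text/css"),
    ],
    spine_ids=["title", "chapter1"],
)
```

The resulting string can be written to content.opf with UTF-8 encoding.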

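The content of the content.opf file is presented in Figure IV. The meta[...]

[Listing of toc.ncx; the markup was lost in extraction. Surviving fragments: the attribute version="2005-1" and the text values Hello World, Title and Chapter 1.]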

Figure V. The content of the toc.ncx file.

Figure V presents the content of the toc.ncx file of the Hello World e-book. The NCX file is composed only of the three obligatory parts. The table of contents has one hierarchical level with two navigation points. The first points to the title page of the book and the second points to chapter 1 of the book.
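The structure just described can be sketched in Python as follows. This is a hedged reconstruction rather than the original listing: the function name build_ncx, the navPoint ids and the head metadata values are assumptions, while the version attribute, the labels "Title" and "Chapter 1" and the target files come from the text. The three obligatory parts are taken here to be the head, docTitle and navMap elements of the NCX format.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def build_ncx(book_id, title, nav_points):
    """Return a one-level NCX table of contents as a string.

    nav_points: (label, src) pairs in reading order.
    """
    ET.register_namespace("", NCX_NS)

    def q(tag):
        # Qualify a tag name with the NCX namespace.
        return f"{{{NCX_NS}}}{tag}"

    ncx = ET.Element(q("ncx"), {"version": "2005-1"})

    # Part 1: head, whose dtb:uid must match the identifier in the OPF file.
    head = ET.SubElement(ncx, q("head"))
    for name, content in (
        ("dtb:uid", book_id),
        ("dtb:depth", "1"),  # one hierarchical level
        ("dtb:totalPageCount", "0"),
        ("dtb:maxPageNumber", "0"),
    ):
        ET.SubElement(head, q("meta"), {"name": name, "content": content})

    # Part 2: docTitle, the title of the publication.
    doc_title = ET.SubElement(ncx, q("docTitle"))
    ET.SubElement(doc_title, q("text")).text = title

    # Part 3: navMap, one navPoint per entry of the table of contents.
    nav_map = ET.SubElement(ncx, q("navMap"))
    for order, (label, src) in enumerate(nav_points, start=1):
        point = ET.SubElement(
            nav_map,
            q("navPoint"),
            {"id": f"navpoint-{order}", "playOrder": str(order)},
        )
        nav_label = ET.SubElement(point, q("navLabel"))
        ET.SubElement(nav_label, q("text")).text = label
        ET.SubElement(point, q("content"), {"src": src})

    return ET.tostring(ncx, encoding="unicode")


ncx = build_ncx(
    book_id="123myuniqueidentifier321",
    title="Hello World",
    nav_points=[("Title", "title.html"), ("Chapter 1", "chapter1.html")],
)
```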

[Listing of title.html; the markup was lost in extraction. Surviving text: Hello World: title; Hello World by Blazej Blazejewski.]

Figure VI. The content of the title.html file.

Figure VI presents the content of the title.html data file. It can be seen that the actual content of the Hello World e-book is very much like a web page containing the text of the book. The content of the chapter1.html and stylesheet.css files is not presented, as these two files are simply typical XHTML and CSS files.

ZIP file

Once the content of an EPUB e-book is prepared, all the data and metadata files should be compressed into a ZIP file. The ZIP data compression format was originally created in 1989 by Phil Katz and has subsequently become a common standard for data compression [PKWARE 2011].


The OCF specification includes some special rules concerning the packaging of the EPUB e-book into a ZIP package. The following two rules are the most important. Firstly, the mimetype file should be the first file in the ZIP package and must not be compressed. Secondly, the ZIP file itself must not be encrypted (encrypting the content of the EPUB file, however, is possible and can be declared in an encryption.xml file in the META-INF directory).

When the ZIP file has finally been created, all that remains is to change the extension of the file from ZIP to EPUB. The EPUB e-book is now ready and can be read by EPUB-compatible e-book readers.
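The packaging rules above can also be applied programmatically. The following Python sketch uses the standard zipfile module; the function name write_epub is hypothetical, and the container.xml content assumes that content.opf sits at the root of the package, which the surviving text does not confirm. Writing directly to a file name ending in .epub makes the renaming step unnecessary.

```python
import zipfile

# Assumption: content.opf is stored at the root of the package.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def write_epub(target, files):
    """Package an e-book into an EPUB (ZIP) archive.

    target: a file name such as "hello.epub", or a writable binary file object.
    files: mapping of archive path to file content (str or bytes).
    """
    with zipfile.ZipFile(target, "w") as archive:
        # Rule 1: the mimetype file comes first and is stored uncompressed.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        # All remaining files may be compressed as usual.
        archive.writestr(
            "META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED
        )
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    # Rule 2 is satisfied implicitly: no archive-level encryption is applied.
```

A call such as write_epub("hello.epub", {"content.opf": opf_text, "toc.ncx": ncx_text, "title.html": title_text}) would then produce the finished file, where the three variables stand for the file contents prepared beforehand.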

Final comments

The example used in this chapter describes the creation of an EPUB file by manually editing each file that composes the final EPUB e-book. This is the most straightforward approach, requiring only a text editor and an archiving utility that allows adding uncompressed files to a ZIP package. On a Windows machine, the standard Notepad application is quite sufficient to prepare the content of an EPUB e-book. The built-in Windows archiving utility, by contrast, is of no use, as it does not allow adding uncompressed files to ZIP files. A more advanced archiving utility is required, such as the freeware 7-Zip utility.

Manually editing EPUB files can become tedious as the size of the book increases. In this case, dedicated EPUB creation utilities can be used to create EPUB files automatically from other data files. A popular choice is the freeware Calibre software, which can convert between many different data formats. In an enterprise environment, the professional-grade Adobe InDesign software allows exporting documents to EPUB files.

Future 3.0 EPUB specification

While the 2.0.1 EPUB specification was released in September 2010, the original 2.0 EPUB version dates from as early as 2007. In order to further improve and develop the EPUB file format, a new, third release of the EPUB specification is currently being prepared. Although the 3.0 version is still a work in progress, drafts of the future 3.0 EPUB specification are already available on the IDPF website (see [IDPF 2011]). This section gives a quick overview of the principal modifications that, according to the drafts, will be included in the new 3.0 EPUB specification.

Structural modifications

The new 3.0 EPUB standard will be based on four specifications. The Open Container Format 3.0, just like the current OCF, will define the general structure of EPUB files. The Publications 3.0 specification will succeed the current OPF and will specify the metadata organization of EPUB files. The Content Documents 3.0 specification will replace the current OPS and will define the actual data content of an EPUB file. Finally, the new Media Overlays 3.0 specification will be used to synchronize text data and audio data.

The new 3.0 EPUB specification will also abandon the use of NCX files as a means of providing in-document navigation information. The NCX solution will be replaced by EPUB Navigation Documents defined in the new Content Documents 3.0 specification.

Content modifications

The content of the 3.0 version of EPUB files will be based on the new HTML 5 web standard. As such, the new 3.0 EPUB files will be able to include audio and video content by means of the “audio” and “video” tags introduced in HTML 5. To facilitate the presentation of such multimedia content, the 3.0 EPUB draft specification allows the use of triggers and scripts to increase the possibilities for user interaction. Moreover, the 3.0 EPUB specification will support the Mathematical Markup Language (MathML). MathML is an open W3C standard (see [W3C 2011e]) used to express complex mathematical notation. This move should foster the use of EPUB files by the scientific community. On the other hand, the new EPUB specification will abandon support for the DTBook DAISY standard, as it overlaps with the possibilities provided by HTML 5.

EPUB format evaluation

The EPUB file format has some interesting characteristics that have contributed largely to its growing popularity. This last part of the first chapter provides a critical evaluation of these characteristics, pointing out the main strengths and weaknesses of the format.

Web standards

EPUB files are entirely based on well-known and successful web standards: XML, XHTML and CSS. This makes EPUB files easy to create and easy to handle, as they do not require learning any additional languages.

Open standard

The EPUB standard is an open standard developed by an independent organization. The complete specification of EPUB files is publicly available online at no cost. Moreover, all the other standards used by EPUB (i.e. XML, XHTML, CSS and ZIP) are also public and free to use. This means that the EPUB standard can be employed by all interested companies or individuals, without any legal or economic obstacles.


Multiple specifications

The EPUB file format is formally governed by three separate specifications. Each specification defines a different aspect of the EPUB standard, but all three are heavily interlinked without any formal hierarchy. Even though it might be a good idea to regulate different matters in different documents, this separation makes the EPUB specification as a whole much more difficult for new or inexperienced users to grasp.

External specifications

Moreover, the EPUB specification relies in part on external specifications that were initially developed for other purposes. Again, while reusing existing specifications might sometimes be a good idea, these external references make the EPUB specification less clear and can cause inconsistencies with native EPUB rules. In this regard, the abandonment of the external NCX specification by the coming 3.0 EPUB version is a welcome improvement.

Hardware independent

The EPUB file format was not developed for a specific hardware device. On the contrary, it can be used on any computer, be it mobile or stationary, with any operating system. This makes the EPUB file format particularly versatile and portable. One single EPUB file is enough to make an e-book accessible to consumers with a variety of reading devices. In addition, the very same EPUB file can still be used when a consumer decides to change hardware device or operating system.

Reflowable content

As EPUB files were not designed for display on a specific hardware device, they can be displayed on a screen of any reasonable size. The text content of an EPUB file is reflowable, which means that it adapts itself to the size of the screen on which it is displayed. This is a very important feature, because it makes the very same EPUB file perfectly readable on a 20-inch widescreen desktop monitor, on a 10-inch tablet or on a 3-inch mobile phone.

Basic layout

However, this versatility comes at a price. To make the text fit different screen sizes, an EPUB file allows no pagination or precise page layout. Just like on web pages, the text is only divided into headings and paragraphs that are subsequently displayed within the space provided by the screen. This makes EPUB files practically useless for publications that require precise layout and image positioning, such as comics, albums and other richly illustrated books.

The actual rendering of the content of an EPUB file also depends on the software used by the reader. The same EPUB e-book on the same device might look quite different when viewed with two different software solutions. All this means that EPUB creators have only very limited control over layout.

Easy automation

Finally, as emphasized by [Daly 2011], EPUB files are not only user and reader friendly, but also creator and programmer friendly. Given that EPUB files rely on open, simple and well-known web standards, it is relatively easy for developers to create EPUB-specific software. Besides, the actual technical process of creating EPUB files can also be automated, for example when transforming other text files into EPUB e-books.
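As a rough illustration of how little code such a transformation requires, the following Python sketch turns plain text into an XHTML content document suitable for inclusion in an EPUB file. The function name text_to_xhtml and the convention that blank lines separate paragraphs are assumptions made for this example; the stylesheet name is the one used in the Hello World e-book.

```python
from html import escape

XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>{title}</title>
<link rel="stylesheet" type="text/css" href="stylesheet.css"/>
</head>
<body>
<h1>{title}</h1>
{body}
</body>
</html>
"""


def text_to_xhtml(title, text):
    """Convert plain text into an XHTML document, one <p> per paragraph.

    Paragraphs are assumed to be separated by blank lines.
    """
    # Collapse line breaks and repeated spaces inside each paragraph.
    paragraphs = [
        " ".join(block.split()) for block in text.split("\n\n") if block.strip()
    ]
    # Escape &, < and > so that the result remains well-formed XML.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return XHTML_TEMPLATE.format(title=escape(title), body=body)
```

Applied to every chapter of a manuscript, a function of this kind yields the data files of the e-book, which then only need to be listed in the OPF package and packed into the ZIP container.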

CHAPTER TWO

E-publishing

The first chapter of this paper presents the technical aspects of working with EPUB files. The aim of the second chapter is to put the usage of EPUB files into a somewhat broader economic context, showing the place of EPUB files in the e-publishing domain.

Before analyzing the actual role of EPUB files in practical business conditions, it is necessary to begin with an explanation of what e-publishing actually is. According to the online version of the Longman Dictionary of Contemporary English [LDOCE 2011], the noun "publishing" describes the business of producing books and magazines. Still according to [LDOCE 2011], "desktop publishing" is the work of arranging the writing and pictures for a magazine or a book using a computer and special software, whereas "e-publishing" is the business of producing books or magazines that are designed to be read using a computer.

These linguistic definitions require further commentary. While it would be difficult to modify the definition of the publishing industry itself, the sharp division between desktop publishing and e-publishing is somewhat more problematic. It is certainly true that computers were first used in the publishing industry simply to prepare text for the subsequent printing process (see [Spring 1991, p. 42, 44]). It is also true that the digital display of longer text documents probably evolved from web services and is not directly linked to the software used to prepare printed books. However, modern publishing and text processing software can be used for both purposes, which renders the strict distinction between desktop publishing and e-publishing somewhat artificial.

For this reason, and in accordance with [Spring 1991, p. 50], this paper will assume that e-publishing covers the whole process of preparing a publication for distribution that involves computers and computer software, takes digital input and produces printed or digital output, depending on the wishes of the publishing house and / or the author.

E-publishing value chain

Following this working definition, e-publishing can be seen as a series of steps that begins with some digital text input and ends with a printed or electronic publication being delivered to the final customer. The whole publishing process consists of several consecutive actions, all aimed at providing the reader with the publication written by the author. At each step of the production process, value is added to the product, so that the final product can be sold to the customer at a price higher than the sum of the separate production inputs.

In economics in general, and in electronic business in particular, such a series of production steps increasing the value of the product is called a value chain. The notion of the value chain was introduced by Michael Porter and is currently used in a wide variety of markets to describe and analyze the production process of goods and services (see [Porter 1999, p.49; Hansen / Neumann 2005, p. 614, 615]). The value chain in the publishing market has already been analyzed by [Hansen / Neumann 2005, p. 616] and is depicted in Figure VII.


Figure VII. Publishing value chain according to [Hansen / Neumann 2005, p. 616].

It can clearly be seen that the publishing value chain in Figure VII only takes into account printed books sold to the reader through a series of intermediaries. In order to adapt this classic view to the modern e-publishing value chain, the diagram should also include e-books and Internet distribution.


Figure VIII. E-publishing value chain.

Figure VIII presents the publishing value chain modified to depict modern e-publishing. Instead of only one output, two separate outputs are possible. The author and the publishing house can produce a printed version of the book, an electronic version of the book (e-book), or both versions simultaneously. Moreover, two new distribution channels are available. Whereas classic bookstores can only sell printed books, online vendors can offer both printed books and e-books. Finally, the publishing house can run its own direct distribution, selling both printed and electronic books directly to final customers. This is so-called hybrid distribution, which uses both offline and online distribution channels (see [Meier / Stormer 2008, p. 134]).


In this modern e-publishing value chain, computers and software are used at virtually all production steps. But the most important function of computer software in e-publishing is to allow the transformation of the author's text input into a final output ready to be printed or to be displayed on the user's screen. From this point of view, the data flow in the e-publishing value chain can be divided into three main parts.

Figure IX. Data flow in e-publishing value chain.

Figure IX presents the data flow in the e-publishing value chain as well as the corresponding data formats and / or activities. In the Upstream part of the data flow, the input data from the author and from other sources is collected and stored for further processing. The input data usually comes in the form of word processing files or of data that has already been published and is reused for a new publication. In the Transformation part of the data flow, the input data is merged, enhanced and finally converted into the desired output. This output is then used in the Downstream part of the data flow, where it is delivered to the printer or directly to the final customer. Depending on the publishing house's choice, the data is transformed into a high-quality data file suitable for printing or into a data format that can be delivered directly to e-book readers.

EPUB files in e-publishing value chain

From the presentation of the data flow in the e-publishing value chain, it can clearly be seen that the EPUB file format belongs to the Downstream part of the data flow. This means that EPUB files are the result of converting the input data in the Transformation phase and are then immediately ready to be delivered to final customers.

Given the characteristics of EPUB files already presented in chapter one, it is now interesting to see how best to use the potential of EPUB files in this specific context of the e-publishing value chain. In order to make the advantages of EPUB files explicit, an in-depth analysis is presented with regard to production, distribution and pricing.


Production

Unlike printed books, an EPUB e-book is a so-called digital good [Hansen / Neumann 2005, p. 626]. As such, an EPUB file can be copied an unlimited number of times, each copy being identical to the original. From the production point of view, this means that the publishing house actually produces only one single EPUB file that can later be copied without limit, at no cost and in very little time. Economically speaking, the production of an additional EPUB e-book has no marginal cost at all: the production cost of ten EPUB copies is equal to the production cost of ten thousand copies. This is in strong contrast to the production of printed books, where the variable cost rises with the number of printed books and can eventually compromise the possible profit. The production planning of EPUB e-books is therefore much simpler than that of printed books, because only the fixed cost of preparing the first e-book needs to be taken into account.

Moreover, the quantity of books to be printed has to be decided in advance, based on an uncertain and imperfect forecast of future demand. This is known as make-to-stock production, where the whole quantity of books is printed in advance, stocked and subsequently sold to the readers (see [Meier / Stormer 2008, p. 138; Jacobs et al. 2009, p. 165]). Make-to-stock production has a serious disadvantage: the actual demand for the printed books can turn out to be drastically different from the expected demand, leading to shortages or to stocks of unsold books. In the case of EPUB e-books, the problem of production quantity does not even exist. The publishing house actually produces only the first e-book, and further copies are made on the fly at the very moment readers buy the e-book.

But the most impressive advantage of EPUB e-books is their flexibility and versatility when compared to printed books. As a matter of fact, each publication created as a printed book comes in one single variant. This means that all customers receive the same content, the only possible option being the choice between paperback and hardcover. This is because paper books have to be prepared and produced entirely in advance, in an attempt to match the quantity to an unpredictable demand. This obviously prevents the publishing house from preparing different content versions of the same book, as doing so would greatly complicate the production and quantity decisions.

As the variable cost and the quantity choice do not apply to EPUB e-books, EPUB files allow the creation of multiple variants of the same product without compromising the overall production costs. Coupled with the possible automation of the creation of EPUB files, this can lead to a highly customizable production process. In fact, it can easily be imagined that several different EPUB files could be made from the same input provided by the Upstream part of the e-publishing value chain. These versions could offer slightly different content and various functionalities; for example, the simplest version could offer only the plain text of the book, whereas the advanced version could also contain several indexes (names, places, keywords etc.) directly linked to the corresponding text fragments. In this manner, readers could easily choose the version that best suits their personal needs.

A further step would be to offer fully personalized EPUB e-book production on the client's demand. These customization possibilities could change the production process from the make-to-stock type to the make-to-order type, where a product can be uniquely designed to suit a single customer's needs (see [Meier / Stormer 2008, p. 138; Jacobs et al. 2009, p. 165]). This could be especially interesting in the case of compilations of existing texts and publications, individually chosen and arranged by the client and possibly automatically indexed and linked within one single EPUB file. This approach could take e-publishing from the mass production of uncustomized printed books to mass customization, giving each reader a unique e-book of his own.

Distribution

The classic distribution of printed books relies on a series of intermediaries between the publishing house and the final customer. Each intermediary has its own stock of books that has to be managed and replenished according to an uncertain future demand. Most intermediaries use the push replenishment method, i.e. they order a chosen number of books in advance, hoping that this quantity will correspond to future demand. In this manner, the intermediaries run both the risk of being left with unsold books and the risk of not being able to satisfy customers’ demand. An alternative solution is the pull replenishment method, in which a book is only ordered when it is actually bought by a customer (see [Jacobs et al. 2009, p. 404]). However, this second solution is not perfect either, because in this case it is the customer who bears the risk of waiting an excessive amount of time for the order to be completed.

In contrast, using EPUB files in the e-publishing value chain requires no inventory management at all. One single EPUB file stored by the publishing house can be copied and delivered via the Internet to any intermediary or directly to any customer, at any time and at no specific transportation cost. Consequently, the distribution of EPUB files requires no inventory investment and no specific demand forecasting.

Moreover, the distribution of EPUB files is independent of the geographical and transportation constraints present in the distribution of printed books. EPUB files can be downloaded via the Internet, made available to any market in the world and accessed by users from any place, without the need to physically transport a printed copy. Finally, the distribution system of EPUB files is extremely flexible, as it makes it possible to take full advantage of modern online distribution. For instance, publishing houses can opt for their own direct distribution by means of disintermediation, or can choose indirect distribution and provide their EPUB e-books to online aggregators (see [Meier / Stormer 2008, p. 31, 37]).

Pricing

Just like the distribution, the pricing of EPUB e-books has several advantages over that of traditional printed books. First of all, setting the price of an EPUB e-book is significantly easier because, as stated above, e-books have no marginal cost. Consequently, the price of an EPUB e-book should simply aim to maximize revenue, without taking the variable production cost into account. This is a welcome simplification in comparison with the pricing of printed books, where the interaction between the price, the demand and the marginal cost has to be taken into account explicitly.
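The argument can be made explicit with a simple profit function. The notation is an assumption introduced here for illustration and does not come from the cited sources: p denotes the price, D(p) the demand at that price, c the constant marginal cost of one copy and F the fixed cost of preparing the publication.

```latex
\begin{align*}
\pi_{\text{print}}(p) &= (p - c)\,D(p) - F, \\
\pi_{\text{EPUB}}(p)  &= p\,D(p) - F \qquad (c = 0), \\
\arg\max_{p}\ \pi_{\text{EPUB}}(p) &= \arg\max_{p}\ p\,D(p).
\end{align*}
```

Since F does not depend on p, maximizing the profit of an EPUB e-book amounts to maximizing the revenue p D(p). For a printed book the term c D(p) remains, so the optimal price also depends on the marginal cost.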

Furthermore, the possibility of producing different versions of one book allows the use of advanced pricing policies known as price differentiation. According to [Phillips 2005; Meier / Stormer 2008, p. 51], price differentiation consists of setting different prices for different versions of the same product in order to appeal to several customer segments with different price sensitivity and different willingness to pay. This approach has advantages for both publishing houses and e-book buyers. On the one hand, sales revenue increases because additional customer segments are now willing to buy e-books. On the other hand, customer satisfaction also increases because each customer can choose the version whose price matches his willingness to pay. Consequently, selling different versions of EPUB e-books at different prices is a true win-win situation for publishing houses and book readers.

Comparison with other file formats

The previous section has shown that using EPUB e-books in the e-publishing value chain has several important advantages over traditional printed books. It should be recalled, however, that the EPUB file format is not the only e-book file format that can be used in the e-publishing value chain. Other e-book file formats exist, each having its own specific advantages and being suitable for specific tasks. In order to complete the overview of the e-publishing value chain, it is thus useful to compare EPUB files directly with other popular file formats. This paper compares EPUB files with two file formats, namely PDF and AZW, but the following remarks can easily be extended to other file formats with similar characteristics.

EPUB and PDF

PDF, which stands for Portable Document Format, is a file format that was originally created by the Adobe Corporation in 1993 as a proprietary file format. In 2007 the Adobe Corporation decided to make the PDF specification publicly available, and in 2008 the PDF file format was published as the ISO 32000 specification. The PDF specification and further information on the PDF file format are available online at [Adobe 2011a; Adobe 2011b].

According to [Daly 2011], the main characteristics of the PDF file format are that it is page-oriented and that it is capable of providing very precise layout control. Moreover, the display of PDF files is software and hardware independent, in the sense that a PDF document will look exactly the same in any compatible PDF reader on any hardware device. This is in sharp contrast to the EPUB file format, which is reflowable, provides only limited layout possibilities and can be rendered differently on different reader devices. PDF files are therefore far superior for delivering high-quality publications that require very precise element spacing or include many image elements that have to be precisely positioned.

However, the downside of the PDF file format is that it is quite difficult to read and use on small to medium-size screens, such as those of smartphones or very small tablet computers. The reason is that the text elements in PDF files have a fixed size that is relative to the page size of the document and not to the size of the screen of the reader device. The user is left with the difficult choice between viewing the whole page with very small characters and viewing only a magnified part of the page that has to be moved around all the time. In this context, EPUB files are much more flexible and can be adapted to and comfortably viewed on any screen size.

Furthermore, PDF files usually contain little structural metadata about their content, which makes them extremely difficult to convert to other file formats. By contrast, the content of EPUB files is structured and described by XML and XHTML tags, which makes it relatively easy to convert automatically.

One final difference is that PDF files, thanks to their precise layout, can be used for high-quality professional printing. This means that one single PDF file, once created in the e-publishing value chain, can be used both for printing and for display on computer screens. This is an important simplification, as it eliminates the need to create and convert multiple file formats of the same publication. But yet again, this simplification comes at a price. The professional creation of PDF files is only possible with expensive, professional-grade software, whereas EPUB files can easily be created with open source tools or even self-made software solutions.

EPUB and AZW

AZW is the file format used by Amazon on its Kindle e-book reader. According to [Buchanan 2010], the AZW file format is based on the MOBI file format, originally developed by the Mobipocket company and actually similar to the Open eBook specification (see [Mobipocket 2008] for more technical details). The Mobipocket company was subsequently acquired by Amazon and the MOBI file format was modified and transformed into the AZW file format. The AZW file format is now a proprietary format of Amazon and there is no public specification available. This closed and proprietary character of AZW files is the main difference in comparison with EPUB files. As far as the technical dimension is concerned, both file types presumably have quite a lot in common, as they both stem from the Open eBook specification. In practice, the difference is that EPUB files can be freely created by any interested person or company, whereas AZW files can only be made by Amazon. Moreover, EPUB files can be read on any device with any software that respects the open specification, while AZW files can only be read with hardware or software expressly provided by Amazon (like the Kindle reader or the Kindle iPhone app).

The open or closed character of e-book files has some obvious implications for the creation and distribution of such e-books. From the author's point of view, open e-books can be opened and read by any potential reader with any reading software, whereas closed e-books are available only to those who have previously acquired a corresponding dedicated reading device. From the publishing house's point of view, open e-books can be prepared with any (possibly open-source) software and can be distributed to any intermediary, whereas closed e-books require proprietary software and impose a very limited choice of compatible distributors. From the reader's point of view, open e-books can be acquired from many independent sources and can freely be transferred between hardware devices or software readers. From the distributor's point of view, however, closed e-books mean exclusivity of the distributed titles and a very strong binding of clients who have already acquired a dedicated reading system.

It appears that open e-book files like EPUB are much easier to distribute and can reach a much wider population of clients. They are advantageous for authors, publishing houses and clients, and give them freedom of choice and independence from any particular commercial hardware or software system. Closed e-book files like AZW, on the contrary, are only advantageous to large online distributors and aggregators. Companies like Amazon can use closed file formats to ensure distribution exclusivity and to bind clients to their own hardware or software e-book readers.

CONCLUSION

Main findings

The EPUB file format is an open file format developed by an independent organization, the IDPF. This file format is based on three separate specifications (OPS, OPF and OCF) that, combined, describe the structure and the content of an EPUB file. In fact, an EPUB file is a ZIP file that contains the metadata and the data of an e-book. The metadata is stored in XML files constructed according to specified standards, whereas the actual text data of the e-book is stored in XHTML files. Additional data (like sound or images) can also be included. The EPUB structure is entirely based on open and well known web standards, thus allowing an easy and automated creation of EPUB e-books. Such e-books can be read on any operating system and any hardware device and can adapt to various screen sizes. However, EPUB e-books offer only limited layout possibilities and can be displayed differently on different hardware and software readers.

EPUB files are used in the e-publishing value chain, i.e. in a series of consecutive activities that lead to the creation of a printed or an electronic publication. The modern e-publishing value chain allows a parallel creation of both printed and electronic books that can be distributed through online or offline distribution channels. Due to their flexibility and ease of automation, EPUB files have several important advantages over classical printed books. EPUB e-books are digital goods that have no marginal production cost and require no inventory management. They can easily be produced in different versions with different prices, enabling the use of mass customization and revenue management in order to increase customer satisfaction and sales revenue.

But the EPUB file format is not the only e-book file format. Other e-book file formats exist, each one having its own advantages. For instance, the PDF file format is more appropriate for high quality publications with complex layout and graphics, whereas the AZW file format is used by Amazon to increase the loyalty of its clients.
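To make the first of these findings concrete, the ZIP-based structure of an EPUB file (metadata in XML, text in XHTML) can be sketched with Python's standard library. This is a simplified illustration and not a complete EPUB 2.0.1 publication: the file names under `OEBPS/`, the book title and the placeholder identifier are assumptions made for this example, and the NCX navigation file that a complete 2.0.1 publication also includes is omitted for brevity.

```python
import io
import zipfile

# OCF: container.xml tells the reading system where the OPF package file is.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# OPF: metadata (Dublin Core), manifest of files and reading order (spine).
# Title and identifier are placeholders invented for this sketch.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# OPS: the actual text of the book as XHTML.
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Hello, reader.</p></body>
</html>
"""


def build_epub() -> bytes:
    """Assemble the files above into a ZIP archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry comes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", CONTENT_OPF)
        archive.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML)
    return buffer.getvalue()


if __name__ == "__main__":
    with open("example.epub", "wb") as handle:
        handle.write(build_epub())
```

Everything in the sketch is plain text packed into a ZIP archive, which is why EPUB files can be produced with open source tools or self-made scripts.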

Critical assessment

The objective of this paper was to analyze one specific e-book file format and to show its place in the e-publishing value chain. But the e-book phenomenon is a very complex one and cannot be reduced to a single question of file formats. Other important dimensions, such as legal and economic aspects, sociological changes and environmental impact, should also be taken into account. All these complex topics would have to be treated in detail in order to fully understand the e-book market. In addition, this paper has only analyzed the current 2.0.1 version of the EPUB specification. The next version, 3.0, is already being prepared and will probably replace the 2.0.1 specification in the near future. This will obviously render some technical parts of this paper obsolete.

Outlook

In order to fully understand the e-book phenomenon, a wide range of further specific studies would have to be undertaken. These future studies might for example analyze the Swiss and the European e-book market, legal and technical obstacles to the wider spread of e-books, sociological changes that can lead to a wider acceptance of e-books or even the environmental impact of shifting from printed to electronic books. More specifically, this paper should be updated once the new 3.0 EPUB specification is adopted by the IDPF.


REFERENCES

[AAP 2011] Association of American Publishers: AAP Publishers Report Strong Growth in Year-to-Year, Year-End Book Sales, available: http://www.publishers.org/press/24/, accessed 16th May 2011
[Adobe 2011a] Adobe Systems Incorporated: Adobe PDF 101 – Quick overview of PDF file format, available: http://partners.adobe.com/public/developer/tips/topic_tip31.html, accessed 7th May 2011
[Adobe 2011b] Adobe Systems Incorporated: PDF Reference and Adobe Extensions to the PDF Specification, available: http://www.adobe.com/devnet/pdf/pdf_reference.html, accessed 7th May 2011
[Alpeyev / Miller 2011] Alpeyev, Pavel; Miller, Hugo: Android Tablets Gain on Apple IPad in Fourth Quarter, version 31st January 2011, available: http://www.bloomberg.com/news/2011-01-31/android-tablets-gain-on-ipad-in-fourth-quarter-researcher-says.html, accessed 26th March 2011
[Buchanan 2010] Buchanan, Matt: Giz Explains: How You’re Gonna Get Screwed By Ebook Formats, version 10th March 2010, available: http://gizmodo.com/#!5478842/giz-explains-howyoure-gonna-get-screwed-by-ebook-formats, accessed 7th May 2011
[Carnoy 2010a] Carnoy, David: What Amazon didn’t say about e-books, version 20th July 2010, available: http://reviews.cnet.com/8301-18438_7-20011038-82.html, accessed 26th March 2011
[Carnoy 2010b] Carnoy, David: Amazon: we have 70-80 percent of e-book market, version 2nd August 2010, available: http://reviews.cnet.com/8301-18438_7-20012381-82.html, accessed 26th March 2011
[Carnoy 2011] Carnoy, David: B&N: Nook has 25 percent of U.S. e-book market, version 23rd February 2011, available: http://news.cnet.com/8301-17938_105-20035277-1.html, accessed 26th March 2011
[DAISY 2011] DAISY Consortium, available: http://www.daisy.org, accessed 9th April 2011
[Daly 2011] Daly, Liza: Build a digital book with EPUB, version 11th January 2011, available: http://www.ibm.com/developerworks/xml/tutorials/x-epubtut/index.html, accessed 16th March 2011
[Dublin Core 2011] Dublin Core® Metadata Initiative, available: http://dublincore.org, accessed 6th April 2011
[Gutenberg 2011] Project Gutenberg, available: http://www.gutenberg.org, accessed 16th March 2011
[Hamblen 2010a] Hamblen, Matt: Google launches eBooks, eBookstore, version 6th December 2010, available: http://www.computerworld.com/s/article/9199599/Google_launches_eBooks_eBookstore, accessed 26th March 2011
[Hamblen 2010b] Hamblen, Matt: Hot e-reader sales will continue into 2011, Gartner says, version 8th December 2010, available: http://www.computerworld.com/s/article/9200525/Hot_e_reader_sales_will_continue_into_2011_Gartner_says, accessed 26th March 2011

[Hansen / Neumann 2005] Hansen, Hans Robert; Neumann, Gustaf: Wirtschaftsinformatik, 9. Auflage, Lucius & Lucius, Stuttgart, 2005
[Hendrickson 2011] Hendrickson, Mike: 2010 State of the Computer Book Market, Post 5 – Wrap-Up and Digital, version 23rd February 2011, available: http://radar.oreilly.com/print/2011/02/2010book-market-5.html, accessed 26th March 2011
[Jacobs et al. 2009] Jacobs, F. Robert; Chase, Richard B.; Aquilano, Nicholas J.: Operations & supply management, 12th edition, McGraw-Hill, Boston et al. 2009
[LDOCE 2011] Longman: Longman Dictionary of Contemporary English, available: http://www.ldoceonline.com, accessed 22nd April 2011
[Lebert 2009] Lebert, Marie: A Short History of eBooks, version 2009, available: http://www.etudes-francaises.net/dossiers/ebook.htm, accessed 26th March 2011
[Lebert 2008] Lebert, Marie: Technology and Books for All, version 2008, available: http://www.etudes-francaises.net/dossiers/technologies.htm, accessed 26th March 2011
[Lebert 2007] Lebert, Marie: Les mutations du livre à l’heure numérique, version September 2007, available: http://www.etudes-francaises.net/dossiers/mutations.htm, accessed 26th March 2011
[Meier / Stromer 2008] Meier, Andreas; Stromer, Henrik: eBusiness & eCommerce, 2nd edition, Springer, Berlin Heidelberg 2008
[Mobipocket 2008] Mobipocket: Welcome to Mobipocket Developer Center, version 24th April 2008, available: http://www.mobipocket.com/dev/article.asp?BaseFolder=prcgen&File=mobiformat.htm, accessed 7th May 2011
[NISO 2011] NISO: The DAISY Standard, available: http://www.niso.org/workrooms/daisy/, accessed 9th April 2011
[OASIS 2011] OASIS: Open Document Format for Office Applications, available: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office, accessed 3rd April 2011
[Phillips 2005] Phillips, Robert L.: Pricing and revenue optimization, Stanford University Press, Stanford 2005
[PKWARE 2011] PKWARE: Our Founder – Phil Katz, available: http://www.pkware.com/aboutus/phil-katz, accessed 11th May 2011
[Porter 1999] Porter, Michael E.: L'avantage concurrentiel, Dunod, Paris 1999
[Sheldon 2001] Sheldon, Tom: McGraw-Hill Encyclopedia of networking & telecommunications, McGraw-Hill, New York et al. 2001
[Spring 1991] Spring, Michael B.: Electronic Printing and Publishing, Marcel Dekker, Inc., New York et al. 1991
[W3C 2011a] World Wide Web Consortium: XHTML™ 1.1 – Module-based XHTML – Second Edition, version 23 November 2010, available: http://www.w3.org/TR/xhtml11/, accessed 11th April 2011

[W3C 2011b] World Wide Web Consortium: Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification, version 7 December 2010, available: http://www.w3.org/TR/CSS2/, accessed 11th April 2011
[W3C 2011c] World Wide Web Consortium: Scalable Vector Graphics (SVG), available: http://www.w3.org/Graphics/SVG/, accessed 11th April 2011
[W3C 2011d] World Wide Web Consortium: Extensible Markup Language (XML), available: http://www.w3.org/XML/, accessed 11th May 2011
[W3C 2011e] World Wide Web Consortium: Mathematical Markup Language (MathML) Version 3.0, version 21 October 2010, available: http://www.w3.org/TR/MathML/, accessed 16th April 2011
[Winograd 2010] Winograd, David: The iBookstore six months after launch: One big failure, version 14th October 2010, available: http://www.tuaw.com/2010/10/14/the-ibookstore-six-months-afterlaunch-one-big-failure/, accessed 26th March 2011