20 December: Creating an offline e-book reader using HTML 5
I recently did a proof-of-concept project for one of our clients, who wanted to explore using an HTML 5-based web application for reading e-books. Our client currently provides e-books to their users, but to read one, users have to register and then download and install a third-party application. With an HTML 5 web application, users only need a browser that supports the HTML 5 offline features. To provide an alternative to the current setup, the application had to address the following requirements:
- Offline availability of e-books
- Copy protection of contents
- Automatic expiration of e-books after a given amount of time
An e-book reader platform using HTML 5 and Monocle that supports offline reading of e-books already exists at Bookish. Like Bookish, we decided to use the Monocle Reader to provide the basic e-book reader capabilities.
Our application, however, also needs to support copy protection and time-based expiration of e-books.
Using HTML 5 features such as the Application Cache and localStorage, it is possible to make an e-book available to users when they are offline. In our application, the contents of the e-book are stored in the browser’s local storage when the user accesses the e-book for the first time. At this point we also store the web page containing the reader, as well as other external resources such as images and style sheets, in the browser’s application cache. Once this is done, the user can browse the stored content even when offline.
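As a rough illustration of the storage step, here is a minimal sketch of how e-book components could be written to and read back from local storage. The function names and the key layout (`bookId/component/name`) are assumptions made for this example, not the names used in the project. The storage object is passed in as a parameter, so in a browser you would pass `window.localStorage`.

```javascript
// Hypothetical key layout: "<bookId>/component/<name>" plus "<bookId>/index".
// `storage` is any object with setItem/getItem, e.g. window.localStorage.
function storeBook(storage, bookId, components) {
  const names = Object.keys(components);
  for (const name of names) {
    storage.setItem(`${bookId}/component/${name}`, components[name]);
  }
  // Keep an index so the reader can list the components while offline.
  storage.setItem(`${bookId}/index`, JSON.stringify(names));
}

// Returns the stored component, or null if it has not been stored.
function loadComponent(storage, bookId, name) {
  return storage.getItem(`${bookId}/component/${name}`);
}

// Returns the list of stored component names (empty if the book is absent).
function listComponents(storage, bookId) {
  const index = storage.getItem(`${bookId}/index`);
  return index === null ? [] : JSON.parse(index);
}
```

On the first visit the reader would call `storeBook(window.localStorage, "book-1", componentsFromServer)`, and on later (possibly offline) visits it would only call `listComponents` and `loadComponent`.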
When making an e-book available offline in this way, protection of contents becomes an issue. To prevent the direct copying of content, we added encryption to the e-book components.
Encryption of EPUB content
The e-book content of an EPUB file consists of one or more HTML files. When the user requests an e-book, we encrypt the contents of each component on the server before sending them back and storing them on the user’s device. In this project we used a simple encryption scheme with Base64 encoding on top, but the encryption can be as simple or as advanced as you want. Once encrypted, the content is not readable unless it is read through the e-book reader application. The following screenshot shows the encrypted contents as stored, viewed in Chrome’s developer tools:
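To make the idea of "a simple encryption with Base64 on top" concrete, here is a minimal sketch. The article does not say which cipher was used, so a repeating-key XOR stands in for it here purely as an assumption. It is not secure and is only meant to show the shape of the round trip. The function names are hypothetical.

```javascript
// Stand-in cipher (assumption): repeating-key XOR over the UTF-8 bytes.
function xorBytes(bytes, keyBytes) {
  if (keyBytes.length === 0) throw new Error("key must not be empty");
  return bytes.map((b, i) => b ^ keyBytes[i % keyBytes.length]);
}

// Encrypts one HTML component and wraps the result in Base64 so that it
// can be transferred and stored as an ordinary string.
function encryptComponent(html, key) {
  const encoder = new TextEncoder();
  const cipher = xorBytes(encoder.encode(html), encoder.encode(key));
  let binary = "";
  for (const b of cipher) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Reverses encryptComponent: Base64-decode, XOR again, decode as UTF-8.
function decryptComponent(encoded, key) {
  const binary = atob(encoded);
  const cipher = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  const plain = xorBytes(cipher, new TextEncoder().encode(key));
  return new TextDecoder().decode(plain);
}
```

In a real deployment the XOR step would be replaced by whatever scheme the server and the reader agree on. The Base64 layer stays the same because it only serves to make the bytes storable as text.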
The reader application is responsible for decrypting the contents. Decryption is done at page level, which means that the content stays encrypted until the user turns a page. When the page is turned, the content for that page is decrypted and shown in the reader. Our tests show that this has near-zero impact on user experience.
Prevention of Copy / Paste
To prevent direct copy / paste of content, a layer was applied on top of the e-book reader application. This prevents the user from highlighting and manually copying text from the page. However, it is still possible for a user to copy the contents of the current page by opening the browser’s developer tools and copying the decrypted text from there. This way of copying text could be compared to using a photocopier to copy a page from a paper book.
Time expiration management
Another requirement that had to be addressed was expiring the e-book after a given amount of time. This is accomplished by storing a cookie on the user’s device that is set to expire on the expiration date, as well as by continuously validating the expiration date in code.
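The cookie part of this can be sketched as follows. The cookie name and value are made up for the example, and the helper only builds the cookie string. In the browser it would be assigned to `document.cookie`.

```javascript
// Builds a cookie string that the browser drops on the expiration date.
// The name and value passed in are hypothetical, not taken from the project.
function buildExpiryCookie(name, value, expiresAt) {
  return (
    `${encodeURIComponent(name)}=${encodeURIComponent(value)}` +
    `; expires=${expiresAt.toUTCString()}; path=/`
  );
}

// Browser usage (not executed here):
//   document.cookie = buildExpiryCookie("book-1-license", "active", endDate);
```

Since a cookie can be edited or recreated by the user, it is only a first line of defence. The validation in code is what actually enforces the time limit.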
To prevent manipulation of the time period, the following measures were taken. The key used to decrypt the contents contains the start and end timestamps, which means that if the user modifies the timestamps, the contents can no longer be decrypted. In addition, whenever a page is turned, the current time is validated against the start time, the end time and the time of the last page turn. If this validation fails, the contents are deleted from the user’s device. This way, if a user tries to manipulate the device’s clock, the contents of the e-book will be deleted from the device.
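The two measures described above could look roughly like this. How the timestamps are combined into the key is not described in this article, so the simple concatenation in `deriveKey` is an assumption, and `license`, `validatePageTurn` and the `wipe` callback are hypothetical names.

```javascript
// Assumption: the decryption key is a secret combined with both timestamps,
// so changing a stored timestamp yields a different (useless) key.
function deriveKey(secret, startMs, endMs) {
  return `${secret}|${startMs}|${endMs}`;
}

// Called on every page turn. Returns true if reading may continue.
// On failure the caller-supplied wipe() deletes the stored e-book.
function validatePageTurn(license, nowMs, wipe) {
  const insideWindow = nowMs >= license.startMs && nowMs <= license.endMs;
  // A clock that is earlier than the last page turn has been set back.
  const clockSetBack = nowMs < license.lastTurnMs;
  if (!insideWindow || clockSetBack) {
    wipe();
    return false;
  }
  license.lastTurnMs = nowMs; // remember this turn for the next check
  return true;
}
```

In the reader, the page-turn handler would call `validatePageTurn(license, Date.now(), deleteStoredBook)` before decrypting the next page, where `deleteStoredBook` stands for whatever routine removes the book from local storage.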
Being a web application that supports offline availability introduces several issues that currently cannot be solved. First of all, all files are stored on the user’s device: content, script and decryption key. This means that a person who really wants to decrypt a single e-book can analyze the files saved on the device and create a script or application to decrypt the e-book contents. To make this difficult for such a person, several precautions can be taken. The greatest advantage of an HTML 5-based web application, as I see it, is its flexible nature. For instance, if we want to, we can maintain an array of different encryption methods, places to store the decryption key, or script libraries, and choose a new one at random each time a user retrieves an e-book. This way it becomes nearly impossible to create a generic decryption application that applies to all retrieved e-books.
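The random selection among encryption methods could be sketched like this. The two schemes below are trivial placeholders invented for the example (they offer no protection at all), and `pickScheme` is a hypothetical helper. The point is only the selection mechanism.

```javascript
// Placeholder schemes (assumptions): each pairs an id with an
// encrypt/decrypt function that are exact inverses of each other.
const schemes = [
  {
    id: "reverse",
    encrypt: (s) => s.split("").reverse().join(""),
    decrypt: (s) => s.split("").reverse().join(""),
  },
  {
    id: "shift1",
    encrypt: (s) =>
      s.split("").map((c) => String.fromCharCode((c.charCodeAt(0) + 1) & 0xffff)).join(""),
    decrypt: (s) =>
      s.split("").map((c) => String.fromCharCode((c.charCodeAt(0) + 0xffff) & 0xffff)).join(""),
  },
];

// Picks one scheme at random each time an e-book is retrieved.
// `random` is injectable so the choice can be made deterministic in tests.
function pickScheme(list, random = Math.random) {
  return list[Math.floor(random() * list.length)];
}
```

The server would record the chosen `id` alongside the e-book so that the reader knows which `decrypt` to apply. The same approach extends to choosing where the key is stored or which script library is delivered.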
Of course, other current issues include browser support. Internet Explorer is the main concern when it comes to browser support for offline web applications. A solution for Internet Explorer users could be to use the Chrome Frame plugin. This approach is implemented by Bookish.
Having used HTML 5 and Monocle to complete the POC project, we are very positive about using a web application as an alternative to installed applications when it comes to creating an e-book reader.