Shallow Thoughts : tags : epub

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Tue, 15 Sep 2015

Hacking / Customizing a Kobo Touch ebook reader: Part II, Python

I wrote last week about tweaking a Kobo e-reader's sqlite database by hand.

But who wants to remember all the table names and type out those queries? I sure don't. So I wrote a Python wrapper that makes it much easier to interact with the Kobo databases.

Happily, Python already has a module called sqlite3. So all I had to do was come up with an API that included the calls I typically wanted -- list all the books, list all the shelves, figure out which books are on which shelves, and so forth.

The result was kobo_utils.py, which includes a main function that can list books, shelves, or shelf contents.

You can initialize kobo_utils like this:

import kobo_utils

koboDB = kobo_utils.KoboDB("/path/where/your/kobo/is/mounted")
koboDB.connect("/path/to/KoboReader.sqlite")
connect() throws an exception if it can't find the .sqlite file.

Then you can list books thusly:

koboDB.list_books()
or list shelf names:
koboDB.list_shelves()
or use print_shelf to see which books are on which shelves:
shelves = koboDB.get_dlist("Shelf", selectors=[ "Name" ])
for shelf in shelves:
    print(shelf["Name"])
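
For the curious, a wrapper along these lines doesn't take much code. What follows is only a minimal sketch, not the actual kobo_utils.py: the class TinyKoboDB and its method names are made up for illustration, and it assumes nothing about the database beyond a Shelf table with a Name column.

```python
import sqlite3


class TinyKoboDB:
    """Hypothetical minimal wrapper; NOT the real kobo_utils API."""

    def __init__(self, dbpath):
        # dbpath would normally be a local copy of KoboReader.sqlite.
        self.conn = sqlite3.connect(dbpath)
        self.conn.row_factory = sqlite3.Row  # rows can be indexed by column name

    def rows_as_dicts(self, table, columns):
        """Return the chosen columns of a table as a list of dicts."""
        # Table and column names can't be bound as SQL parameters,
        # so only pass trusted names here.
        query = "SELECT %s FROM %s" % (", ".join(columns), table)
        return [dict(row) for row in self.conn.execute(query)]

    def shelf_names(self):
        """Return the names of all shelves."""
        return [d["Name"] for d in self.rows_as_dicts("Shelf", ["Name"])]
```

The point of the wrapper is just that the table and column names live in one place, so calling code never has to spell out a query.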

What I really wanted, though, was a way to organize my library, taking the tags in each of my epub books and assigning them to an appropriate shelf on the Kobo, creating new shelves as needed. Using kobo_utils.py plus the Python epub library I'd already written, that ended up being quite straightforward: shelves_by_tag.
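
The core idea can be sketched in a few lines. This is an illustration under simplifying assumptions, not the shelves_by_tag script itself: the function name shelve_by_tag is hypothetical, the tags are passed in as a plain dict rather than read from epub files, and the tables are reduced to Shelf(Name) and ShelfContent(ShelfName, ContentId). A real Kobo database has more columns than that, so try anything like this on a copy first.

```python
import sqlite3


def shelve_by_tag(conn, book_tags):
    """Put each book on one shelf per tag, creating shelves as needed.

    book_tags maps a ContentId to a list of subject tags.
    Assumes simplified tables Shelf(Name) and
    ShelfContent(ShelfName, ContentId).
    """
    existing = {row[0] for row in conn.execute("SELECT Name FROM Shelf")}
    for content_id, tags in book_tags.items():
        for tag in tags:
            if tag not in existing:  # create the shelf the first time we see it
                conn.execute("INSERT INTO Shelf (Name) VALUES (?)", (tag,))
                existing.add(tag)
            conn.execute(
                "INSERT INTO ShelfContent (ShelfName, ContentId) VALUES (?, ?)",
                (tag, content_id))
    conn.commit()


# Try it on an in-memory database with the simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shelf (Name TEXT)")
conn.execute("CREATE TABLE ShelfContent (ShelfName TEXT, ContentId TEXT)")
shelve_by_tag(conn, {"file:///mnt/sd/TheWitchesOfKarres.epub": ["Science Fiction"]})
```

Note that this sketch doesn't check whether a book is already on a shelf, so running it twice would add duplicate ShelfContent rows.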

[ 20:38 Sep 15, 2015    More tech | permalink to this entry | ]

Thu, 03 Sep 2015

Hacking / Customizing a Kobo Touch ebook reader: Part I, sqlite

I've been enjoying reading my new Kobo Touch quite a lot. The screen is crisp, clear and quite a bit whiter than my old Nook; the form factor is great, it's reasonably responsive (though there are a few places on the screen where I have to tap harder than other places to get it to turn the page), and I'm happy with the choice of fonts.

But as I mentioned in my previous Kobo article, there were a few tweaks I wanted to make; and I was very happy with how easy it was to tweak, compared to the Nook. Here's how.

Mount the Kobo

When you plug the Kobo in to USB, it automatically shows up as a USB-Storage device once you tap "Connect" on the Kobo -- or as two storage devices, if you have an SD card inserted.

Like the Nook, the Kobo's storage devices show up without partitions. For instance, on Linux, they might be /dev/sdb and /dev/sdc, rather than /dev/sdb1 and /dev/sdc1. That means they also don't present UUIDs until after they're already mounted, so it's hard to make an entry for them in /etc/fstab if you're the sort of dinosaur (like I am) who prefers that to automounters.

Instead, you can use the entry in /dev/disk/by-id. So fstab entries, if you're inclined to make them, might look like:

/dev/disk/by-id/usb-Kobo_eReader-3.16.0_N905K138254971:0 /kobo   vfat user,noauto,exec,fmask=133,shortname=lower 0 0
/dev/disk/by-id/usb-Kobo_eReader-3.16.0_N905K138254971:1 /kobosd vfat user,noauto,exec,fmask=133,shortname=lower 0 0
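
The serial-number part of those names will differ on your device. As a small convenience, here is a sketch that lists whatever by-id entries mention Kobo; the function name kobo_device_links is made up, and it assumes only that the entry names contain the string "Kobo", as the ones above do.

```python
import os


def kobo_device_links(by_id_dir="/dev/disk/by-id"):
    """Return by-id paths whose names mention Kobo (empty list if none)."""
    try:
        names = sorted(os.listdir(by_id_dir))
    except OSError:
        # Directory missing or unreadable, e.g. on a non-Linux system.
        return []
    return [os.path.join(by_id_dir, name) for name in names if "Kobo" in name]


if __name__ == "__main__":
    for link in kobo_device_links():
        print(link)
```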

One other complication, for me, was that the Kobo is one of a few devices that don't work through my USB2 powered hub. Initially I thought the Kobo wasn't working, until I tried a cable plugged directly into my computer. I have no idea what controls which devices work through the hub and which ones don't. (The Kobo also doesn't give any indication when it's plugged in to a wall charger, nor does

The sqlite database

Once the Kobo is mounted, ls -a will show a directory named .kobo. That's where all the good stuff is: in particular, KoboReader.sqlite, the device's database, and Kobo/Kobo eReader.conf, a human-readable configuration file.

Browse through Kobo/Kobo eReader.conf for your own amusement, but the remainder of this article will be about KoboReader.sqlite.

I hadn't used sqlite before, and I'm certainly no SQL expert. But a little web searching and experimentation taught me what I needed to know.

First, make a local copy of KoboReader.sqlite, so you don't risk overwriting something important during your experimentation. The Kobo is apparently good at regenerating data it needs, but you might lose information on books you're reading.
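
If you'd rather script the copy than do it by hand, something like the following would work. It's a sketch: backup_kobo_db is a hypothetical helper name, and the mount point is whatever you used when mounting the Kobo.

```python
import os
import shutil


def backup_kobo_db(mount_point, dest_dir):
    """Copy .kobo/KoboReader.sqlite from the mounted Kobo into dest_dir."""
    src = os.path.join(mount_point, ".kobo", "KoboReader.sqlite")
    dest = os.path.join(dest_dir, "KoboReader.sqlite")
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    return dest
```

For example, backup_kobo_db("/kobo", ".") would leave a copy in the current directory, assuming the Kobo is mounted on /kobo.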

To explore the database manually, run: sqlite3 KoboReader.sqlite

Some useful queries

Here are some useful sqlite commands, which you can generalize to whatever you want to search for on your own Kobo. Every SQL query must end with a semicolon; dot-commands like .tables don't need one.

Show all tables in the database:

.tables
The most important ones, at least to me, are content (all your books), Shelf (a list of your shelves/collections), and ShelfContent (the table that assigns books to shelves).

Show all column names in a table:

PRAGMA table_info(content);
There are a lot of columns in content, so try PRAGMA table_info(Shelf); to see a much simpler table.
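
The same PRAGMA works from Python's sqlite3 module, which is handy if you only want the column names. A minimal sketch (column_names is a hypothetical helper):

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table using PRAGMA table_info."""
    # Each result row is (cid, name, type, notnull, dflt_value, pk).
    # The table name can't be a bound parameter, so pass trusted names only.
    rows = conn.execute("PRAGMA table_info(%s)" % table).fetchall()
    return [row[1] for row in rows]
```

With a copy of the database open as conn, column_names(conn, "Shelf") would give the list of Shelf's columns.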

Show the names of all your shelves/collections:

SELECT Name FROM Shelf;

Show everything in a table:

SELECT * FROM Shelf;

Show all books assigned to shelves, and which shelves they're on:

SELECT ShelfName,ContentId FROM ShelfContent;
ContentId can be a URL to a sideloaded book, like file:///mnt/sd/TheWitchesOfKarres.epub, or a UUID like de98dbf6-e798-4de2-91fc-4be2723d952f for books from the Kobo store.
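
If a script needs to tell the two kinds apart, a rough test is enough. This sketch assumes only the two forms described above (a file:// URL or a UUID); content_kind is a made-up name, and anything else is reported as unknown rather than guessed at.

```python
import uuid


def content_kind(content_id):
    """Guess whether a ContentId is a sideloaded file or a store book."""
    if content_id.startswith("file://"):
        return "sideloaded"
    try:
        uuid.UUID(content_id)  # raises ValueError if it isn't a UUID
        return "store"
    except ValueError:
        return "unknown"


print(content_kind("file:///mnt/sd/TheWitchesOfKarres.epub"))
print(content_kind("de98dbf6-e798-4de2-91fc-4be2723d952f"))
```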

Show all books you have installed:

SELECT Title,Attribution,ContentID FROM content WHERE BookTitle is null ORDER BY Title;
One peculiarity of Kobo's database: each book has lots of entries, apparently one for each chapter. The entries for chapters have the chapter name as Title, and the book title as BookTitle. The entry for the book as a whole has BookTitle empty, and the book title as Title. For example, here are the entries for a sideloaded copy of Hamlet:
sqlite> SELECT Title,BookTitle from content WHERE ContentID LIKE '%hamlet%';
HAMLET, PRINCE OF DENMARK|Hamlet
PERSONS REPRESENTED.|Hamlet
ACT I.|Hamlet
Scene II. Elsinore. A room of state in the Castle.|Hamlet
Scene III. A room in Polonius's house.|Hamlet
Scene IV. The platform.|Hamlet
Scene V. A more remote part of the Castle.|Hamlet
Act II.|Hamlet
  [ ... and so on ... ]
ACT V.|Hamlet
Scene II. A hall in the Castle.|Hamlet
Hamlet|
Each of these entries has Title set to the name of the chapter (an act in the play) and BookTitle set to Hamlet, except for the final entry, which has Title set to Hamlet and BookTitle set to nothing. That's why you need that query WHERE BookTitle is null if you just want a list of your books.
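
Here is the same idea as a small self-contained Python sketch. It builds a toy content table in memory with just the Title and BookTitle columns (the real table has many more) and a few of the Hamlet rows shown above, then picks out the whole-book entries.

```python
import sqlite3

# Toy stand-in for the Kobo's content table: chapter rows carry the
# book's name in BookTitle, the whole-book row has BookTitle NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (Title TEXT, BookTitle TEXT)")
conn.executemany(
    "INSERT INTO content VALUES (?, ?)",
    [
        ("HAMLET, PRINCE OF DENMARK", "Hamlet"),
        ("ACT I.", "Hamlet"),
        ("ACT V.", "Hamlet"),
        ("Hamlet", None),  # the entry for the book as a whole
    ])


def book_titles(conn):
    """Return titles of whole-book entries, skipping chapter entries."""
    cur = conn.execute(
        "SELECT Title FROM content WHERE BookTitle IS NULL ORDER BY Title")
    return [row[0] for row in cur]


print(book_titles(conn))  # prints ['Hamlet']
```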

Show all books by an author:

SELECT Title,Attribution,ContentID FROM content WHERE BookTitle is null
AND Attribution LIKE '%twain%' ORDER BY Title;
Attribution is where the author's name goes. LIKE searches are case insensitive by default, at least for ASCII characters.

Update: how to change a field

I realized I didn't include how to change a field, and that seems to fit better with this article rather than writing a whole new blog post just for that.

The Kobo doesn't show series order. So to find a specific book, and then update its title to include the series and series number, do something like this:

SELECT Title,Attribution,ContentID,BookTitle FROM content WHERE Attribution LIKE '%Doyle%' AND Title LIKE '%Adventures%';
UPDATE content SET Title='Sherlock Stories 1: The Adventures of Sherlock Holmes' WHERE Attribution LIKE '%Doyle%' AND Title LIKE '%Adventures%';
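
The same kind of update can be scripted. The sketch below takes a slightly different approach from the query above: instead of matching with LIKE, it targets one book by its exact ContentID and requires BookTitle to be null, so chapter entries can't be renamed by accident. The function name add_series_prefix is hypothetical, and it assumes only the Title, BookTitle and ContentID columns of content.

```python
import sqlite3


def add_series_prefix(conn, content_id, series, number):
    """Prefix one book's Title with 'Series N: '; return rows changed."""
    prefix = "%s %d: " % (series, number)
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE content SET Title = ? || Title "
            "WHERE ContentID = ? AND BookTitle IS NULL",
            (prefix, content_id))
    return cur.rowcount
```

A return value of 0 means nothing matched, and 1 means the book's entry was renamed.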
    

To delete an entry -- in this case I had two copies of the same book and needed to specify the ContentID of the older one:

DELETE from content WHERE Attribution LIKE '%Doyle%' AND ContentID='file:///mnt/sd/memoirs-holmes.epub';
    

Of course, it's a lot handier to have a program that knows these queries so you don't have to type them in every time (especially since the sqlite3 app has no history or proper command-line editing). But this has gotten long enough, so I'll write about that separately.

[ 19:11 Sep 03, 2015    More tech | permalink to this entry | ]

Wed, 26 Aug 2015

Switching to a Kobo e-reader

For several years I've kept a rooted Nook Touch for reading ebooks. But recently it's become tough to use. Newer epub books no longer work on any version of FBReader still available for the Nook's ancient Android 2.1, and the Nook's built-in reader has some fatal flaws: most notably that there's no way to browse books by subject tag, and it's painfully slow to navigate a library of 250 books when you have to start from the As and need to get to T, paging slowly forward 6 books at a time.

The Kobo Touch

But with my Nook unusable, I borrowed Dave's Kobo Touch to see how it compared. I like the hardware: same screen size as the Nook, but a little brighter and sharper, with a smaller bezel around it, and a spring-loaded power button in a place where it won't get pressed accidentally when it's packed in a suitcase -- the Nook was always coming on while in its case, and I didn't find out until I pulled it out to read before bed and discovered the battery was too low.

The Kobo worked quite nicely as a reader, though it had a few of the same problems as the Nook. They both insist on justifying both left and right margins (Kobo has a preference for that, but it doesn't work in any book I tried). More important is the lack of subject tags. The Kobo has a "Shelves" option, called "Collections" in some versions, but adding books to shelves manually is tedious if you have a lot of books. (But see below.)

It also shared another Nook problem: it shows overall progress in the book, but not how far you are from the next chapter break. There's a choice to show either book progress or chapter progress, but not both; and chapter progress only works for books in Kobo's special "kepub" format (I'll write separately about that). I miss FBReader's progress bar that shows both book and chapter progress, and I can't fathom why that's not considered a necessary feature for any e-reader.

But mostly, Kobo's reader was better than the Nook's. Bookmarks weren't perfect, but they basically worked, and I didn't even have to spend half an hour reading the manual to use them (like I did with the Nook). The font selection was great, and the library navigation had one great advantage over the Nook: a slider so you could go from A to T quickly.

I liked the Kobo a lot, and promptly ordered one of my own.

It's not all perfect

There were a few disadvantages. Although the Kobo had a lot more granularity in its line spacing and margin settings, the smallest settings were still a lot less tight than I wanted. The Nook only offered a few settings but the smallest setting was pretty good.

Also, the Kobo can only see books at the top level of its microSD card. No subdirectories, which means that I can't use a program like rsync to keep the Kobo in sync with my ebooks directory on my computer. Not that big a deal, just a minor annoyance.

More important was the subject tagging, which is really needed in a big library. It was pretty clear Shelves/Collections were what I needed; but how could I get all my books into shelves without laboriously adding them all one by one on a slow e-ink screen?

It turns out Kobo's architecture makes it pretty easy to fix these problems.

Customizing Kobo

While the rooted Nook community has been stagnant for years -- it was a cute proof of concept that, in the end, no one cared about enough to try to maintain it -- Kobo readers are a lot easier to hack, and there's a thriving Kobo community on MobileRead which has been trading tips and patches over the years -- apparently with Kobo's blessing.

The biggest key to Kobo's customizability is that you can mount it as a USB storage device, and one of the files that exposes is the device's database (an sqlite file). That means that well supported programs like Calibre can update shelves/collections on a Kobo, access its book list, and do other nifty tricks; and if you want more, you can write your own scripts, or even access the database by hand.

I'll write separately about some Python scripts I've written to display the database and add books to shelves, and I'll just say here that the process was remarkably straightforward and much easier than I usually expect when learning to access a new device.

There's lots of other customizing you can do. There are ways of installing alternative readers on the Kobo, or installing Python so you can write your own reader. I expected to want that, but so far the built-in reader seems good enough.

You can also patch the OS. Kobo updates are distributed as tarballs of binaries, and there's a very well designed, documented and supported (by users, not by Kobo) patching script distributed on MobileRead for each new Kobo release. I applied a few patches and was impressed by how easy it was. And now I have tight line spacing and margins, a slightly changed page number display at the bottom of the screen (still only chapter or book, not both), and a search that defaults to my local book collection rather than the Kobo store.

Stores and DRM

Oh, about the Kobo store. I haven't tried it yet, so I can't report on that. From what I read, it's pretty good as e-bookstores go, and a lot of Nook and Sony users apparently prefer to buy from Kobo. But like most e-bookstores, the Kobo store uses DRM, which makes it a pain (and is why I probably won't be using it much).

They use Adobe's DRM, and at least Adobe's Digital Editions app works in Wine under Linux. Amazon's app no longer does, and in case you're wondering why I didn't consider a Kindle, that's part of it. Amazon has a bad reputation for removing rights to previously purchased ebooks (as well as for spying on their customers' reading habits), and I've experienced it personally more than once.

Not only can I no longer use the Kindle app under Wine, but Amazon no longer lets me re-download the few Kindle books I've purchased in the past. I remember when my mother used to use the Kindle app on Android regularly; every few weeks all her books would disappear and she'd have to get on the phone again to Amazon to beg to have them back. It just isn't worth the hassle. Besides, Kindles can't read public library books (those are mostly EPUBs with Adobe DRM); and a Kindle would require converting my whole EPUB library to MOBI. I don't see any up side, and a lot of down side.

The Adobe scheme used by Kobo and Nook is better, but I still plan to avoid books with DRM as much as possible. It's not the stores' fault, and I hope Kobo does well, because they look like a good company. It's the publishers who insist on DRM. We can only hope that some day they come to their senses, like music publishers finally did with MP3 versus DRMed music. A few publishers have dropped DRM already, and if we readers avoid buying DRMed ebooks, maybe the message will eventually get through.

[ 17:04 Aug 26, 2015    More tech | permalink to this entry | ]

Thu, 20 Aug 2015

Python module for reading EPUB e-book metadata

Three years ago I wanted a way to manage tags on e-books in a lightweight way, without having to maintain a Calibre database and fire up the Calibre GUI app every time I wanted to check a book's tags. I couldn't find anything, nor did I find any relevant Python libraries, so I reverse engineered the (simple, XML-based) EPUB format and wrote a Python script to show or modify epub tags.

I've been using that script ever since. It's great for Project Gutenberg books, which tend to be overloaded with tags that I don't find very useful for categorizing books ("United States -- Social life and customs -- 20th century -- Fiction") but lacking in tags that I would find useful ("History", "Science Fiction", "Mystery").

But it wasn't easy to include it in other programs. For the last week or so I've been fiddling with a Kobo ebook reader, and I wanted to write programs that could read epub and also speak Kobo-ese. (I'll write separately about the joys of Kobo hacking. It's really a neat little e-reader.)

So I've factored my epubtag script into a usable Python module, so as well as being a standalone program for viewing epub book data, it's easy to use from other programs. It's available on GitHub: epubtag.py: parse EPUB metadata and view or change subject tags.
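
To give a feel for how simple the format is, here is a minimal sketch of reading subject tags using only the standard library. It is not epubtag.py, and the function name epub_subjects is made up; it relies on two facts about EPUB files: they are zip archives whose META-INF/container.xml names the OPF package file, and the OPF metadata stores tags as Dublin Core dc:subject elements.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
DC_NS = "http://purl.org/dc/elements/1.1/"


def epub_subjects(path):
    """Return the dc:subject tags from an EPUB's OPF metadata."""
    with zipfile.ZipFile(path) as zf:
        # META-INF/container.xml points at the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(".//{%s}rootfile" % CONTAINER_NS)
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))
    # Collect the text of every dc:subject element, skipping empty ones.
    return [el.text for el in opf.iter("{%s}subject" % DC_NS) if el.text]
```

This only reads tags and does no error handling for malformed books; changing tags means rewriting the OPF file inside the zip archive, which takes a bit more care.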

[ 20:27 Aug 20, 2015    More programming | permalink to this entry | ]