Shallow Thoughts : tags : epub
Akkana's Musings on Open Source Computing and Technology, Science, and Nature.
Tue, 15 Sep 2015
I wrote last week about
tweaking a
Kobo e-reader's sqlite database by hand.
But who wants to remember all the table names and type out those queries?
I sure don't. So I wrote a Python wrapper that makes it much easier to
interact with the Kobo databases.
Happily, Python already has a module called sqlite3.
So all I had to do was come up with an API that included the calls
I typically wanted -- list all the books, list all the shelves,
figure out which books are on which shelves, and so forth.
The result was
kobo_utils.py,
which includes a main function that can list books, shelves, or shelf contents.
You can initialize kobo_utils like this:
import kobo_utils
koboDB = kobo_utils.KoboDB("/path/where/your/kobo/is/mounted")
koboDB.connect("/path/to/KoboReader.sqlite")
connect()
throws an exception if it can't find the
.sqlite file.
Then you can list books thusly:
koboDB.list_books()
or list shelf names:
koboDB.list_shelves()
or use print_shelf to show which books are on which shelves:
shelves = koboDB.get_dlist("Shelf", selectors=[ "Name" ])
for shelf in shelves:
    print(shelf["Name"])
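For a sense of what a wrapper like this involves, here is a minimal sketch built on sqlite3. It is not the real kobo_utils.py: the class name TinyKoboDB and its methods are hypothetical, and the demo runs against an in-memory stand-in with made-up shelves rather than a real KoboReader.sqlite.

```python
import sqlite3


class TinyKoboDB:
    """Hypothetical stand-in for a wrapper class; not the real kobo_utils API."""

    def __init__(self, dbpath):
        # dbpath: path to a copy of KoboReader.sqlite (or ":memory:" for tests)
        self.conn = sqlite3.connect(dbpath)
        self.conn.row_factory = sqlite3.Row   # lets rows be read by column name

    def get_dlist(self, table, selectors=None):
        """Return the rows of a table as a list of dicts."""
        # Table and column names can't be bound as SQL parameters,
        # so only trusted, hard-coded names should be passed in.
        cols = ", ".join(selectors) if selectors else "*"
        cur = self.conn.execute("SELECT %s FROM %s" % (cols, table))
        return [dict(row) for row in cur]

    def list_shelves(self):
        """Return all shelf names, sorted."""
        return sorted(d["Name"] for d in self.get_dlist("Shelf", ["Name"]))


# Demo against an in-memory stand-in with made-up shelves
db = TinyKoboDB(":memory:")
db.conn.execute("CREATE TABLE Shelf (Name TEXT)")
db.conn.executemany("INSERT INTO Shelf VALUES (?)", [("Mystery",), ("History",)])
for name in db.list_shelves():
    print(name)
```

Returning plain dicts keeps callers independent of sqlite3, which is probably the main thing a wrapper like this buys you.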
What I really wanted, though, was a way to organize my library,
taking the tags in each of my epub books and assigning them to
an appropriate shelf on the Kobo, creating new shelves as needed.
Using kobo_utils.py plus the
Python
epub library I'd already written, that ended up being quite
straightforward:
shelves_by_tag.
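The core of that idea can be sketched in a few lines. This is not the real shelves_by_tag script: the function name and the sample library are made up for illustration, and writing the result back to the Kobo's Shelf and ShelfContent tables is left out, since that depends on columns not covered here.

```python
from collections import defaultdict


def group_books_by_tag(book_tags):
    """Map each tag to the sorted list of books carrying it.

    book_tags: dict of ContentID -> list of subject tags.
    Each tag would become one shelf name.
    """
    shelves = defaultdict(list)
    for content_id, tags in book_tags.items():
        for tag in tags:
            shelves[tag].append(content_id)
    return {tag: sorted(ids) for tag, ids in shelves.items()}


# Made-up sample library
library = {
    "file:///mnt/sd/hamlet.epub": ["Plays", "Classics"],
    "file:///mnt/sd/memoirs-holmes.epub": ["Mystery", "Classics"],
}
for shelf, books in sorted(group_books_by_tag(library).items()):
    print(shelf, books)
```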
Tags: ebook, kobo, epub, database, sql, sqlite
[
20:38 Sep 15, 2015
More tech |
permalink to this entry |
]
Thu, 03 Sep 2015
I've been enjoying reading my new Kobo Touch quite a lot. The screen
is crisp, clear and quite a bit whiter than my old Nook;
the form factor is great, it's reasonably responsive (though there
are a few places on the screen where I have to tap harder than other
places to get it to turn the page), and I'm happy with the choice of fonts.
But as I mentioned in my
previous Kobo article,
there were a few tweaks I wanted to make; and I was very happy with how
easy it was to tweak, compared to the Nook. Here's how.
Mount the Kobo
When you plug the Kobo in to USB, it automatically shows up as a
USB-Storage device once you tap "Connect" on the Kobo -- or as two
storage devices, if you have an SD card inserted.
Like the Nook, the Kobo's storage devices show up without partitions.
For instance, on Linux, they might be /dev/sdb and /dev/sdc, rather
than /dev/sdb1 and /dev/sdc1. That means they also don't present UUIDs
until after they're already mounted, so it's hard to make an entry for
them in /etc/fstab if you're the sort of dinosaur (like I am) who prefers
that to automounters.
Instead, you can use the entry in /dev/disk/by-id.
So fstab entries, if you're inclined to make them, might look like:
/dev/disk/by-id/usb-Kobo_eReader-3.16.0_N905K138254971:0 /kobo vfat user,noauto,exec,fmask=133,shortname=lower 0 0
/dev/disk/by-id/usb-Kobo_eReader-3.16.0_N905K138254971:1 /kobosd vfat user,noauto,exec,fmask=133,shortname=lower 0 0
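To find the right names for your own device, something like the following sketch may help. It assumes the by-id names start with usb-Kobo, as in the entries above; the function name is made up, and on a machine with no Kobo attached it simply prints nothing.

```python
import glob
import os


def kobo_devices(by_id_dir="/dev/disk/by-id"):
    """Return sorted by-id paths whose names start with usb-Kobo."""
    return sorted(glob.glob(os.path.join(by_id_dir, "usb-Kobo*")))


for dev in kobo_devices():
    # realpath follows the symlink to the kernel name, e.g. /dev/sdb
    print(dev, "->", os.path.realpath(dev))
```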
One other complication, for me, was that the Kobo is one of a few
devices that don't work through my USB2 powered hub. Initially I
thought the Kobo wasn't working, until I tried a cable plugged
directly into my computer. I have no idea what controls which devices
work through the hub and which ones don't.
(The Kobo also doesn't give any indication when it's plugged in to a
wall charger.)
The sqlite database
Once the Kobo is mounted, ls -a
will show a directory
named .kobo. That's where all the good stuff is:
in particular, KoboReader.sqlite, the device's database,
and Kobo/Kobo eReader.conf, a human-readable configuration file.
Browse through Kobo/Kobo eReader.conf for your own amusement,
but the remainder of this article will be about KoboReader.sqlite.
I hadn't used sqlite before, and I'm certainly no SQL expert. But a
little web searching and experimentation taught me what I needed to know.
First, make a local copy of KoboReader.sqlite, so you don't risk
overwriting something important during your experimentation.
The Kobo is apparently good at regenerating data it needs, but
you might lose information on books you're reading.
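If you'd rather script the copy, here is a minimal sketch. The function name is hypothetical, and the mount point is whatever you use (/kobo in the fstab example above).

```python
import os
import shutil


def backup_kobo_db(mount_point, dest_dir="."):
    """Copy .kobo/KoboReader.sqlite from the mounted Kobo into dest_dir.

    Returns the path of the new copy.
    """
    src = os.path.join(mount_point, ".kobo", "KoboReader.sqlite")
    # copy2 also preserves the file's timestamps
    return shutil.copy2(src, dest_dir)


# Example call, assuming the Kobo is mounted on /kobo:
# backup_kobo_db("/kobo")
```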
To explore the database manually, run:
sqlite3 KoboReader.sqlite
Some useful queries
Here are some useful sqlite commands, which you can generalize to
whatever you want to search for on your own Kobo. Every SQL query (though
not dot-commands like .tables) must end with a semicolon.
Show all tables in the database:
.tables
The most important ones, at least to me, are content (all your books),
Shelf (a list of your shelves/collections), and ShelfContent
(the table that assigns books to shelves).
Show all column names in a table:
PRAGMA table_info(content);
There are a lot of columns in content, so try
PRAGMA table_info(Shelf);
to see a much simpler table.
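The same information is available from Python's sqlite3 module. In this sketch the helper name is made up, and the table is an in-memory stand-in with an invented Note column; on a real copy of the database you'd connect to KoboReader.sqlite instead.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table using PRAGMA table_info."""
    # PRAGMA arguments can't be bound as parameters, so only pass
    # trusted table names here.
    rows = conn.execute("PRAGMA table_info(%s)" % table).fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in rows]


# Stand-in table; on a real database use sqlite3.connect("KoboReader.sqlite")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shelf (Name TEXT, Note TEXT)")
print(column_names(conn, "Shelf"))
```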
Show the names of all your shelves/collections:
SELECT Name FROM Shelf;
Show everything in a table:
SELECT * FROM Shelf;
Show all books assigned to shelves, and which shelves they're on:
SELECT ShelfName,ContentId FROM ShelfContent;
ContentId can be a URL to a sideloaded book, like
file:///mnt/sd/TheWitchesOfKarres.epub, or a UUID like
de98dbf6-e798-4de2-91fc-4be2723d952f for books from the Kobo store.
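A script can tell the two kinds apart fairly easily. This is only a sketch based on the two forms described above; the function name and the labels it returns are made up.

```python
import uuid


def content_kind(content_id):
    """Classify a ContentId as 'sideloaded', 'store' or 'unknown'."""
    if content_id.startswith("file://"):
        return "sideloaded"
    try:
        uuid.UUID(content_id)   # raises ValueError if it isn't a UUID
        return "store"
    except ValueError:
        return "unknown"


print(content_kind("file:///mnt/sd/TheWitchesOfKarres.epub"))
print(content_kind("de98dbf6-e798-4de2-91fc-4be2723d952f"))
```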
Show all books you have installed:
SELECT Title,Attribution,ContentID FROM content WHERE BookTitle is null ORDER BY Title;
One peculiarity of Kobo's database: each book has lots of entries,
apparently one for each chapter. The entries for chapters have the
chapter name as Title, and the book title as BookTitle. The entry
for the book as a whole has BookTitle empty, and the book title as Title.
For example, I have
file:///mnt/sd/hamlet.epub sideloaded:
sqlite> SELECT Title,BookTitle from content WHERE ContentID LIKE "%hamlet%";
HAMLET, PRINCE OF DENMARK|Hamlet
PERSONS REPRESENTED.|Hamlet
ACT I.|Hamlet
Scene II. Elsinore. A room of state in the Castle.|Hamlet
Scene III. A room in Polonius's house.|Hamlet
Scene IV. The platform.|Hamlet
Scene V. A more remote part of the Castle.|Hamlet
Act II.|Hamlet
[ ... and so on ... ]
ACT V.|Hamlet
Scene II. A hall in the Castle.|Hamlet
Hamlet|
Each of these entries has Title set to the name of the chapter (an act
in the play) and BookTitle set to
Hamlet, except for the final
entry, which has Title set to
Hamlet and BookTitle set to nothing.
That's why you need that
WHERE BookTitle is null clause in the query if you
just want a list of your books.
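Here's a small sketch of the same distinction from Python, using an in-memory stand-in that mirrors a few of the Hamlet rows above. The function names are hypothetical, and a real content table has many more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for a copy of KoboReader.sqlite
conn.execute("CREATE TABLE content (Title TEXT, BookTitle TEXT)")
conn.executemany("INSERT INTO content VALUES (?, ?)", [
    ("ACT I.", "Hamlet"),
    ("Act II.", "Hamlet"),
    ("ACT V.", "Hamlet"),
    ("Hamlet", None),          # the whole-book entry: BookTitle is NULL
])


def whole_books(conn):
    """Titles of the per-book entries (BookTitle is NULL)."""
    cur = conn.execute(
        "SELECT Title FROM content WHERE BookTitle IS NULL ORDER BY Title")
    return [row[0] for row in cur]


def chapter_counts(conn):
    """Number of chapter entries recorded for each book title."""
    cur = conn.execute(
        "SELECT BookTitle, COUNT(*) FROM content "
        "WHERE BookTitle IS NOT NULL GROUP BY BookTitle")
    return dict(cur.fetchall())


print(whole_books(conn))
print(chapter_counts(conn))
```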
Show all books by an author:
SELECT Title,Attribution,ContentID FROM content WHERE BookTitle is null
AND Attribution LIKE "%twain%" ORDER BY Title;
Attribution is where the author's name goes. LIKE searches with % wildcards
are case insensitive (for ASCII characters, by default).
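From a script, it's safer to pass the search string as a bound parameter than to paste it into the SQL. A minimal sketch, with a made-up function name and a made-up three-row table standing in for the real content table:

```python
import sqlite3


def books_by_author(conn, author):
    """Whole-book rows whose Attribution contains author (case-insensitive)."""
    cur = conn.execute(
        "SELECT Title, Attribution FROM content "
        "WHERE BookTitle IS NULL AND Attribution LIKE ? ORDER BY Title",
        ("%" + author + "%",))
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (Title TEXT, Attribution TEXT, BookTitle TEXT)")
conn.executemany("INSERT INTO content VALUES (?, ?, ?)", [
    ("Roughing It", "Mark Twain", None),
    ("Chapter I", "Mark Twain", "Roughing It"),
    ("Hamlet", "William Shakespeare", None),
])
print(books_by_author(conn, "twain"))
```

The ? placeholder means quotes or other odd characters in an author's name can't break the query.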
Update: how to change a field
I realized I didn't include how to change a field, and
that seems to fit better with this article rather than writing
a whole new blog post just for that.
The Kobo doesn't show series order. So to find a specific book, and
then update its title to include
the series and series number, do something like this:
SELECT Title,Attribution,ContentID,BookTitle FROM content WHERE Attribution LIKE "%Doyle%" AND Title LIKE "%Adventures%";
UPDATE content SET Title="Sherlock Stories 1: The Adventures of Sherlock Holmes" WHERE Attribution LIKE "%Doyle%" AND Title LIKE "%Adventures%";
To delete an entry -- in this case I had two copies of the same book
and needed to specify the ContentID of the older one:
DELETE from content WHERE Attribution LIKE "%Doyle%" AND ContentID="file:///mnt/sd/memoirs-holmes.epub";
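When making changes from a script, matching on an exact ContentID and checking how many rows changed is a bit safer than a LIKE pattern. This is a sketch under assumptions: the function name and the adventures-holmes.epub path are made up, and the table is an in-memory stand-in. On a real device, work on a copy of the database first.

```python
import sqlite3


def retitle_book(conn, content_id, new_title):
    """Rename the whole-book entry with this exact ContentID.

    Returns the number of rows changed (normally 0 or 1).
    """
    # "with conn" commits if the block succeeds and rolls back on an exception
    with conn:
        cur = conn.execute(
            "UPDATE content SET Title = ? "
            "WHERE ContentID = ? AND BookTitle IS NULL",
            (new_title, content_id))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (ContentID TEXT, Title TEXT, BookTitle TEXT)")
conn.execute("INSERT INTO content VALUES (?, ?, NULL)",
             ("file:///mnt/sd/adventures-holmes.epub",
              "The Adventures of Sherlock Holmes"))
changed = retitle_book(conn, "file:///mnt/sd/adventures-holmes.epub",
                       "Sherlock Stories 1: The Adventures of Sherlock Holmes")
print(changed)
```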
Of course, it's a lot handier to have a program that knows these queries
so you don't have to type them in every time (especially since the sqlite3
app has no history or proper command-line editing).
But this has gotten long enough, so I'll write about that separately.
Tags: ebook, kobo, epub, database, sql, sqlite
[
19:11 Sep 03, 2015
]
Wed, 26 Aug 2015
For several years I've kept a rooted Nook Touch for reading ebooks.
But recently it's become tough to use. Newer epub books no longer
work on any version of FBReader still available for the Nook's ancient
Android 2.1, and the Nook's built-in reader has some fatal flaws: most
notably that there's no way to browse books by subject tag, and it's
painfully slow to navigate a library of 250 books when you have to start
from the As and you need to get to T, paging slowly
forward 6 books at a time.
The Kobo Touch
But with my Nook unusable, I borrowed Dave's Kobo Touch to see how
it compared. I like the hardware: same screen size as the Nook, but a
little brighter and sharper, with a smaller bezel around it, and
a spring-loaded power button in a place where it won't get pressed
accidentally when it's packed in a suitcase -- the Nook was always
coming on while in its case, and I didn't find out until I pulled it
out to read before bed and discovered the battery was too low.
The Kobo worked quite nicely as a reader, though it had a few of the
same problems as the Nook. They both insist on justifying both left
and right margins (Kobo has a preference for that, but it doesn't work
in any book I tried). More important is the lack of subject tags. The
Kobo has a "Shelves" option, called "Collections" in some versions,
but adding books to shelves manually is tedious if you have a lot of
books. (But see below.)
It also shared another Nook problem: it shows overall progress in the
book, but not how far you are from the next chapter break. There's
a choice to show either book progress or chapter progress,
but not both; and chapter progress only works for books in Kobo's
special "kepub" format (I'll write separately about that).
I miss FBReader's progress bar that shows both book and chapter progress,
and I can't fathom why that's not considered a necessary feature for
any e-reader.
But mostly, Kobo's reader was better than the Nook's.
Bookmarks weren't perfect, but they basically worked, and I
didn't even have to spend half an hour reading the manual to use them
(like I did with the Nook). The font selection was great, and the
library navigation had one great advantage over the Nook: a slider
so you could go from A to T quickly.
I liked the Kobo a lot, and promptly ordered one of my own.
It's not all perfect
There were a few disadvantages. Although the Kobo had a lot more
granularity in its line spacing and margin settings, the smallest
settings were still a lot less tight than I wanted. The Nook only
offered a few settings but the smallest setting was pretty good.
Also, the Kobo can only see books at the top level of its microSD
card. No subdirectories, which means that I can't use a program like
rsync to keep the Kobo in sync with my ebooks directory on my computer.
Not that big a deal, just a minor annoyance.
More important was the subject tagging, which is really needed in
a big library. It was pretty clear Shelves/Collections were what I
needed; but how could I get all my books into shelves without
laboriously adding them all one by one on a slow e-ink screen?
It turns out Kobo's architecture makes it pretty easy to fix these problems.
Customizing Kobo
While the rooted Nook community has been stagnant for years --
it was a cute proof of concept that, in the end, no one cared about
enough to try to maintain it -- Kobo readers are a lot easier to
hack, and there's a thriving
Kobo
community on MobileRead which has been trading tips and patches
over the years -- apparently with Kobo's blessing.
The biggest key to Kobo's customizability is that you can mount it as
a USB storage device, and one of the files that exposes is the
device's database (an sqlite file). That means that well supported
programs like Calibre can update shelves/collections on a Kobo, access
its book list, and other nifty tricks; and if you want more, you can
write your own scripts, or even access the database by hand.
I'll write separately about some Python scripts I've written to
display the database and add books to shelves, and I'll just say here
that the process was remarkably straightforward and much easier than
I usually expect when learning to access a new device.
There's lots of other customizing you can do.
There are ways of installing alternative readers on the Kobo, or installing
Python so you can write your own reader. I expected to want that,
but so far the built-in reader seems good enough.
You can also patch the OS. Kobo updates are distributed as tarballs of
binaries, and there's a very well designed, documented and supported
(by users, not by Kobo) patching script distributed on MobileRead for
each new Kobo release. I applied a few patches and was impressed by
how easy it was. And now I have tight line spacing and margins, a
slightly changed page number display at the bottom of the screen
(still only chapter or book, not both), and a search that defaults to
my local book collection rather than the Kobo store.
Stores and DRM
Oh, about the Kobo store. I haven't tried it yet, so I can't report
on that. From what I read, it's pretty good as e-bookstores go,
and a lot of Nook and Sony users apparently prefer to buy from Kobo.
But like most e-bookstores, the Kobo store uses DRM, which makes
it a pain (and is why I probably won't be using it much).
They use Adobe's DRM, and at least Adobe's Digital Editions app works
in Wine under Linux. Amazon's app no longer does, and in case you're
wondering why I didn't consider a Kindle, that's part of it.
Amazon has a bad reputation for
removing rights to previously purchased ebooks
(as well as for spying on their customers' reading habits),
and I've experienced it personally more than once.
Not only can I no longer use the Kindle app under Wine, but Amazon no
longer lets me re-download the few Kindle books I've purchased in the
past. I remember when my mother used to use the Kindle app on Android
regularly; every few weeks all her books would disappear and she'd
have to get on the phone again to Amazon to beg to have them back.
It just isn't worth the hassle. Besides, Kindles can't read public
library books (those are mostly EPUBs with Adobe DRM); and a Kindle
would require converting my whole EPUB library to MOBI. I don't see
any up side, and a lot of down side.
The Adobe scheme used by Kobo and Nook is better, but I still plan to
avoid books with DRM as much as possible. It's not the stores' fault,
and I hope Kobo does well, because they look like a good company.
It's the publishers who insist on DRM. We can only hope that some day
they come to their senses, like music publishers finally did with MP3
versus DRMed music. A few publishers have dropped DRM already, and if
we readers avoid buying DRMed ebooks, maybe the message will
eventually get through.
Tags: epub, ebook, kobo, nook, tablet
[
17:04 Aug 26, 2015
]
Thu, 20 Aug 2015
Three years ago I wanted a way to manage tags on e-books in a
lightweight way,
without having to maintain a Calibre database and fire up the
Calibre GUI app every time I wanted to check a book's tags.
I couldn't find anything, nor did I find any relevant Python
libraries, so I reverse engineered the (simple, XML-based)
EPUB format and wrote a
Python
script to show or modify epub tags.
I've been using that script ever since. It's great for Project
Gutenberg books, which tend to be overloaded with tags that I don't
find very useful for categorizing books
("United States -- Social life and customs -- 20th century -- Fiction")
but lacking in tags that I would find useful ("History", "Science Fiction",
"Mystery").
But it wasn't easy to include it in other programs. For the last week
or so I've been fiddling with a Kobo ebook reader, and I wanted to
write programs that could read epub and also speak Kobo-ese. (I'll
write separately about the joys of Kobo hacking. It's really a neat
little e-reader.)
So I've factored my epubtag script into a usable Python module: as well
as being a standalone program for viewing epub book data,
it's now easy to use from other programs. It's available on GitHub:
epubtag.py:
parse EPUB metadata and view or change subject tags.
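To give an idea of how simple the format is, here is a minimal sketch of reading subject tags. It is not the code from epubtag.py; the function names are made up. An EPUB is a zip archive whose META-INF/container.xml points at an OPF package file, and the tags are that file's dc:subject elements. The demo builds a tiny metadata-only archive in memory, which is not a complete, valid EPUB.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def epub_subjects(epub):
    """Return the dc:subject tags of an EPUB (path or file-like object)."""
    with zipfile.ZipFile(epub) as zf:
        # container.xml names the OPF package file via rootfile/@full-path
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(".//%srootfile" % CONTAINER_NS)
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))
    return [el.text for el in opf.iter(DC_NS + "subject") if el.text]


def tiny_epub(subjects):
    """Build a metadata-only stand-in for an EPUB, in memory, for the demo."""
    container = (
        '<?xml version="1.0"?>'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        '</container>')
    tags = "".join("<dc:subject>%s</dc:subject>" % s for s in subjects)
    opf = (
        '<?xml version="1.0"?>'
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0">'
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
        '<dc:title>Demo</dc:title>' + tags + '</metadata></package>')
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/container.xml", container)
        zf.writestr("content.opf", opf)
    buf.seek(0)
    return buf


print(epub_subjects(tiny_epub(["History", "Mystery"])))
```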
Tags: epub, ebook, programming, python, kobo
[
20:27 Aug 20, 2015
]