Python module for reading EPUB e-book metadata (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 20 Aug 2015

Python module for reading EPUB e-book metadata

Three years ago I wanted a way to manage tags on e-books in a lightweight way, without having to maintain a Calibre database and fire up the Calibre GUI app every time I wanted to check a book's tags. I couldn't find anything, nor did I find any relevant Python libraries, so I reverse-engineered the (simple, XML-based) EPUB format and wrote a Python script to show or modify epub tags.
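To give a rough idea of how simple the format is, here is a minimal sketch of reading subject tags using only the Python standard library. It is not the epubtag code: the function name read_epub_subjects is made up for illustration, and the sketch assumes a well-formed EPUB, which is a zip archive whose META-INF/container.xml points to an OPF package file that holds the tags as Dublin Core dc:subject elements.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and by Dublin Core metadata.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC_SUBJECT = "{http://purl.org/dc/elements/1.1/}subject"


def read_epub_subjects(epub_path):
    """Return the dc:subject strings found in an EPUB's OPF package file.

    Hypothetical helper for illustration; not part of epubtag.py.
    """
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml names the OPF package file inside the archive.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("no rootfile entry in META-INF/container.xml")
        opf_path = rootfile.get("full-path")

        # The OPF file carries the metadata, including the subject tags.
        opf = ET.fromstring(zf.read(opf_path))

    # Skip empty elements and trim surrounding whitespace.
    return [el.text.strip() for el in opf.iter(DC_SUBJECT) if el.text]


if __name__ == "__main__":
    # Usage: python subjects.py book1.epub book2.epub ...
    for path in sys.argv[1:]:
        print(path)
        for subject in read_epub_subjects(path):
            print("   ", subject)
```

Changing tags would mean editing those dc:subject elements and writing a new zip archive, which this sketch leaves out.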

I've been using that script ever since. It's great for Project Gutenberg books, which tend to be overloaded with tags that I don't find very useful for categorizing books ("United States -- Social life and customs -- 20th century -- Fiction") but lacking in tags that I would find useful ("History", "Science Fiction", "Mystery").

But it wasn't easy to include it in other programs. For the last week or so I've been fiddling with a Kobo ebook reader, and I wanted to write programs that could read epub and also speak Kobo-ese. (I'll write separately about the joys of Kobo hacking. It's really a neat little e-reader.)

So I've factored my epubtag script into a usable Python module. It still works as a standalone program for viewing epub book data, and now it's also easy to use from other programs. It's available on GitHub: epubtag.py: parse EPUB metadata and view or change subject tags.

