Accessing image metadata: storing tags inside the image file (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 08 Jan 2015

Accessing image metadata: storing tags inside the image file

A Slashdot discussion on image tagging and organization a while back got me thinking about putting image tags inside each image, in its metadata.

Currently, I use my MetaPho image tagger to update a file named Tags in the same directory as the images I'm tagging. Then I have a script called fotogr that searches for combinations of tags in these Tags files.
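For illustration, that kind of search can be sketched as a set intersection. This is not fotogr's actual code, and the Tags file format isn't shown here, so the sketch starts from an already-parsed, hypothetical mapping of tag name to filenames (all names and data below are made up) and assumes that a "combination" means every requested tag must match:

```python
def images_with_all_tags(tag_index, wanted):
    """Return the sorted filenames that carry every tag in `wanted`.

    tag_index is a hypothetical dict: tag name -> set of filenames.
    """
    wanted = list(wanted)
    if not wanted:
        return []
    # Start with the first tag's files, then narrow with each further tag.
    matches = set(tag_index.get(wanted[0], ()))
    for tag in wanted[1:]:
        matches &= set(tag_index.get(tag, ()))
    return sorted(matches)


# Made-up example data, standing in for a parsed Tags file.
example_index = {
    "desert": {"img001.jpg", "img002.jpg", "img007.jpg"},
    "sunset": {"img002.jpg", "img007.jpg", "img010.jpg"},
    "birds": {"img007.jpg"},
}

print(images_with_all_tags(example_index, ["desert", "sunset"]))
```

A tag that appears nowhere in the index simply yields an empty result rather than an error.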

That works fine. But I have occasionally wondered if I should also be saving tags inside the images themselves, in case I ever want compatibility with other programs. I decided I should at least figure out how that would work, in case I want to add it to MetaPho.

I thought it would be simple -- add some sort of key in the image's EXIF tags. But no -- EXIF has no provision for tags or keywords. JPEG (and some other formats) supports lots of metadata besides EXIF, though. Was it one of the XMP tags?

Web searching only increased my confusion; it seems that there is no standard for this, but there have been lots of pseudo-standards over the years. It's not clear what tag most programs read, but my impression is that the most common is the "Keywords" IPTC tag.
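To make that "Keywords" tag a bit less abstract, here is a rough sketch of how it is stored, as far as I understand the IPTC IIM format: each dataset is a marker byte 0x1C, a record number, a dataset number, a two-byte big-endian length, and then the value, and Keywords is record 2, dataset 25, repeated once per keyword. The function names are my own, the values are assumed to be UTF-8, and extended-length datasets are skipped. The sketch works on a raw IIM block only; finding that block inside a JPEG is a separate job that it leaves out.

```python
import struct

IPTC_MARKER = 0x1C   # every IIM dataset starts with this byte
KEYWORDS = (2, 25)   # record 2, dataset 25: Iptc.Application2.Keywords


def iter_iim_datasets(data):
    """Yield (record, dataset, value_bytes) for each dataset in a raw IIM block."""
    pos = 0
    while pos + 5 <= len(data):
        marker, record, dataset, length = struct.unpack_from(">BBBH", data, pos)
        if marker != IPTC_MARKER or length & 0x8000:
            break  # not a dataset, or an extended-length one this sketch skips
        pos += 5
        yield record, dataset, data[pos:pos + length]
        pos += length


def keywords_from_iim(data):
    """Collect the Keywords values; the dataset repeats once per keyword."""
    return [value.decode("utf-8", "replace")
            for record, dataset, value in iter_iim_datasets(data)
            if (record, dataset) == KEYWORDS]


def make_dataset(record, dataset, text):
    """Build one dataset, only so the example has something to parse."""
    value = text.encode("utf-8")
    return struct.pack(">BBBH", IPTC_MARKER, record, dataset, len(value)) + value


# Made-up sample: one title dataset (2:5) followed by two keywords (2:25).
sample = (make_dataset(2, 5, "Sunset over the mesa")
          + make_dataset(2, 25, "desert")
          + make_dataset(2, 25, "sunset"))
print(keywords_from_iim(sample))
```

The point is only that keywords are a repeated field rather than one comma-separated string, which is why a library hands them back as a list.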

Okay. So how would I read or change that from a Python program?

Lots of Python libraries can read EXIF tags, including PIL, the Python Imaging Library -- I even wrote a few years ago about reading EXIF from PIL. But writing metadata is another story.

Nearly everybody points to pyexiv2, a fairly mature library that even has a well-written pyexiv2 tutorial. Great! The only problem is that the pyexiv2 front page has a big red deprecation warning saying that it's being replaced by GExiv2, with a link that goes to a nonexistent page. Debian doesn't seem to have a package for GExiv2, nor could I find a tutorial on it anywhere.

Sigh. I have to say that pyexiv2 sounds like a much better bet for now even if it is supposedly deprecated.

Following the tutorial, I was able to whip up a little proof of concept that can look for an IPTC Keywords tag in an existing image, print out its value, add new tags to it and write it back to the file.

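# Print an image's IPTC keywords and optionally append new ones.

# Makes print() behave the same way on Python 2 and Python 3.
from __future__ import print_function

import sys
import pyexiv2

if len(sys.argv) < 2:
    print("Usage:", sys.argv[0], "imagename.jpg [tag ...]")
    sys.exit(1)

# Load the metadata already stored in the image.
metadata = pyexiv2.ImageMetadata(sys.argv[1])
metadata.read()

# Any arguments after the image name are keywords to add.
newkeywords = sys.argv[2:]

keyword_tag = 'Iptc.Application2.Keywords'
if keyword_tag in metadata.iptc_keys:
    # The image already has keywords: show them, then append the new ones.
    tag = metadata[keyword_tag]
    oldkeywords = tag.value
    print("Existing keywords:", oldkeywords)
    if not newkeywords:
        sys.exit(0)
    for newkey in newkeywords:
        oldkeywords.append(newkey)
    tag.value = oldkeywords
else:
    # No keywords yet: create the tag from scratch.
    print("No IPTC keywords set yet")
    if not newkeywords:
        sys.exit(0)
    metadata[keyword_tag] = pyexiv2.IptcTag(keyword_tag, newkeywords)

tag = metadata[keyword_tag]
print("New keywords:", tag.value)

# Save the modified metadata back into the image file.
metadata.write()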
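
One thing this proof of concept doesn't do is check for duplicates: as far as I can tell, running it twice with the same tag would store that keyword twice. A possible way around that, sketched as a plain-Python helper (merge_keywords is a hypothetical name, not part of pyexiv2) whose result could be assigned to tag.value in place of the append loop:

```python
def merge_keywords(existing, new):
    """Return the existing keywords followed by any new ones not already present.

    Order is preserved and the comparison is exact (case-sensitive).
    """
    merged = list(existing)
    seen = set(merged)
    for keyword in new:
        if keyword not in seen:
            merged.append(keyword)
            seen.add(keyword)
    return merged


print(merge_keywords(["desert", "sunset"], ["sunset", "birds"]))
```

Whether "Sunset" and "sunset" should count as the same keyword is a policy decision; this sketch treats them as different.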

Does that mean I'm immediately adding it to MetaPho? No. To be honest, I'm not sure I care very much, since I don't have any other software that uses that IPTC field and no other MetaPho user has ever asked for it. But it's nice to know that if I ever have a reason to add it, I can.

