Shallow Thoughts : : Oct

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 24 Oct 2013

A present on the doorstep

When cat owners talk about their cats, they often tell stories about opening the door to find a "present" on the mat of a half-eaten dead bird or mouse, proudly displayed as evidence of the cat's successful night of hunting.

We don't have cats, but this morning I opened the back door to find a present on the doormat:

[Dead guava present on the doormat]

[guava eaten by a squirrel] We don't have cats, but we do have squirrels. Currently, our main visitor is a young all-black male I call Little Blackie, after the pony in True Grit. LB visits daily to take our walnut offerings, and is diligent about chasing away any interlopers who might come by. I watched a wonderful aerial chase yesterday morning, when LB leapt after a grey who fled through the trees and upside-down along a thin telephone line.

Anyway, it's guava season now, though the harvest is sparse and small, not the bounty we get most years. (Note to self: turns out guava trees actually need watering in drought years.)

Squirrels don't normally like any sort of fruit very much, or at least our spoiled eastern greys don't. But occasionally in October one will develop a taste for guava, and we'll find a few scraped-out guavas lying near the tree each day.

This is the first time we've found a dead guava left as a present on the mat, though.

[ 11:09 Oct 24, 2013    More humor | permalink to this entry | comments ]

Mon, 07 Oct 2013

Viewing HTML mail messages from Mutt (or other command-line mailers)

Command-line mailers like mutt have one disadvantage: viewing HTML mail with embedded images. Without images, HTML mail is no problem -- run it through lynx, links or w3m. But if you want to see images in place, how do you do it?

Mutt can send a message to a browser like firefox ... but only the textual part of the message. The images don't show up.

That's because mail messages include images, not as separate files, but as attachments within the same file, encoded in a format known as MIME (Multipurpose Internet Mail Extensions). An image link in the HTML, instead of looking like <img src="picture.jpg">, will instead look something like <img src="cid:0635428E-AE25-4FA0-93AC-6B8379300161"> (Apple's Mail.app) or <img src="cid:1.3631871432@web82503.mail.mud.yahoo.com"> (Yahoo's webmail).

CID stands for Content ID, and refers to the ID of the image as it is encoded in MIME inside the message. GUI mail programs, of course, know how to decode this and show the image. Mutt doesn't.
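
To make the cid: idea concrete, here is a minimal sketch (written for Python 3) that builds a tiny message of this kind with the standard email.mime classes and then parses it back. The content ID and the image bytes are made up for illustration; a real mailer will generate its own.

```python
import email
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical content ID and placeholder image bytes, for illustration only.
cid = "demo-image-1@example.invalid"
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

# The HTML part points at the image by content ID, not by filename.
msg = MIMEMultipart("related")
msg.attach(MIMEText('<p>Look:</p><img src="cid:%s">' % cid, "html"))

# The image travels in the same message, labeled with that content ID.
img = MIMEImage(fake_png, _subtype="png")
img.add_header("Content-ID", "<%s>" % cid)
msg.attach(img)

raw = msg.as_string()

# Parsing the text back shows the structure a mail reader has to deal with.
parsed = email.message_from_string(raw)
types = [p.get_content_type() for p in parsed.walk()]
print(types)
```

The HTML and the image are sibling parts of one multipart/related container, which is why a browser handed only the HTML part has nothing to resolve the cid: link against.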

A web search finds a handful of shell scripts that use the munpack program (part of the mpack package on Debian systems) to split off the files; then they use various combinations of sed and awk to try to view those files. Except that none of the scripts I found actually work for messages sent from modern mailers -- they don't decode the CID links properly.

I wasted several hours fiddling with various shell scripts, trying to adjust sed and awk commands to figure out the problem, when I had the usual epiphany that always eventually arises from shell script fiddling: "Wouldn't this be a lot easier in Python?"

Python's email package

Python has a package called email that knows how to list and unpack MIME attachments. Starting from the example near the bottom of the package's documentation page, it was easy to split off the various attachments and save them in a temp directory. The key is

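import email

# msgfile is the path to the saved mail message
with open(msgfile) as fp:
    msg = email.message_from_file(fp)

# walk() visits the message itself and then every part nested inside it
for part in msg.walk():
    print(part.get_content_type())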

That left the problem of how to match CIDs with filenames, and rewrite the links in the HTML message accordingly.

The documentation on the email package is a bit unclear, unfortunately. For instance, it doesn't give any hint about what kind of object you'll get when iterating over a message with walk, and if you try it, type() just reports 'instance'. So what operations can you expect to be legal on them? If you run help(part) in the Python console on one of the parts you get from walk, it's generally class Message, so you can use the Message API, with functions like get_content_type(), get_filename() and get_payload().
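
As a rough sketch of what those three Message functions are good for, here is one way to save every part of a message into a directory. This is Python 3 and is not the code from my script; save_parts, the "part-N" fallback names and the little test message are all invented for the example.

```python
import email
import os
import tempfile

def save_parts(msg, outdir):
    """Save every non-container part of msg into outdir.

    Returns a list of (content type, path) tuples.
    """
    saved = []
    for i, part in enumerate(msg.walk()):
        if part.is_multipart():
            continue            # containers have no payload of their own
        data = part.get_payload(decode=True)   # undoes base64 and friends
        if data is None:
            continue
        # basename() guards against directory components in attachment
        # names; parts without a filename get a made-up one.
        name = os.path.basename(part.get_filename() or "") or "part-%d" % i
        path = os.path.join(outdir, name)
        with open(path, "wb") as f:
            f.write(data)
        saved.append((part.get_content_type(), path))
    return saved

# A small hand-written message, for illustration only.
raw = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain

hello
--XX
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="data.bin"
Content-Transfer-Encoding: base64

AAEC
--XX--
"""

msg = email.message_from_string(raw)
saved = save_parts(msg, tempfile.mkdtemp())
for content_type, path in saved:
    print(content_type, path)
```

Passing decode=True to get_payload() matters here: without it, the image data would be written out still base64-encoded.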

More useful, it has dictionary keys() for the attributes it knows about each attachment. part.keys() gets you a list like

['Content-Type', 
 'Content-Transfer-Encoding',
 'Content-ID',
 'Content-Disposition' ]

So by making a list relating part.get_filename() (with a made-up filename if it doesn't have one already) to part['Content-ID'], I'd have enough information to rewrite those links.

Case-insensitive dictionary matching

But wait! Not so simple. That list is from a Yahoo mail message, but if you try keys() on a part sent by Apple Mail, instead it will be 'Content-Id'. Note the lower-case d, Id, instead of the ID that Yahoo used.

Unfortunately, a Python dictionary doesn't have a way of looking up items with the key being case-insensitive. So I used a loop:

    for k in part.keys():
        if k.lower() == 'content-id':
            print "Content ID is", part[k]

Most mailers seem to put angle brackets around the content id, so that would print things like "Content ID is <14.3631871432@web82503.mail.mud.yahoo.com>". Those angle brackets have to be removed, since the CID links in the HTML file don't have them.

for k in part.keys():
    if k.lower() == 'content-id':
        if part[k].startswith('<') and part[k].endswith('>'):
            part[k] = part[k][1:-1]

But that didn't work -- the angle brackets were still there, even though if I printed part[k][1:-1] it printed without angle brackets. What was up?

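Seemingly immutable parts inside email.Message

It turned out that the parts inside an email Message (and maybe the Message itself) behave as if they're immutable: assigning to part[k] doesn't change the value you get back. Python doesn't throw an exception; it just appears not to change anything. (According to the email package documentation, assigning to a header adds a second header with the same name rather than overwriting the existing one, and a lookup returns the first match.) So I had to make a local copy: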

for k in part.keys():
    if k.lower() == 'content-id':
        content_id = part[k]
        if content_id.startswith('<') and content_id.endswith('>'):
            content_id = content_id[1:-1]

and then save content_id, not part[k], in my list of filenames and CIDs.
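
For the curious, here is a small Python 3 sketch of how header access behaves on the plain Message class (the kind email.message_from_file returns by default), going by the email package documentation: names are matched case-insensitively, and assignment appends instead of replacing. The content ID is a made-up value.

```python
from email.message import Message

part = Message()
part["Content-Id"] = "<demo-image-1@example.invalid>"   # hypothetical ID

# Header names are matched case-insensitively on lookup, so either
# spelling finds the header however the mailer capitalized it.
same = part["content-id"] == part["Content-ID"]

# Assignment does not overwrite: it appends a second header with the
# same name, and lookup keeps returning the first one.
part["Content-Id"] = "demo-image-1@example.invalid"
all_values = part.get_all("Content-Id")
first_value = part["Content-Id"]

# To really change the value, delete the old header(s) first.
del part["Content-Id"]            # removes every header with that name
part["Content-Id"] = "demo-image-1@example.invalid"
final_values = part.get_all("Content-Id")

print(same, all_values, first_value, final_values)
```

If that holds for your Python version, a direct lookup such as part['Content-ID'] should find the header regardless of capitalization. Working on a local copy of the value is still the simplest approach when all you need is the string without its angle brackets.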

Then the rest is easy. Assuming I've built up a list called subfiles containing dictionaries with 'filename' and 'Content-Id', I can do the substitution in the HTML source:

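    import re

    htmlsrc = html_part.get_payload(decode=True)
    for sf in subfiles:
        # re.escape keeps dots and other punctuation in the CID from
        # being treated as regular expression syntax
        htmlsrc = re.sub('cid: ?' + re.escape(sf['Content-Id']),
                         'file://' + sf['filename'],
                         htmlsrc, flags=re.IGNORECASE)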
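
Putting the pieces together, a minimal end-to-end sketch in Python 3 might look like the following. It is an illustration under assumptions, not the actual viewhtmlmail script: the helper names content_id and rewrite_cid_links, the /tmp/demo paths and the sample message are all hypothetical, and it only computes the rewritten HTML without saving the images.

```python
import email
import re

def content_id(part):
    """Return the part's content ID without angle brackets, or None."""
    cid = part.get("Content-ID")      # header names match case-insensitively
    if cid is None:
        return None
    cid = cid.strip()
    if cid.startswith("<") and cid.endswith(">"):
        cid = cid[1:-1]
    return cid

def rewrite_cid_links(htmlsrc, subfiles):
    """Replace each cid: link in htmlsrc with a file:// link."""
    for sf in subfiles:
        pattern = "cid: ?" + re.escape(sf["Content-Id"])
        target = "file://" + sf["filename"]
        # A function as the replacement is inserted literally, so any
        # backslashes in the path are not treated as escapes.
        htmlsrc = re.sub(pattern, lambda m: target, htmlsrc,
                         flags=re.IGNORECASE)
    return htmlsrc

# A small hand-written message, for illustration only.
raw = """\
MIME-Version: 1.0
Content-Type: multipart/related; boundary="XX"

--XX
Content-Type: text/html; charset="us-ascii"

<img src="cid:demo.1@example.invalid">
--XX
Content-Type: image/png
Content-Id: <demo.1@example.invalid>
Content-Transfer-Encoding: base64

AAEC
--XX--
"""

msg = email.message_from_string(raw)
subfiles = []
html = None
for i, part in enumerate(msg.walk()):
    if part.get_content_type() == "text/html":
        charset = part.get_content_charset() or "utf-8"
        html = part.get_payload(decode=True).decode(charset, "replace")
    cid = content_id(part)
    if cid:
        # Hypothetical path: wherever the image would have been saved.
        subfiles.append({"filename": "/tmp/demo/part-%d.png" % i,
                         "Content-Id": cid})

result = rewrite_cid_links(html, subfiles)
print(result)
```

In Python 3, get_payload(decode=True) returns bytes, which is why the sketch decodes the HTML to a string before running the substitution.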

Then all I have to do is hook it up to a key in my .muttrc:

macro  index  <F10>  "<copy-message>/tmp/mutttmpbox\n<enter><shell-escape>~/bin/viewhtmlmail.py\n" "View HTML in browser"
macro  pager  <F10>  "<copy-message>/tmp/mutttmpbox\n<enter><shell-escape>~/bin/viewhtmlmail.py\n" "View HTML in browser"

Works nicely! Here's the complete script: viewhtmlmail.

[ 11:49 Oct 07, 2013    More tech/email | permalink to this entry | comments ]

Wed, 02 Oct 2013

Disgruntled by Grackles

On a trip last month, Mesquite, NV gave us a couple of avian delights.

First the roadrunner, strutting around a side street poking its head into bushes, hunting as we watched from the car.

[Huge flock of grackles] Then, in the evening, a convocation of grackles -- several hundred of them -- in the tree just across from the third-floor balcony at our casino hotel. All chattering with each other, making an amazing variety of noises as they flew from branch to branch, occasionally bickering or feeding each other or landing on a branch too weak to support them.

Grackles make some amazing sounds. We don't have them at home, so I only hear them on trips, but they always make me want to look for the amplifier and speakers -- it seems impossible that a medium-sized bird could be making all that sound, and such a variety of noise, all by itself.

We stood there for maybe 20 minutes, watching them and listening, shooting photos and video, before the heat (over 100 even after sunset) got to us and we had to go back into the room.

[Don't park under a grackle tree] Unfortunately, in all that time, one thing that never occurred to us was that our car was parked right under that tree. We realized that the next morning.

And we had thought we were so clever, finding the one shady spot in that parking lot!

[ 14:38 Oct 02, 2013    More travel | permalink to this entry | comments ]