Shallow Thoughts : : Jun
Akkana's Musings on Open Source Computing and Technology, Science, and Nature.
Sat, 24 Jun 2017
Someone forwarded me a message from the Albuquerque Journal.
It was all about "New Mexico\222s schools".
Sigh. I thought I'd gotten all my Mutt charset problems fixed long
ago. My system locale is set to en_US.UTF-8, and accented characters
in Spanish and in people's names usually show up correctly.
But I do see this every now and then.
When I see it, I usually assume it's a case of incorrect encoding:
whoever sent it perhaps pasted characters from a Windows Word document
or something, and their mailer didn't properly re-encode them into
the charset they were using to send the message.
In this case, the message had the header
User-Agent: SquirrelMail/1.4.13
I suspect it came from a "Share this" link on the newspaper's website.
I used vim to look at the source of the message, and it had
Content-Type: text/plain; charset=iso-8859-1
For the bad characters, in vim I saw things like
New Mexico<92>s schools
I checked an old web page I'd bookmarked years ago that had a table
of the iso-8859-1 characters, and sure enough, hex 0x92 was an apostrophe.
What was wrong?
I got some help on the #mutt IRC channel, and, to make a long story
short, that web table I was using was wrong.
ISO-8859-1 doesn't include any printable characters in the range 8x-9x,
as you can see on the Wikipedia page for ISO/IEC 8859-1.
What was happening was that the message was really encoded as cp1252: that's where
those extra characters come from, like hex 92/octal 222 for an apostrophe,
or hex 96/octal 226 for a dash. (Nitpick: that's an en dash, but it
was used in a context that called for an em dash; if someone is going
to use something other than the plain old ASCII dash, you'd think
they'd at least use the right one. Sheesh!)
Anyway, the fix is to tell mutt that when it sees iso-8859-1,
it should use cp1252 instead:
charset-hook iso-8859-1 cp1252
Voilà! Now I could read the article about
New Mexico's schools.
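To see the difference between the two charsets outside of mutt, here is a minimal Python sketch. It assumes only the byte sequence quoted above; the helper name decode_as is made up for illustration. It decodes the same bytes both ways so you can compare the results.

```python
# The bytes as they appear in the message source: 0x92 (octal 222)
# sits where the apostrophe should be.
raw = b"New Mexico\x92s schools"


def decode_as(data, charset):
    """Decode bytes with the named charset and return the text."""
    return data.decode(charset)


# What the message claims to be (iso-8859-1) vs. what it really is (cp1252).
latin1_text = decode_as(raw, "iso-8859-1")
cp1252_text = decode_as(raw, "cp1252")

# repr() keeps the difference visible even on a terminal that
# silently swallows control characters.
print(repr(latin1_text))
print(repr(cp1252_text))
```

Decoded as iso-8859-1, byte 0x92 becomes the unprintable control character U+0092; decoded as cp1252, it becomes U+2019, the curly apostrophe the sender meant.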
A happy find related to this: it turns out there's a better way of
looking up ISO-8859 tables, and I can ditch that bookmark to the old,
erroneous page. I've known about man ascii
forever, but
somehow I'd never thought to try other charsets. Turns out
man iso_8859-1
and man iso_8859-15
have built-in tables too. Nice!
(Sadly, man utf-8
doesn't give a table. Of course,
that would be a long man page, if it did!)
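If the man pages aren't installed, a few lines of Python can print a similar table for any single-byte charset that Python knows about. This is only a sketch: the function names charset_table and print_table are hypothetical, and the default range 0xA0-0xFF is just the upper half of iso-8859-1.

```python
def charset_table(encoding="iso-8859-1", start=0xA0, end=0xFF):
    """Return (byte value, character) pairs for one single-byte charset.

    The character is None where the charset leaves the byte undefined.
    """
    rows = []
    for code in range(start, end + 1):
        try:
            char = bytes([code]).decode(encoding)
        except UnicodeDecodeError:
            char = None  # this byte has no meaning in this charset
        rows.append((code, char))
    return rows


def print_table(encoding="iso-8859-1", start=0xA0, end=0xFF):
    """Print octal, decimal, hex and the character, one byte per line."""
    for code, char in charset_table(encoding, start, end):
        shown = char if char is not None else "(undefined)"
        print(f"{code:03o}  {code:3d}  {code:02X}  {shown}")
```

For example, print_table("cp1252", 0x80, 0x9F) lists exactly the range where cp1252 and iso-8859-1 disagree.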
Tags: mutt, charsets, linux
[
11:06 Jun 24, 2017
]
Fri, 16 Jun 2017
We've had a pair of ash-throated flycatchers in the nest box I set up
in the yard. I've been watching them bring bugs to the nest for a
couple of weeks now, but this morning they've been acting unusual:
fluttering around the corner of the house near my office window,
calling to each other, not spending nearly as much time near the nest.
I suspect one or more of the chicks may have fledged this morning,
though I have yet to see more than two flycatchers at once.
They still return to the nest box occasionally (one of them just
delivered a big grasshopper), so not all the chicks have fledged yet.
Maybe if I'm lucky I'll get to see one fledge.
I hope they're not too affected by the smoky air. We have two fires
filling the air with smoke: the Bonita Fire, 50 miles north, and
as of yesterday a new fire in Jemez Springs, only about half
that distance. Yesterday my eyes were burning, my allergies were
flaring up, and the sky was worse than the worst days in Los Angeles
in the 70s. But it looks like the firefighters have gotten a handle
on both fires; today is still smoky, with a major haze down in the
Pojoaque Valley and over toward Albuquerque, but the sky above is
blue and the smoke plume from Jemez Springs is a lot smaller and
less dark than it was yesterday. Fingers crossed!
And just a few minutes ago, a buck with antlers in velvet wandered
into our garden to take a drink at the pond. Such a nice change
from San Jose!
Tags: nature, birds
[
10:40 Jun 16, 2017
]
Fri, 09 Jun 2017
Back in 2006, I wrote an article on
making
a modified copy of sgml-mode.el to make it possible to use
double-dashed clauses -- like this -- in HTML without messing up
auto-fill mode.
That worked, but the problem is that if you use your own copy of
sgml-mode.el, you miss out on any other improvements to
HTML and SGML mode. There have been some good ones, like smarter
rewrap of paragraphs.
I had previously tried lots of ways of customizing
sgml-mode without actually replacing it, but never found a way.
Now, in emacs 24.5.1, I've found an easier way that seems to work.
The annoying mis-indentation comes from the function
sgml-comment-indent-new-line, which sets variables
comment-start, comment-start-skip and comment-end
and then calls comment-indent-new-line.
All I had to do was redefine sgml-comment-indent-new-line
to call comment-indent-new-line without first defining the
comment characters:
(defun sgml-comment-indent-new-line (&optional soft)
  (comment-indent-new-line soft))
Finding emacs source
I wondered if it might be better to call whatever underlying
indent-new-line function comment-indent-new-line calls,
or maybe just to call (newline-and-indent).
But how to find the code of comment-indent-new-line?
Happily, describe-function (on C-h f, or if like me
you use C-h for backspace, try F1 f) tells you exactly what
file defines a function, and it even gives you a link to click on
to view the source. Wonderful!
It turned out just calling (newline-and-indent) wasn't enough,
because sgml-comment-indent-new-line typically
calls comment-indent-new-line when you've typed a space on the
end of a line, and that space gets wrapped and then messes up
indentation. But you can fix that by copying just a couple of lines
from the source of comment-indent-new-line:
(defun sgml-comment-indent-new-line (&optional soft)
  (save-excursion (forward-char -1) (delete-horizontal-space))
  (delete-horizontal-space)
  (newline-and-indent))
That's a little longer than the other definition, but it's cleaner
since comment-indent-new-line is doing all sorts of extra work
you don't need if you're not handling comments.
I'm not sure that both of the delete-horizontal-space lines are
needed: the documentation for delete-horizontal-space says it deletes
both forward and backward. But I have to assume they had a good reason
for having both: maybe the (forward-char -1) is to guard
against spurious spaces already having been inserted in the next line.
I'm keeping it, to be safe.
Tags: emacs, editors
[
11:16 Jun 09, 2017
]
Mon, 05 Jun 2017
I know, I know. We use mailers like mutt because we don't believe in
HTML mail and prefer plaintext. Me, too.
But every now and then a situation comes up where it would be useful
to send something with emphasis. Or maybe you need to highlight
changes in something. For whatever reason, every now and then
I wish I had a way to send HTML mail.
I struggled with that way back, never did find a way, and ended up
writing a Python script, htmlmail.py, to send an HTML page, including
images, as email.
Sending HTML Email
But just recently I found a neat mutt hack. It turns out it's quite
easy to send HTML mail.
First, edit the HTML source in your usual mutt message editor (or
compose the HTML some other way, and insert the file). Note: if
there's any quoted text, you'll have to put a <pre> around
it, or otherwise turn it into something that will display nicely
in HTML.
Write the file and exit the editor. Then,
in the Compose menu, type Ctrl-T to edit the attachment type.
Change the type from text/plain to text/html.
That's it! Send it, and it will arrive looking like a regular HTML
email, just as if you'd used one of them newfangled gooey mail clients.
(No inline images, though.)
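What the Ctrl-T trick changes is the Content-Type header on the message body. As a rough illustration of the same idea in Python (this is not the htmlmail.py script mentioned above, and the function name and addresses are placeholders), the standard email library can build a text/html message like this:

```python
from email.message import EmailMessage


def build_html_message(sender, recipient, subject, html_body):
    """Build a single-part message whose body is sent as text/html."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # subtype="html" is what turns text/plain into text/html,
    # the same change made by hand in mutt's Compose menu.
    msg.set_content(html_body, subtype="html")
    return msg


# Placeholder addresses, for illustration only.
msg = build_html_message(
    "me@example.com",
    "you@example.com",
    "Test",
    "<p>This is <b>important</b>.</p>",
)
print(msg["Content-Type"])
```

Actually sending it would need something like smtplib and a mail server you have access to, which is left out here.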
Viewing HTML Email
Finding out how easy that was made me wonder why the other direction
isn't easier. Of course, I have my mailcap set up so that mutt uses
lynx automatically to view HTML email:
text/html; lynx -dump %s; nametemplate=%s.html; copiousoutput
Lynx handles things like paragraph breaks and does an okay job of
showing links; but it completely drops all emphasis, like bold,
italic, headers, and colors. My terminal can display all those styles
just fine. I've also tried links, elinks, and w3m, but none of them
seem to be able to handle any text styling.
Some of them will do bold if you run them interactively, but none
of them do italic or colors, and none of them will do bold with -dump,
even if you tell them what terminal type you want to use.
Why is that so hard?
I never did find a solution, but it's worth noting some useful
sites I found along the way: tips for testing bold, italics etc.
in a terminal, and for testing whether the terminal supports italics,
which gave me these useful shell commands and functions:
echo -e "\e[1mbold\e[0m"
echo -e "\e[3mitalic\e[0m"
echo -e "\e[4munderline\e[0m"
echo -e "\e[9mstrikethrough\e[0m"
echo -e "\e[31mHello World\e[0m"
echo -e "\x1B[31mHello World\e[0m"
ansi() { echo -e "\e[${1}m${*:2}\e[0m"; }
bold() { ansi 1 "$@"; }
italic() { ansi 3 "$@"; }
underline() { ansi 4 "$@"; }
strikethrough() { ansi 9 "$@"; }
red() { ansi 31 "$@"; }
And in testing, I found that a lot of fonts didn't offer italics.
One that does is Terminus, so if your normal font doesn't,
you can run a terminal with Terminus:
xterm -fn '-*-terminus-bold-*-*-*-20-*-*-*-*-*-*-*'
Not that it matters since none of the text-mode browsers offer italic
anyway. But maybe you'll find some other use for italic in a terminal.
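One possible use, as a rough sketch rather than a real replacement for lynx -dump: a small Python filter that turns a few HTML emphasis tags into the same escape codes as the echo tests above. The class and function names are made up for illustration, and it ignores everything else a real HTML renderer has to handle (paragraphs, links, entities in attributes, and so on).

```python
from html.parser import HTMLParser

# SGR codes matching the echo tests: 1 = bold, 3 = italic, 4 = underline.
TAG_CODES = {"b": "1", "strong": "1", "i": "3", "em": "3", "u": "4"}
RESET = "\x1b[0m"


class AnsiEmphasis(HTMLParser):
    """Copy text through, replacing emphasis tags with ANSI escapes."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.active = []  # SGR codes currently switched on

    def handle_starttag(self, tag, attrs):
        code = TAG_CODES.get(tag)
        if code:
            self.active.append(code)
            self.out.append(f"\x1b[{code}m")

    def handle_endtag(self, tag):
        code = TAG_CODES.get(tag)
        if code and code in self.active:
            # Drop the most recent matching code, reset everything,
            # then switch the styles that are still open back on.
            idx = len(self.active) - 1 - self.active[::-1].index(code)
            del self.active[idx]
            self.out.append(RESET)
            self.out.extend(f"\x1b[{c}m" for c in self.active)

    def handle_data(self, data):
        self.out.append(data)


def html_to_ansi(html):
    """Return the text of an HTML fragment with emphasis as ANSI codes."""
    parser = AnsiEmphasis()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling print(html_to_ansi("plain <b>bold</b> and <i>italic</i>")) in a terminal that supports the styles should show each word in its named style. Any tag not in TAG_CODES is simply dropped.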
Tags: mutt, email, linux
[
18:28 Jun 05, 2017
]