More Tips on International Input (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Sat, 01 Dec 2007

More Tips on International Input

With what I learned last week, I've been able to type accented characters into GTK apps such as xchat, and a few other apps such as emacs. That's nice -- but I was still having trouble reading accented characters in mutt, or writing them in vim to send through mutt (darn terminal apps).

The biggest problem was the terminal. I was using urxvt, but it turns out that urxvt won't let me type any non-ASCII characters: it just ignores my multi-key sequences, or prints a space instead of the character I wanted. I have no idea why, but switching to plain ol' xterm solved that problem. I also had to make sure I was using a font that supports the characters I wanted (ISO 8859-1, 8859-15 or something similar), which leaves out my favorite terminal font (Schumacher Clean bold), but Bitstream Vera Sans Mono bold is almost as readable.

Of course, it's important to have your locale variables set appropriately. There are several locale variables:

LC_CTYPE
    Which encodings to use for typing and displaying characters.
LC_MESSAGES
    Which translations to use, in programs that offer them.
LC_COLLATE
    How to sort alphabetically (this one also affects whether ls groups capitalized filenames first).
LC_ALL
    Overrides any of the others.
LANG
    The default, in case none of the other variables is set.

There are a few others which control very specific features like time, numbers, money, addresses and paper size: run the locale command to see all of them.
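
The precedence among these variables is easy to get backwards, so here is a minimal sketch of it as a shell function. The function name effective_locale is made up for illustration, and the final fallback to C when nothing is set is an assumption about typical systems (the real default is up to the implementation).

```shell
# Hypothetical helper: print the value a locale category would end up
# with, following the precedence described above:
# LC_ALL first, then the category's own variable, then LANG.
effective_locale() {
    category=$1                          # e.g. LC_CTYPE
    eval "specific=\${$category:-}"      # value of that variable, if any
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: with nothing set, most systems behave like C/POSIX.
        printf '%s\n' "C"
    fi
}

# Show what the three categories discussed here resolve to right now.
for name in LC_CTYPE LC_MESSAGES LC_COLLATE; do
    printf '%s=%s\n' "$name" "$(effective_locale "$name")"
done
```

The locale command reports the same information straight from the system, so this is only meant to make the lookup order visible.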

Once I switched to xterm, I was able to set either LANG or LC_CTYPE to either en_US.UTF-8 or en_US.ISO-8859-1. I also set LC_COLLATE, plus either LANG or LC_MESSAGES, to C, so that programs show their default untranslated (usually US English) messages and so that ls groups all the capitalized files first.
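
As a concrete sketch, one combination of the settings above might look like this in a shell startup file. Picking LC_CTYPE (rather than LANG) for the encoding and LC_MESSAGES (rather than LANG) for the C setting is just one of the options described, and the sample filenames are invented for the demonstration.

```shell
# Make sure a stray LC_ALL doesn't override the settings below.
unset LC_ALL

# One possible combination of the settings described above.
export LC_CTYPE=en_US.UTF-8   # encoding for typing and display
export LC_COLLATE=C           # byte-order sorting: capitals first
export LC_MESSAGES=C          # untranslated program messages

# Demonstrate the collation effect with sort, which honors LC_COLLATE.
sorted=$(printf '%s\n' beta Alpha alpha Beta | sort)
printf '%s\n' "$sorted"
```

With LC_COLLATE=C the capitalized names (Alpha, Beta) come out ahead of the lowercase ones, which is the same grouping ls shows. If en_US.UTF-8 isn't installed on your system, the shell may print a warning about LC_CTYPE; locale -a lists the locales that are available.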

Along the way, I learned about yet another way to type accented characters.

setxkbmap -model pc104 -layout us -variant intl
switches to an international layout, in which certain punctuation keys (like ' or ~) act as prefix keys, also known as dead keys. So instead of typing [Multi] ~ n, I can just type ~ n. The catch: it makes it harder to type quotes or tildes by themselves (you have to type a space after the quote or tilde).

Even faster, the international layout also offers shortcuts to many common characters with the "AltGr" key, which I'd heard about for years but never knew how to enable. AltGr is the right alt key, and it works as a modifier: holding AltGr while typing, say, n gives an ñ. You can see a full map at Wikipedia (AltGr characters are blue, quote prefixes are red).

To get back to a US non-international layout:

setxkbmap -model pc104 -layout us
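
If you switch back and forth a lot, the two commands can be wrapped in a small toggle. This is only a sketch: next_layout_args is a made-up helper name, and it assumes a setxkbmap recent enough to support the -query option, which prints the current settings including a "variant:" line when a variant is active.

```shell
# Hypothetical helper: given the variant currently in use (empty when
# none is set), print the setxkbmap arguments for the other layout.
next_layout_args() {
    if [ "$1" = "intl" ]; then
        printf '%s\n' "-model pc104 -layout us"
    else
        printf '%s\n' "-model pc104 -layout us -variant intl"
    fi
}

# Only touch the keyboard when an X display and setxkbmap are available.
if [ -n "${DISPLAY:-}" ] && command -v setxkbmap >/dev/null 2>&1; then
    # Assumption: "setxkbmap -query" prints a line like "variant: intl".
    current=$(setxkbmap -query | awk '/^variant:/ { print $2 }')
    # Word splitting of the argument string is intentional here.
    setxkbmap $(next_layout_args "$current") ||
        echo "setxkbmap failed" >&2
fi
```

Keeping the decision in its own function means it can be checked without an X server; only the last few lines actually change the layout.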

Of course, these aren't the only keyboard layouts to choose from -- there are lots, plus you can define your own. And I was going to write a little bit about that, except it turns out they've changed it all around again since I last did that two years ago (don't you love the digital world?). So that will have to wait for another time.

But the place to start exploring is /usr/share/X11/xkb. The file symbols/us contains the definitions for those US keyboards, and I believe it's included via the files in the rules directory, probably rules/base, base.xml and base.lst. From there you're on your own. But the standard layouts probably follow the ones in the Wikipedia article on keyboard layouts.
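
One quick way to start exploring is to list the variants a symbols file defines. In these files each variant is introduced by an xkb_symbols line with the variant name in quotes, so a rough sketch (list_xkb_variants is a made-up name, and the path is the one mentioned above, which may differ on your system) is:

```shell
# Hypothetical helper: print the quoted name of every xkb_symbols
# section found in the given symbols file.
list_xkb_variants() {
    sed -n 's/^.*xkb_symbols[[:space:]]*"\([^"]*\)".*$/\1/p' "$1"
}

# Path taken from the text above; skipped quietly if it isn't there.
symbols_file=/usr/share/X11/xkb/symbols/us
if [ -r "$symbols_file" ]; then
    list_xkb_variants "$symbols_file"
fi
```

Any name it prints should be usable as the argument to -variant in the setxkbmap commands earlier, alongside -layout us.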

[ 16:48 Dec 01, 2007 ]
