Escaping HTML characters in Emacs (and how to do replaces in elisp) (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Wed, 01 Dec 2010

Escaping HTML characters in Emacs (and how to do replaces in elisp)

Last week I found myself writing another article that includes code snippets in HTML.

So what, you ask? The problem is, when you're writing articles in HTML, every time you include a code snippet inside a <pre> tag you invariably forget that special characters like < > & have special meanings in HTML, and must be escaped. Every < has to change to &lt;, and so forth, after you paste the code.

In vi/vim, replacing characters in a range of lines is straightforward. But I usually write longer articles in emacs, for various unimportant reasons, and although emacs has global replace, by default it works from wherever you are now (called "point" in emacs lingo) to the end of the file. So if you're trying to fix something you pasted in the middle of the article, a plain emacs replace will also change everything that comes after it.

Surely this is a wheel that has already been re-invented a thousand times, I thought! But googling and asking emacs experts turned up nothing. Looks like I'd have to write it.

And that turned out to be more difficult than I expected, for the same reason: called the plain way, emacs replace-string works the same from a program as it does interactively, replacing from point to the end of the file, and I couldn't find an obvious way to restrict it to a more limited range.

Several helpful people on #emacs chimed in with ideas, but most of them didn't pan out. But ggole knew a way to do it that was both clean and reliable (thanks!).

Here's the elisp function I ended up with. It uses save-excursion to put the cursor back where it started before you ran the function, narrow-to-region to make replace-string work only on the region, and save-restriction to get rid of that narrow-to-region after we're done. Nice!

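(defun unhtml (start end)
  "Escape the HTML special characters &, < and > between START and END."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      ;; Escape & first, so the & in the new &lt; and &gt; isn't escaped again.
      (goto-char (point-min))
      (replace-string "&" "&amp;")
      (goto-char (point-min))
      (replace-string "<" "&lt;")
      (goto-char (point-min))
      (replace-string ">" "&gt;"))))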

And yes, I used it just now on that elisp snippet.
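
One possible variation, for what it's worth: the built-in documentation for replace-string suggests that Lisp programs use a search-forward / replace-match loop instead, since that doesn't set the mark or print messages. Here's a minimal sketch of the same idea written that way. The function name my-unhtml-region and the list of pairs are made up for illustration; the narrowing trick is unchanged.

```elisp
;; Hypothetical variant of unhtml; the name and the pair list are
;; illustrative assumptions, not part of the original function.
(defun my-unhtml-region (start end)
  "Escape &, < and > as HTML entities between START and END."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      ;; "&" must come first, or the "&" in "&lt;" and "&gt;"
      ;; would get escaped a second time.
      (dolist (pair '(("&" . "&amp;") ("<" . "&lt;") (">" . "&gt;")))
        (goto-char (point-min))
        ;; With NOERROR set to t, search-forward returns nil once
        ;; nothing is left to find, which ends the loop.
        (while (search-forward (car pair) nil t)
          ;; FIXEDCASE and LITERAL are both t, so the replacement
          ;; text is inserted exactly as written.
          (replace-match (cdr pair) t t))))))
```

It's called the same way as unhtml: select the pasted snippet and run M-x my-unhtml-region. Adding another character to escape would just mean adding a pair to the list, as long as "&" stays first.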

[ 20:08 Dec 01, 2010 | linux/editors ]
