I record track files in OsmAnd
for most of my hikes. But about a quarter of the
time, when I get back to the car, I forget to turn tracking off.
So I end up with a track file that shows the hike plus at least
part of the car trip afterward. Not very useful for purposes like
calculating mileage.
My PyTopo can do simple
track edits, like splitting the track into two or deleting from some
point to the end. But most of the time it's easier just to edit the
GPX file with a text editor.
Eesh, edit GPX directly?? Sounds like something you oughtn't
do, doesn't it? But actually, GPX (a form of
XML) is human readable
and editable. And this specific case, separating the walking from the
car trip in an OsmAnd track file, is particularly easy, because OsmAnd
helpfully adds a speed to every point it saves.
These instructions seem long, but really, once you've done it once,
you'll realize that it's all very straightforward; explaining the steps
is harder than doing them.
Read more ...
Tags: mapping, osmand, grep
[ 14:13 Jan 07, 2022 ]
One more brief followup on that comma-inserting sed pattern and its
earlier followup:
$ echo 20130607215015 | sed ':a;s/\b\([0-9]\+\)\([0-9]\{3\}\)\b/\1,\2/;ta'
20,130,607,215,015
In the second article, I'd mentioned that the hardest part of the exercise
was figuring out where we needed backslashes.
Devdas (f3ew) asked on Twitter
whether I would still need all the backslash escapes even
if I put the pattern in a file -- in other words, are the backslashes
merely to get the shell to pass special characters unchanged?
A good question, and I suspected the need for some of the backslashes
would disappear. So I tried this:
$ echo ':a;s/\b\([0-9]\+\)\([0-9]\{3\}\)\b/\1,\2/;ta' >/tmp/commas
$ echo 20130607215015 | sed -f /tmp/commas
And it didn't work. No commas were inserted.
The problem, it turns out, is that my shell, zsh, changed both instances
of \b to an ASCII backspace, ^H. Editing the file fixes that, and so does
$ echo -E ':a;s/\b\([0-9]\+\)\([0-9]\{3\}\)\b/\1,\2/;ta' >/tmp/commas
But that only applies to echo: zsh doesn't do the \b -> ^H substitution
in the original command, where you pass the string directly as a sed argument.
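Since echo's handling of backslashes varies from shell to shell, one way to sidestep the problem is printf with a %s format, which copies its argument through without interpreting any escapes. Here's a minimal sketch, assuming GNU sed (for \b and \+); the mktemp file and the variable names are only for illustration:

```shell
# Write the sed script to a temporary file. printf '%s\n' passes the
# argument through untouched, so \b stays a backslash followed by b.
script=$(mktemp)
printf '%s\n' ':a;s/\b\([0-9]\+\)\([0-9]\{3\}\)\b/\1,\2/;ta' > "$script"

# Run the script from the file instead of from the command line.
result=$(echo 20130607215015 | sed -f "$script")
echo "$result"

rm -f "$script"
```

With the \b sequences intact, this should print the same 20,130,607,215,015 as the original one-liner.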
Okay, with that straightened out, what about Devdas' question?
Surprisingly, it turns out that all the backslashes are still needed.
None of them go away when you echo > file, so they
weren't there just to get special characters past the shell; and if
you edit the file and try removing some of the backslashes, you'll
see that the pattern no longer works. I had thought at least some of them,
like the ones before the \{ \}, were extraneous, but even those are
still needed.
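The likely explanation is that sed defaults to basic regular expressions, where grouping parentheses and interval braces only get their special meaning when backslashed (and \+ is a GNU extension to that syntax). Assuming GNU sed, the -E flag switches to extended regular expressions, and then most of the backslashes can go. A sketch:

```shell
# Same comma-inserting loop, but with extended regular expressions (-E).
# Parentheses, plus and braces are special without backslashes here;
# \b and the back-references \1 and \2 still need theirs.
result=$(echo 20130607215015 | sed -E ':a;s/\b([0-9]+)([0-9]{3})\b/\1,\2/;ta')
echo "$result"
```

So the backslashes belong to the regular expression dialect, not to the shell, which is why moving the pattern into a file doesn't remove any of them.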
Filtering unprintable characters
As long as I'm writing about regular expressions, I learned a nice
little tidbit last week. I'm getting an increasing
flood of Asian-language spam, which my mail ISP doesn't filter out (they
use spamassassin, which is pretty useless for this sort of filtering).
I wanted a simple pattern I could pass to egrep (via procmail) that
would filter out anything with 4 or more unprintable characters
in a row. The pattern [^[:print:]]{4,}
should do it, but it wasn't working.
The problem, it turns out, is the definition of what's printable.
Apparently when the default system character set is UTF-8, just about
everything is considered printable! So the trick is that you need to
set LC_ALL to something more restrictive, like C (which basically means
ASCII), before [:print:] becomes useful for language-based filtering.
(Thanks to Mikachu for spotting the problem.)
So in a terminal, you can do something like
LC_ALL=C egrep -v '[^[:print:]]' filename
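To see the effect on a small sample, here's a sketch using the run-of-4 pattern. It assumes GNU grep and spells egrep as grep -E; the sample text and the temporary file are made up for illustration:

```shell
# Build a two-line sample: one plain ASCII line, and one line that starts
# with six non-ASCII bytes (two UTF-8 characters, written as octal escapes).
sample=$(mktemp)
printf 'plain ascii line\n\344\275\240\345\245\275 spam\n' > "$sample"

# In the C locale only ASCII counts as printable, so the second line
# contains a run of six "unprintable" bytes and -v drops it.
kept=$(LC_ALL=C grep -E -v '[^[:print:]]{4,}' "$sample")
echo "$kept"

rm -f "$sample"
```

Only the plain ASCII line should survive. Without the LC_ALL=C prefix, a UTF-8 locale would treat those bytes as printable characters and both lines would pass through.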
In procmail it was a little harder: I couldn't figure out any way to
change LC_ALL from within a procmail recipe. The only solution I came up
with was to add this to ~/.procmailrc:
export LC_ALL=C
It does work, though, and has cut the spam load by quite a bit.
Tags: zsh, regexp, sed, cmdline, grep
[ 19:35 Jul 24, 2013 ]