Shallow Thoughts : : Oct
Akkana's Musings on Open Source Computing and Technology, Science, and Nature.
Thu, 31 Oct 2019
Someone on ##linux was talking about "bro pages", which turns out to
be a site that collects random short examples of how to use Linux
commands. It reminded me of
Command Line Magic,
a Twitter account I follow that gives sometimes entertaining or useful
command-line snippets.
I hadn't been to that page on the Twitter website in a while (I
usually use bitlbee for Twitter), and clicking through some of the
tweets on the "Who to follow" accounts took me to someone who'd made
a GNU
CoreUtils cheat sheet. I didn't really want the printed cheat
sheet, but I was interested in the commands used to generate it.
The commands involved downloading an HTML page and didn't work any
more -- the page was still there but its format had changed -- but
that got me to thinking about how it might be fun to generate
something that would show me a random command and its description,
starting not from coreutils but from the set of all commands I have
installed.
I can get a list of commands from the installed man pages in
/usr/share/man -- section 1, for basic commands, and section
8, for system-admin commands. (The other sections are for things
like library routines, system calls, files etc.)
So I can pick a random man page like this:
ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1
which gives me a filename like
xlsfonts.1.gz.
The man pages are in troff format, gzipped. You can run zcat on
them, but extracting the name and description still isn't entirely
trivial. In most cases, it comes right after the .SH NAME
line, so you could do something like
zcat $(ls -1 /usr/share/man/man1/* /usr/share/man/man8/* | shuf -n 1) | grep -A1 NAME | tail -1
(the * for the two directories causes ls to list the full pathname,
like
/usr/share/man/man1/xlsfonts.1.gz, instead of just the
filename, xlsfonts.1.gz).
But that doesn't work in every case: sometimes the description is more than
one line, or there's a line between the NAME line and the actual description.
A better way is to use apropos (man -k), which already knows how to
search through man pages and parse them to extract the command name and
description. For that, you need to
start with the filename (I'm going to drop those *s from the command since
I don't need the full pathname any more) and get rid of everything
after the first '.'.
You can do that with sed 's_\.[0-9].*__': it looks for everything
starting with a dot (\.) followed by a digit ([0-9] -- sed doesn't
understand \d) followed by anything (.*) and replaces all of it
with nothing, the empty string.
Here's the full command:
apropos $(ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1 | sed 's_\.[0-9].*__')
Sometimes it will give more than one command: for instance,
just now, testing it, it found /usr/share/man/man8/snap.8.gz,
pared that down to just snap, and apropos snap
found ten different commands. But that's unusual; most of the time
you'll just get one or two, and of course you could add another
| shuf -n 1
if you want to make sure you get only one line.
Update: man -f
is a better solution: that will give a single
apropos-like description line for only the command picked by the
first shuf command.
man -f $(ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1 | sed 's_\.[0-9].*__')
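The whole pipeline above can also be sketched in Python. This is a rough equivalent, not the post's own code: it assumes the same /usr/share/man layout, and the helper names are mine.

```python
import os
import random
import re
import subprocess

def strip_section(manfile):
    # Mirror the sed expression 's_\.[0-9].*__': drop everything from
    # the first dot-plus-digit onward, e.g. 'xlsfonts.1.gz' -> 'xlsfonts'.
    return re.sub(r'\.[0-9].*$', '', manfile)

def random_command_description():
    # Collect man page filenames from sections 1 and 8, pick one at
    # random, and ask man -f for its one-line apropos-style description.
    pages = []
    for d in ('/usr/share/man/man1', '/usr/share/man/man8'):
        if os.path.isdir(d):
            pages.extend(os.listdir(d))
    name = strip_section(random.choice(pages))
    return subprocess.run(['man', '-f', name],
                          capture_output=True, text=True).stdout

if __name__ == '__main__':
    print(random_command_description())
```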
It's kind of a fun way to discover new commands you may not have
heard of. I'm going to put it in my .zlogin.
Tags: linux, cmdline
[ 13:22 Oct 31, 2019 | More linux/cmdline | permalink to this entry ]
Thu, 24 Oct 2019
We had a lovely hike yesterday down Bandelier's Frijoles Canyon, from
Ponderosa to Upper Crossing then down the canyon to the Visitor Center
(where we'd left a shuttle car), for a one-way total of about eight miles.
Our hiking group had missed the peak of the aspens last week
at Santa Barbara, so we were happy to discover that fall is still
in full swing in Frijoles.
There are no aspens down this low, but there are plenty of
cottonwoods, a related tree that turns the same vibrant yellow.
Plus maples, Gambel oaks giving a nice multicolored show this year,
and plenty of other trees I can't identify.
(For people who really need a dose of aspen, the slopes of
Pajarito Mountain still have some nice bright yellow patches.)
We had lunch by the natural bridge: a huge rock slab that collapsed
into the opposite canyon wall, probably some time in the last ten years.
No one is sure exactly when it happened because the canyon was
impassable after the floods of 2013 and not very passable even in
the decade before that. But old-timers who used to hike the canyon
decades ago swear there was nothing like the natural bridge there, and it's
the sort of thing you'd remember. When the trail was cleared out and
made passable again a couple of years ago, the bridge was a surprise
to everyone. Geology in action in our lifetimes!
I started the hike in two sweaters and gloves, but was down to a t-shirt
by the time we got to the canyon.
The next day, today, the temperature was around 31 at the house, and
it snowed! New Mexico weather is so much more fun than California.
As I write this, there's still snow on the ground, the juncos are going
crazy fueling up for winter, the sun is out here but I'm looking down
at drifting fog over White Rock Canyon. And the oatmeal cookies just
came out of the oven.
More photos from Frijoles Canyon:
Frijoles Canyon Trail in October 2019.
Tags: nature
[ 14:18 Oct 24, 2019 | More nature | permalink to this entry ]
Thu, 10 Oct 2019
A reader pointed out to me that the RSS page on my blog hadn't been
updated since May.
That must be when I updated PyBlosxom to the latest version and
switched to the python3 branch. Python 2, as you probably know,
is no longer going to be supported starting sometime next year
(the exact date in 2020 doesn't seem to be set). Anyway, PyBlosxom
was one of the few tools I use that still depended on Python 2,
and since I saw they had a python3 branch, I tried it.
PyBlosxom is no longer a project in active development: as I
understand it, some of the developers drifted off to other blogging
systems, others decided that it worked well enough and didn't really
need further tweaking. But someone, at some point, did make a try at
porting it to Python 3; and when I tried the python3 branch, I was
able to make it work after I made a couple of very minor tweaks (which
I contributed back to the project on GitHub, and they were accepted).
Everything went fine for a few months, until I received the alert that
the index.rss and index.rss20 files weren't being generated.
Curiously, the RSS files for each individual post are still there;
just not the index.rss and index.rss20.
I found there was already a
bug filed
on that.
I tried the workaround mentioned in the bug, at the same time adding
Atom to the RSS 0.9.1 and RSS 2.0 flavors I've had in the past.
I haven't generated Atom for all the old entries, but any new entries
starting now should be available as Atom.
Fingers crossed! If you're reading this, then it worked and my RSS
pages are back. Sorry about the RSS hiatus.
Tags: blogging, pyblosxom, rss
[ 09:10 Oct 10, 2019 | More blogging | permalink to this entry ]
Tue, 01 Oct 2019
A friend recently introduced me to Folium, a quick and easy way of
making web maps with Python.
The Folium
Quickstart gets you started in a hurry. In just two lines of Python
(plus the import line), you can write an HTML file that you can load
in any browser to display a slippy map, or you can display it inline in a
Jupyter notebook.
Folium uses the very mature Leaflet
JavaScript library under the hood. But it lets you do all the
development in a few lines of Python rather than a lot of lines
of Javascript.
Having run through most of the quickstart, I was excited to try
Folium for showing GeoJSON polygons. I'm helping with a redistricting
advocacy project; I've gotten shapefiles for the voting districts in
New Mexico, and have been wanting to build a map that shows them,
which I can then extend for other purposes.
Step 1: Get Some GeoJSON
The easiest place to get voting district data is from TIGER, the
geographic arm of the US Census.
For the districts resulting from the 2010 Decadal Census,
start here:
Cartographic
Boundary Files - Shapefile (you can also get them as KML,
but not as GeoJSON). There's a category called
"Congressional Districts: 116th Congress", and farther down the page,
under "State-based Files", you can get shapefiles for the upper and
lower houses of your state.
You can also likely download them from
www2.census.gov/geo/tiger/TIGER2010/,
as long as you can figure out how to decode the obscure directory
names, like ELSD and POINTLM; I never found anything that could
decode them.
(Before I found the TIGER district data, I took a more roundabout path that
involved learning how to merge shapes; more on that in a separate post.)
Okay, now you have a shapefile (unzip the TIGER file to get a bunch of
files with names like cb_2018_35_sldl_500k.* -- shape "files"
are an absurd ESRI concept that actually use seven separate files for
each dataset, so they're always packaged as a zip archive and programs
that read shapefiles expect that when you pass them a .shp,
there will be a bunch of other files with the same basename but
different extensions in the same directory).
But Folium can't handle shapefiles, only GeoJSON. You can do that
translation with a GDAL command:
ogr2ogr -t_srs EPSG:4326 -f GeoJSON file.json file.shp
Or you can do it programmatically with the GDAL Python bindings:
from osgeo import gdal

def shapefile2geojson(infile, outfile, fieldname):
    '''Translate a shapefile to GeoJSON.'''
    # fieldname isn't needed for the translation step itself.
    options = gdal.VectorTranslateOptions(format="GeoJSON",
                                          dstSRS="EPSG:4326")
    gdal.VectorTranslate(outfile, infile, options=options)
The EPSG:4326 specifier, if you read man ogr2ogr, is supposedly
for reprojecting the data into WGS84 coordinates, which is what most
web maps want (EPSG:4326 is an alias for WGS84). But it has an equally
important function: even if your input shapefile is already in WGS84,
adding that option somehow ensures that GDAL will use degrees as the
output unit. The TIGER data already uses degrees so you don't strictly
need that, but some data, like the precinct data I got from UNM RGIS,
uses other units, like meters, which will confuse Folium and Leaflet.
And the TIGER data isn't in WGS84 anyway; it's in GRS1980 (you can tell
by reading the .prj file in the same directory as the .shp).
Don't ask me about details of all these different geodetic reference systems;
I'm still trying to figure it all out. Anyway, I recommend adding
EPSG:4326 as the safest option.
Step 2: Show the GeoJSON in a Folium Map
In theory, looking at the Folium Quickstart, all you need is
folium.GeoJson(filename, name='geojson').add_to(m).
In practice, you'll probably want to do more, like
- color different regions differently
- show some sort of highlight when the user chooses a region
- show which region is which
Each of these requires some extra work.
You can color the regions with a style function:
folium.GeoJson(jsonfile, style_function=style_fcn).add_to(m)
Here's a simple style function that chooses random colors:
import random

def random_html_color():
    # randint's upper bound is inclusive, so use 255, not 256.
    r = random.randint(0, 255)
    g = random.randint(0, 255)
    b = random.randint(0, 255)
    return '#%02x%02x%02x' % (r, g, b)

def style_fcn(x):
    return { 'fillColor': random_html_color() }
I wanted to let the user choose regions by clicking, but it turns out
Folium doesn't have much support for that (it may be coming in a
future release). You can do it by reading the GeoJSON yourself,
splitting it into separate polygons and making them all separate Folium
Polygons or GeoJSON objects, each with its own click behavior; but
if you don't mind highlights and popups on mouseover instead of
requiring a click, that's pretty easy. For highlighting in red whenever
the user mouses over a polygon, set this highlight_function:
def highlight_fcn(x):
    return { 'fillColor': '#ff0000' }
For tooltips:
tooltip = folium.GeoJsonTooltip(fields=['NAME'])
In this case, 'NAME' is the field in the shapefile that I want to display
when the user mouses over the region.
If you're not sure of the field name, the nice thing about GeoJSON
is that it's human readable. Generally you'll want to look inside
"features", then under "properties", to find the fields defined for each polygon.
For instance, if I use jq to prettyprint the JSON generated for the NM
state house districts:
$ jq . House.json | less
{
  "type": "FeatureCollection",
  "name": "cb_2018_35_sldl_500k",
  "crs": {
    "type": "name",
    "properties": {
      "name": "urn:ogc:def:crs:OGC:1.3:CRS84"
    }
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "STATEFP": "35",
        "SLDLST": "009",
        "AFFGEOID": "620L600US35009",
        "GEOID": "35009",
        "NAME": "9",
        "LSAD": "LL",
        "LSY": "2018",
        "ALAND": 3405159792,
        "AWATER": 5020507
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          ...
If you still aren't sure which property name means what (for example,
"NAME" could be anything), just keep browsing through the JSON file to
see which fields change from feature to feature and give the values
you're looking for, and it should become obvious pretty quickly.
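That scan for fields that vary from feature to feature can be scripted with nothing but the standard json module. The miniature sample below is a made-up stand-in for House.json:

```python
import json

# Made-up two-feature GeoJSON standing in for House.json.
sample = '''{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"STATEFP": "35", "NAME": "9", "LSY": "2018"},
     "geometry": null},
    {"type": "Feature",
     "properties": {"STATEFP": "35", "NAME": "10", "LSY": "2018"},
     "geometry": null}
  ]
}'''

data = json.loads(sample)

# Which property values actually differ between features?
props = [f["properties"] for f in data["features"]]
varying = {k for k in props[0] if len({p[k] for p in props}) > 1}
print(sorted(varying))   # -> ['NAME']
```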
Here's a working code example:
polidistmap.py,
and here's an example of a working map.
Tags: mapping, GIS, python, programming
[ 12:29 Oct 01, 2019 | More mapping | permalink to this entry ]