Shallow Thoughts : : Oct

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 31 Oct 2019

Command of the Day: See a Random Command Name and Description

Someone on ##linux was talking about "bro pages", which turns out to be a site that collects random short examples of how to use Linux commands. It reminded me of Command Line Magic, a Twitter account I follow that gives sometimes entertaining or useful command-line snippets.

I hadn't been to that page on the Twitter website in a while (I usually use bitlbee for Twitter), and clicking through some of the tweets on the "Who to follow" accounts took me to someone who'd made a GNU CoreUtils cheat sheet. I didn't really want the printed cheat sheet, but I was interested in the commands used to generate it. The commands involved downloading an HTML page and didn't work any more -- the page was still there but its format has changed -- but that got me to thinking about how it might be fun to generate something that would show me a random command and its description, starting not from coreutils but from the set of all commands I have installed.

I can get a list of commands from the installed man pages in /usr/share/man -- section 1, for basic commands, and section 8, for system-admin commands. (The other sections are for things like library routines, system calls, files etc.)

So I can pick a random man page like this:

ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1
which gives me a filename like xlsfonts.1.gz.

The man pages are troff format, gzipped. You can run zcat on them, but extracting the name and description still isn't entirely trivial. In most cases, it comes right after the .SH NAME line, so you could do something like

zcat $(ls -1 /usr/share/man/man1/* /usr/share/man/man8/* | shuf -n 1) | grep -A1 NAME | tail -1
(the * for the two directories causes ls to list the full pathname, like /usr/share/man/man1/xlsfonts.1.gz, instead of just the filename, xlsfonts.1.gz).

But that doesn't work in every case: sometimes the description is more than one line, or there's a line between the NAME line and the actual description.

A better way is to use apropos (man -k), which already knows how to search through man pages and parse them to extract the command name and description. For that, you need to start with the filename (I'm going to drop those *s from the command since I don't need the full pathname any more) and get rid of everything after the first '.'.

You can do that with sed 's_\.[0-9].*__': it looks for everything starting with a dot (\.) followed by a digit ([0-9] -- sed doesn't understand \d) followed by anything (.*) and replaces all of it with nothing, the empty string.
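For comparison, here's a rough Python sketch of the same substitution. It isn't part of the shell one-liner, and the helper name strip_man_suffix is made up for illustration. Python's re module does understand \d, but the pattern below keeps [0-9] to mirror the sed expression.

```python
import re

def strip_man_suffix(filename):
    """Drop the section number and everything after it, like the sed command."""
    # \. matches a literal dot, [0-9] a single digit, .* the rest of the name
    return re.sub(r'\.[0-9].*', '', filename)

print(strip_man_suffix('xlsfonts.1.gz'))
print(strip_man_suffix('update-rc.d.8.gz'))
```

Because the pattern requires a digit right after the dot, a name like update-rc.d.8.gz is trimmed at the ".8", not at the earlier ".d".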

Here's the full command:

apropos $(ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1 | sed 's_\.[0-9].*__')

Sometimes it will give more than one command: for instance, just now, testing it, it found /usr/share/man/man8/snap.8.gz, pared that down to just snap, and apropos snap found ten different commands. But that's unusual; most of the time you'll just get one or two, and of course you could add another | shuf -n 1 if you want to make sure you get only one line.

Update: man -f is a better solution: that will give a single apropos-like description line for only the command picked by the first shuf command.

man -f $(ls -1 /usr/share/man/man1/ /usr/share/man/man8 | shuf -n 1 | sed 's_\.[0-9].*__')

It's kind of a fun way to discover new commands you may not have heard of. I'm going to put it in my .zlogin.
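If you'd rather have this as a script than a shell one-liner, here's a minimal Python sketch of the same idea. The function names random_command and describe are hypothetical, and it assumes the man pages live in the two directories used above and that man is on the PATH.

```python
import os
import random
import re
import subprocess

# Assumed locations, the same two directories as the shell version
MAN_DIRS = ['/usr/share/man/man1', '/usr/share/man/man8']

def random_command(man_dirs=MAN_DIRS):
    """Return a random command name taken from the man page filenames."""
    pages = []
    for d in man_dirs:
        if os.path.isdir(d):
            pages.extend(os.listdir(d))
    if not pages:
        return None
    # Same trimming as the sed command: xlsfonts.1.gz -> xlsfonts
    return re.sub(r'\.[0-9].*', '', random.choice(pages))

def describe(command):
    """Return whatever `man -f command` prints, stripped of whitespace."""
    result = subprocess.run(['man', '-f', command],
                            capture_output=True, text=True)
    return result.stdout.strip()
```

Calling print(describe(random_command())) does roughly what the man -f pipeline does; random_command returns None if neither directory exists, so a real script would want to check for that first.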

[ 13:22 Oct 31, 2019    More linux/cmdline | permalink to this entry | ]

Thu, 24 Oct 2019

Fall in Frijoles Canyon

[Tree arch in autumn] We had a lovely hike yesterday down Bandelier's Frijoles Canyon, from Ponderosa to Upper Crossing then down the canyon to the Visitor Center (where we'd left a shuttle car), for a one-way total of about eight miles.

Our hiking group had missed the peak of the aspens last week at Santa Barbara, so we were happy to discover that fall is still in full swing in Frijoles.

There are no aspens down this low, but there are plenty of cottonwoods, a related tree that turns the same vibrant yellow. Plus maples, Gambel oaks giving a nice multicolored show this year, and plenty of other trees I can't identify. (For people who really need a dose of aspen, the slopes of Pajarito Mountain still have some nice bright yellow patches.)

[Frijoles natural bridge] We had lunch by the natural bridge: a huge rock slab that collapsed into the opposite canyon wall, probably some time in the last ten years. No one is sure exactly when it happened because the canyon was impassable after the floods of 2013 and not very passable even in the decade before that. But old-timers who used to hike the canyon decades ago swear there was nothing like the natural bridge there, and it's the sort of thing you'd remember. When the trail was cleared out and made passable again a couple of years ago, the bridge was a surprise to everyone. Geology in action in our lifetimes!

[October snow] I started the hike in two sweaters and gloves, but was down to a t-shirt by the time we got to the canyon.

The next day, today, the temperature was around 31 at the house, and it snowed! New Mexico weather is so much more fun than California.

As I write this, there's still snow on the ground, the juncos are going crazy fueling up for winter, the sun is out here but I'm looking down at drifting fog over White Rock Canyon. And the oatmeal cookies just came out of the oven.

More photos from Frijoles Canyon: Frijoles Canyon Trail in October 2019.

[ 14:18 Oct 24, 2019    More nature | permalink to this entry | ]

Thu, 10 Oct 2019

RSS was Down, Now Back Up, with Atom

A reader pointed out to me that the RSS page on my blog hadn't been updated since May.

That must be when I updated PyBlosxom to the latest version and switched to the python3 branch. Python 2, as you probably know, is no longer going to be supported starting sometime next year (the exact date in 2020 doesn't seem to be set). Anyway, PyBlosxom was one of the few tools I use that still depend only on Python 2, and since I saw they had a python3 branch, I tried it.

PyBlosxom is no longer a project in active development: as I understand it, some of the developers drifted off to other blogging systems, others decided that it worked well enough and didn't really need further tweaking. But someone, at some point, did make a try at porting it to Python 3; and when I tried the python3 branch, I was able to make it work after I made a couple of very minor tweaks (which I contributed back to the project on GitHub, and they were accepted).

Everything went fine for a few months, until I received the alert that the index.rss and index.rss20 files weren't being generated. Curiously, the RSS files for each individual post are still there; just not the index.rss and index.rss20.

I found there was already a bug filed on that. I tried the workaround mentioned in the bug, at the same time adding Atom to the RSS 0.9.1 and RSS 2.0 flavors I've had in the past. I haven't generated Atom for all the old entries, but any new entries starting now should be available as Atom.

Fingers crossed! If you're reading this, then it worked and my RSS pages are back. Sorry about the RSS hiatus.

[ 09:10 Oct 10, 2019    More blogging | permalink to this entry | ]

Tue, 01 Oct 2019

Making Web Maps using Python, Folium and Shapefiles

A friend recently introduced me to Folium, a quick and easy way of making web maps with Python.

The Folium Quickstart gets you started in a hurry. In just two lines of Python (plus the import line), you can write an HTML file that you can load in any browser to display a slippy map, or you can display it inline in a Jupyter notebook.

Folium uses the very mature Leaflet JavaScript library under the hood. But it lets you do all the development in a few lines of Python rather than a lot of lines of JavaScript.

Having run through most of the quickstart, I was excited to try Folium for showing GeoJSON polygons. I'm helping with a redistricting advocacy project; I've gotten shapefiles for the voting districts in New Mexico, and have been wanting to build a map that shows them which I can then extend for other purposes.

Step 1: Get Some GeoJSON

The easiest place to get voting district data is from TIGER, the geographic arm of the US Census.

For the districts resulting from the 2010 Decennial Census, start here: Cartographic Boundary Files - Shapefile (you can also get them as KML, but not as GeoJSON). There's a category called "Congressional Districts: 116th Congress", and farther down the page, under "State-based Files", you can get shapefiles for the upper and lower houses of your state.

You can also likely download them from www2.census.gov/geo/tiger/TIGER2010/, as long as you can figure out how to decode the obscure directory names, like ELSD and POINTLM. The first step is to figure out what those mean; I never found anything that could decode them.

(Before I found the TIGER district data, I took a more roundabout path that involved learning how to merge shapes; more on that in a separate post.)

Okay, now you have a shapefile (unzip the TIGER file to get a bunch of files with names like cb_2018_35_sldl_500k.* -- shape "files" are an absurd ESRI concept that actually use seven separate files for each dataset, so they're always packaged as a zip archive and programs that read shapefiles expect that when you pass them a .shp, there will be a bunch of other files with the same basename but different extensions in the same directory).
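To see which companion files actually came along with a given .shp, here's a small sketch in plain Python. The helper name shapefile_parts is made up, and it simply assumes that every file whose name starts with the same basename plus a dot belongs to the dataset.

```python
from pathlib import Path

def shapefile_parts(shp_path):
    """List the files in the .shp file's directory that share its basename."""
    shp = Path(shp_path)
    prefix = shp.stem + '.'
    # Matching on the prefix also catches names like basename.shp.xml
    return sorted(p.name for p in shp.parent.iterdir()
                  if p.is_file() and p.name.startswith(prefix))
```

For the New Mexico house districts, shapefile_parts('cb_2018_35_sldl_500k.shp') should list the .shp itself along with whatever other pieces were in the zip archive.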

But Folium can't handle shapefiles, only GeoJSON. You can do that translation with a GDAL command:

ogr2ogr -t_srs EPSG:4326 -f GeoJSON file.json file.shp

Or you can do it programmatically with the GDAL Python bindings:

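from osgeo import gdal

def shapefile2geojson(infile, outfile, fieldname):
    '''Translate a shapefile to GeoJSON. (fieldname is not used here.)'''
    # Reproject to EPSG:4326 so the output coordinates are in degrees
    options = gdal.VectorTranslateOptions(format="GeoJSON",
                                          dstSRS="EPSG:4326")
    gdal.VectorTranslate(outfile, infile, options=options)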

The EPSG:4326 specifier, if you read man ogr2ogr, is supposedly for reprojecting the data into WGS84 coordinates, which is what most web maps want (EPSG:4326 is an alias for WGS84). But it has an equally important function: even if your input shapefile is already in WGS84, adding that option somehow ensures that GDAL will use degrees as the output unit. The TIGER data already uses degrees so you don't strictly need that, but some data, like the precinct data I got from UNM RGIS, uses other units, like meters, which will confuse Folium and Leaflet. And the TIGER data isn't in WGS84 anyway; it's in GRS1980 (you can tell by reading the .prj file in the same directory as the .shp). Don't ask me about details of all these different geodetic reference systems; I'm still trying to figure it all out. Anyway, I recommend adding the EPSG:4326 as the safest option.
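One way to check whether a converted file really ended up in degrees is to look at the coordinate ranges. Here's a rough sketch using only the standard library; the function names are hypothetical, and it's a heuristic, not a real projection check, since data in meters that happens to sit near the origin would pass too.

```python
import json

def iter_points(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate lists."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_points(part)

def looks_like_degrees(geojson):
    """True if every coordinate fits in the longitude/latitude range."""
    for feature in geojson['features']:
        geometry = feature.get('geometry') or {}
        for x, y in iter_points(geometry.get('coordinates', [])):
            if not (-180 <= x <= 180 and -90 <= y <= 90):
                return False
    return True

def file_looks_like_degrees(path):
    """Load a GeoJSON file and run the range check on it."""
    with open(path) as f:
        return looks_like_degrees(json.load(f))
```

If file_looks_like_degrees('file.json') comes back False, the data is probably still in meters or some other projected unit, and Folium and Leaflet will likely draw it in the wrong place or not at all.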

Step 2: Show the GeoJSON in a Folium Map

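In theory, looking at the Folium Quickstart, all you need is folium.GeoJson(filename, name='geojson').add_to(m). In practice, you'll probably want to do more, like coloring the regions, highlighting a region when the mouse is over it, and showing a tooltip with its name.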

Each of these requires some extra work.

You can color the regions with a style function:

folium.GeoJson(jsonfile, style_function=style_fcn).add_to(m)

Here's a simple style function that chooses random colors:

import random

def random_html_color():
    # randint includes both endpoints, so the upper bound must be 255
    r = random.randint(0, 255)
    g = random.randint(0, 255)
    b = random.randint(0, 255)
    return '#%02x%02x%02x' % (r, g, b)

def style_fcn(x):
    return { 'fillColor': random_html_color() }
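Since the style function is passed each GeoJSON feature, the color can also be derived from the feature itself, so a district keeps the same color every time the map is generated. Here's a sketch of that variation; the function names are made up, and it assumes each feature has a 'NAME' property, which is true of the TIGER district files but may not be true of other data.

```python
import hashlib

def stable_html_color(key):
    """Map a string to an HTML color that is the same on every run."""
    digest = hashlib.sha256(key.encode('utf-8')).digest()
    # The first three bytes of the digest serve as red, green and blue
    return '#%02x%02x%02x' % (digest[0], digest[1], digest[2])

def stable_style_fcn(feature):
    # Assumption: the feature carries a 'NAME' property to key the color on
    name = str(feature['properties'].get('NAME', ''))
    return {'fillColor': stable_html_color(name)}
```

It gets passed to folium.GeoJson as style_function, exactly like the random version.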

I wanted to let the user choose regions by clicking, but it turns out Folium doesn't have much support for that (it may be coming in a future release). You can do it by reading the GeoJSON yourself, splitting it into separate polygons and making them all separate Folium Polygons or GeoJSON objects, each with its own click behavior; but if you don't mind highlights and popups on mouseover instead of requiring a click, that's pretty easy. For highlighting in red whenever the user mouses over a polygon, set this highlight_function:

def highlight_fcn(x):
    return { 'fillColor': '#ff0000' }
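The highlight function is passed to folium.GeoJson alongside the style function. Here's a minimal sketch of how the pieces might fit together; build_map, the particular colors and the starting location and zoom are placeholders of mine, not taken from polidistmap.py.

```python
def style_fcn(feature):
    # Default look: blue fill with a thin dark outline
    return {'fillColor': '#3388ff', 'color': '#333333',
            'weight': 1, 'fillOpacity': 0.4}

def highlight_fcn(feature):
    # Applied while the mouse is over the polygon
    return {'fillColor': '#ff0000', 'fillOpacity': 0.7}

def build_map(jsonfile, outfile):
    """Write an HTML map of jsonfile with mouseover highlighting."""
    # Imported here so the style functions above work without Folium installed
    import folium

    # Hypothetical starting view, roughly centered on New Mexico
    m = folium.Map(location=[34.5, -106.0], zoom_start=7)
    folium.GeoJson(jsonfile,
                   style_function=style_fcn,
                   highlight_function=highlight_fcn).add_to(m)
    m.save(outfile)
```

Something like build_map('House.json', 'index.html') should then produce a page you can load in a browser.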

For tooltips:

tooltip = folium.GeoJsonTooltip(fields=['NAME'])
In this case, 'NAME' is the field in the shapefile that I want to display when the user mouses over the region. If you're not sure of the field name, the nice thing about GeoJSON is that it's human readable. Generally you'll want to look inside "features", for "properties" to find the fields defined for each polygon. For instance, if I use jq to prettyprint the JSON generated for the NM state house districts:
$ jq . House.json | less
{
  "type": "FeatureCollection",
  "name": "cb_2018_35_sldl_500k",
  "crs": {
    "type": "name",
    "properties": {
      "name": "urn:ogc:def:crs:OGC:1.3:CRS84"
    }
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "STATEFP": "35",
        "SLDLST": "009",
        "AFFGEOID": "620L600US35009",
        "GEOID": "35009",
        "NAME": "9",
        "LSAD": "LL",
        "LSY": "2018",
        "ALAND": 3405159792,
        "AWATER": 5020507
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
...

If you still aren't sure which property name means what (for example, "NAME" could be anything), just keep browsing through the JSON file to see which fields change from feature to feature and give the values you're looking for, and it should become obvious pretty quickly.
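If paging through the JSON by eye gets tedious, a few lines of Python can summarize which properties vary from feature to feature. This is only a sketch; the function names and the max_values cutoff are made up for illustration.

```python
import json
from collections import defaultdict

def property_values(geojson, max_values=5):
    """Map each property name to a few of the distinct values it takes."""
    seen = defaultdict(list)
    for feature in geojson['features']:
        for key, value in (feature.get('properties') or {}).items():
            # Keep only the first few distinct values of each property
            if value not in seen[key] and len(seen[key]) < max_values:
                seen[key].append(value)
    return dict(seen)

def property_values_from_file(path, max_values=5):
    """Load a GeoJSON file and summarize its feature properties."""
    with open(path) as f:
        return property_values(json.load(f), max_values)
```

A property like STATEFP that shows only one value is the same for every feature, while NAME or GEOID should show a different value for each district.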

Here's a working code example: polidistmap.py, and here's an example of a working map:

[ 12:29 Oct 01, 2019    More mapping | permalink to this entry | ]