Making Web Maps using Python, Folium and Shapefiles (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Tue, 01 Oct 2019

Making Web Maps using Python, Folium and Shapefiles

A friend recently introduced me to Folium, a quick and easy way of making web maps with Python.

The Folium Quickstart gets you started in a hurry. In just two lines of Python (plus the import line), you can write an HTML file that you can load in any browser to display a slippy map, or you can display it inline in a Jupyter notebook.

Folium uses the very mature Leaflet JavaScript library under the hood. But it lets you do all the development in a few lines of Python rather than a lot of lines of Javascript.

Having run through most of the quickstart, I was excited to try Folium for showing GeoJSON polygons. I'm helping with a redistricting advocacy project; I've gotten shapefiles for the voting districts in New Mexico, and have been wanting to build a map that shows them which I can then extend for other purposes.

Step 1: Get Some GeoJSON

The easiest place to get voting district data is from TIGER, the geographic arm of the US Census.

For the districts resulting from the 2010 Decadal Census, start here: Cartographic Boundary Files - Shapefile (you can also get them as KML, but not as GeoJSON). There's a category called "Congressional Districts: 116th Congress", and farther down the page, under "State-based Files", you can get shapefiles for the upper and lower houses of your state.

You can also likely download them from at www2.census.gov/geo/tiger/TIGER2010/, as long as you can figure out how to decode the obscure directory names. ELSD and POINTLM, so the first step is to figure out what those mean; I never found anything that could decode them.

(Before I found the TIGER district data, I took a more roundabout path that involved learning how to merge shapes; more on that in a separate post.)

Okay, now you have a shapefile (unzip the TIGER file to get a bunch of files with names like cb_2018_35_sldl_500k.* -- shape "files" are an absurd ESRI concept that actually use seven separate files for each dataset, so they're always packaged as a zip archive and programs that read shapefiles expect that when you pass them a .shp, there will be a bunch of other files with the same basename but different extensions in the same directory).

But Folium can't handle shapefiles, only GeoJSON. You can do that translation with a GDAL command:

ogr2ogr -t_srs EPSG:4326 -f GeoJSON file.json file.shp

Or you can do it programmatically with the GDAL Python bindings:

def shapefile2geojson(infile, outfile, fieldname):
    '''Translate a shapefile to GEOJSON.'''
    options = gdal.VectorTranslateOptions(format="GeoJSON",
                                          dstSRS="EPSG:4326")
    gdal.VectorTranslate(outfile, infile, options=options)

The EPSG:4326 specifier, if you read man ogr2ogr, is supposedly for reprojecting the data into WGS84 coordinates, which is what most web maps want (EPSG:4326 is an alias for WGS84). But it has an equally important function: even if your input shapefile is already in WGS84, adding that option somehow ensures that GDAL will use degrees as the output unit. The TIGER data already uses degrees so you don't strictly need that, but some data, like the precinct data I got from UNM RGIS, uses other units, like meters, which will confuse Folium and Leaflet. And the TIGER data isn't in WGS84 anyway; it's in GRS1980 (you can tell by reading the .prj file in the same directory as the .shp). Don't ask me about details of all these different geodetic reference systems; I'm still trying to figure it all out. Anyway, I recommend adding the EPSG:4326 as the safest option.

Step 2: Show the GeoJSON in a Folium Map

In theory, looking at the Folium Quickstart, all you need is folium.GeoJson(filename, name='geojson').add_to(m). In practice, you'll probably want to more, like

Each of these requires some extra work.

You can color the regions with a style function:

folium.GeoJson(jsonfile, style_function=style_fcn).add_to(m)

Here's a simple style function that chooses random colors:

import random

def random_html_color():
    r = random.randint(0,256)
    g = random.randint(0,256)
    b = random.randint(0,256)
    return '#%02x%02x%02x' % (r, g, b)

def style_fcn(x):
    return { 'fillColor': random_html_color() }

I wanted to let the user choose regions by clicking, but it turns out Folium doesn't have much support for that (it may be coming in a future release). You can do it by reading the GeoJSON yourself, splitting it into separate polygons and making them all separate Folium Polygons or GeoJSON objects, each with its own click behavior; but if you don't mind highlights and popups on mouseover instead of requiring a click, that's pretty easy. For highlighting in red whenever the user mouses over a polygon, set this highlight_function:

def highlight_fcn(x):
    return { 'fillColor': '#ff0000' }

For tooltips:

tooltip = folium.GeoJsonTooltip(fields=['NAME'])
In this case, 'NAME' is the field in the shapefile that I want to display when the user mouses over the region. If you're not sure of the field name, the nice thing about GeoJSON is that it's human readable. Generally you'll want to look inside "features", for "properties" to find the fields defined for each polygon. For instance, if I use jq to prettyprint the JSON generated for the NM state house districts:
$ jq . House.json | less
{
  "type": "FeatureCollection",
  "name": "cb_2018_35_sldl_500k",
  "crs": {
    "type": "name",
    "properties": {
      "name": "urn:ogc:def:crs:OGC:1.3:CRS84"
    }
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "STATEFP": "35",
        "SLDLST": "009",
        "AFFGEOID": "620L600US35009",
        "GEOID": "35009",
        "NAME": "9",
        "LSAD": "LL",
        "LSY": "2018",
        "ALAND": 3405159792,
        "AWATER": 5020507
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
...

If you still aren't sure which property name means what (for example, "NAME" could be anything), just keep browsing through the JSON file to see which fields change from feature to feature and give the values you're looking for, and it should become obvious pretty quickly.

Here's a working code example: polidistmap.py, and here's an example of a working map:

Tags: , ,
[ 12:29 Oct 01, 2019    More mapping | permalink to this entry | comments ]