Shallow Thoughts : : programming

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Sat, 14 Jan 2017

Plotting election (and other county-level) data with Python Basemap

After my arduous search for open 2016 election data by county, as a first test I wanted one of those red-blue-purple charts of how Democratic or Republican each county's vote was.

I used the Basemap package for plotting. It used to be part of matplotlib, but it's been split off into its own toolkit, grouped under mpl_toolkits: on Debian, it's available as python-mpltoolkits.basemap, or you can find Basemap on GitHub.

It's easiest to start with the fillstates.py example that shows how to draw a US map with different states colored differently. You'll need the three shapefiles (because of ESRI's silly shapefile format): st99_d00.dbf, st99_d00.shp and st99_d00.shx, available in the same examples directory.

Of course, to plot counties, you need county shapefiles as well. The US Census has county shapefiles at several different resolutions (I used the 500k version). Then you can plot state and counties outlines like this:

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt

def draw_us_map():
    # Set the lower left and upper right limits of the bounding box:
    lllon = -119
    urlon = -64
    lllat = 22.0
    urlat = 50.5
    # and calculate a centerpoint, needed for the projection:
    centerlon = float(lllon + urlon) / 2.0
    centerlat = float(lllat + urlat) / 2.0

    m = Basemap(resolution='i',  # crude, low, intermediate, high, full
                llcrnrlon = lllon, urcrnrlon = urlon,
                lon_0 = centerlon,
                llcrnrlat = lllat, urcrnrlat = urlat,
                lat_0 = centerlat,
                projection='tmerc')

    # Read state boundaries.
    shp_info = m.readshapefile('st99_d00', 'states',
                               drawbounds=True, color='lightgrey')

    # Read county boundaries
    shp_info = m.readshapefile('cb_2015_us_county_500k',
                               'counties',
                               drawbounds=True)

if __name__ == "__main__":
    draw_us_map()
    plt.title('US Counties')
    # Get rid of some of the extraneous whitespace matplotlib loves to use.
    plt.tight_layout(pad=0, w_pad=0, h_pad=0)
    plt.show()
[Simple map of US county borders]

Accessing the state and county data after reading shapefiles

Great. Now that we've plotted all the states and counties, how do we get a list of them, so that when I read out "Santa Clara, CA" from the data I'm trying to plot, I know which map object to color?

After calling readshapefile('st99_d00', 'states'), m has two new members, both lists: m.states and m.states_info.

m.states_info[] is a list of dicts mirroring what was in the shapefile. For the Census state list, the useful keys are NAME, AREA, and PERIMETER. There's also STATE, which is an integer (not restricted to 1 through 50) but I'll get to that.

If you want the shape for, say, California, iterate through m.states_info[] looking for the one where m.states_info[i]["NAME"] == "California". Note the index i: the shape coordinates will be in m.states[i] (in Basemap map coordinates, not latitude/longitude).
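
For instance, here's a minimal sketch of that lookup (the variable names are mine, not from the final script):

    for i, state_info in enumerate(m.states_info):
        if state_info["NAME"] == "California":
            # m.states[i] is a list of (x, y) pairs in map coordinates.
            california_shape = m.states[i]
            break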

Correlating states and counties in Census shapefiles

County data is similar, with county names in m.counties_info[i]["NAME"]. Remember that STATE integer? Each county has a STATEFP, m.counties_info[i]["STATEFP"], that matches some state's m.states_info[i]["STATE"].

But doing that search every time would be slow. So right after calling readshapefile for the states, I make a table of states. Empirically, STATE in the state list goes up to 72. Why 72? Shrug.

    MAXSTATEFP = 73
    states = [None] * MAXSTATEFP
    for state in m.states_info:
        statefp = int(state["STATE"])
        # Many states have multiple entries in m.states (because of islands).
        # Only add it once.
        if not states[statefp]:
            states[statefp] = state["NAME"]

That'll make it easy to look up a county's state name quickly when we're looping through all the counties.

Calculating colors for each county

Time to figure out the colors from the Deleetdk election results CSV file. Reading lines from the CSV file into a dictionary is superficially easy enough:

    fp = open("tidy_data.csv")
    reader = csv.DictReader(fp)

    # Make a dictionary of all "county, state" and their colors.
    county_colors = {}
    for county in reader:
        # What color is this county?
        pop = float(county["votes"])
        blue = float(county["results.clintonh"])/pop
        red = float(county["results.trumpd"])/pop
        county_colors["%s, %s" % (county["name"], county["State"])] \
            = (red, 0, blue)

But in practice, that wasn't good enough, because the county names in the Deleetdk data didn't always match the official Census county names.

Fuzzy matches

For instance, the CSV file had no results for Alaska or Puerto Rico, so I had to skip those. Non-ASCII characters were a problem: "Doña Ana" county in the census data was "Dona Ana" in the CSV. I had to strip off " County", " Borough" and similar terms: "St Louis" in the census data was "St. Louis County" in the CSV. Some names were capitalized differently, like PLYMOUTH vs. Plymouth, or Lac Qui Parle vs. Lac qui Parle. And some names were just different, like "Jeff Davis" vs. "Jefferson Davis".

To get around that I used SequenceMatcher to look for fuzzy matches when I couldn't find an exact match:

from difflib import SequenceMatcher

def fuzzy_find(s, slist):
    '''Try to find a fuzzy match for s in slist.
    '''
    best_ratio = -1
    best_match = None

    ls = s.lower()
    for ss in slist:
        r = SequenceMatcher(None, ls, ss.lower()).ratio()
        if r > best_ratio:
            best_ratio = r
            best_match = ss
    if best_ratio > .75:
        return best_match
    return None

Correlate the county names from the two datasets

It's finally time to loop through the counties in the map to color and plot them.

Remember STATE vs. STATEFP? It turns out there are a few counties in the census county shapefile with a STATEFP that doesn't match any STATE in the state shapefile. Mostly they're in the Virgin Islands and I don't have election data for them anyway, so I skipped them for now. I also skipped Puerto Rico and Alaska (no results in the election data) and counties that had no corresponding state: I'll omit that code here, but you can see it in the final script, linked at the end.

    for i, county in enumerate(m.counties_info):
        countyname = county["NAME"]
        try:
            statename = states[int(county["STATEFP"])]
        except IndexError:
            print countyname, "has out-of-index statefp of", county["STATEFP"]
            continue

        countystate = "%s, %s" % (countyname, statename)
        try:
            ccolor = county_colors[countystate]
        except KeyError:
            # No exact match; try for a fuzzy match
            fuzzyname = fuzzy_find(countystate, county_colors.keys())
            if fuzzyname:
                ccolor = county_colors[fuzzyname]
                county_colors[countystate] = ccolor
            else:
                print "No match for", countystate
                continue

        countyseg = m.counties[i]
        poly = Polygon(countyseg, facecolor=ccolor)  # edgecolor="white"
        ax.add_patch(poly)
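
(In case you're tracing through this: Polygon is matplotlib.patches.Polygon -- from matplotlib.patches import Polygon -- and ax is the plot's axes, e.g. ax = plt.gca().)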

Moving Hawaii

Finally, although the CSV didn't have results for Alaska, it did have Hawaii. To display it, you can move it when creating the patches:

    countyseg = m.counties[i]
    if statename == 'Hawaii':
        countyseg = [(x + 5750000, y - 1400000) for x, y in countyseg]
    poly = Polygon(countyseg, facecolor=ccolor)
    ax.add_patch(poly)
The offsets are in map coordinates and are empirical; I fiddled with them until Hawaii showed up at a reasonable place.

[Blue-red-purple 2016 election map]

Well, that was a much longer article than I intended. Turns out it takes a fair amount of code to correlate several datasets and turn them into a map. But a lot of the work will be applicable to other datasets.

Full script on GitHub: Blue-red map using Census county shapefile

[ 15:10 Jan 14, 2017    More programming | permalink to this entry | comments ]

Thu, 22 Dec 2016

Tips on Developing Python Projects for PyPI

I wrote two recent articles on Python packaging: Distributing Python Packages Part I: Creating a Python Package and Distributing Python Packages Part II: Submitting to PyPI. I was able to get a couple of my programs packaged and submitted.

Ongoing Development and Testing

But then I realized all was not quite right. I could install new releases of my package -- but I couldn't run it from the source directory any more. How could I test changes without needing to rebuild the package for every little change I made?

Fortunately, it turned out to be fairly easy. Set PYTHONPATH to a directory that includes all the modules you normally want to test. For example, inside my bin directory I have a python directory where I can symlink any development modules I might need:

mkdir ~/bin/python
ln -s ~/src/metapho/metapho ~/bin/python/

Then add the directory at the beginning of PYTHONPATH:

export PYTHONPATH=$HOME/bin/python

With that, I could test from the development directory again, without needing to rebuild and install a package every time.
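
If you're ever unsure which copy of a module Python is actually picking up -- the symlinked development copy or an installed release -- you can ask the module itself (metapho is just my example here):

python -c 'import metapho; print(metapho.__file__)'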

Cleaning up files used in building

Building a package leaves some extra files and directories around, and git status will whine at you since they're not version controlled. Of course, you could gitignore them, but it's better to clean them up after you no longer need them.

To do that, you can add a clean command to setup.py.

import os

from setuptools import Command

class CleanCommand(Command):
    """Custom clean command to tidy up the project root."""
    user_options = []
    def initialize_options(self):
        pass
    def finalize_options(self):
        pass
    def run(self):
        os.system('rm -vrf ./build ./dist ./*.pyc ./*.tgz ./*.egg-info ./docs/sphinxdoc/_build')
(Obviously, that includes file types beyond what you need for just cleaning up after package building. Adjust the list as needed.)

Then in the setup() function, add these lines:

      cmdclass={
          'clean': CleanCommand,
      }

Now you can type

python setup.py clean
and it will remove all the extra files.

Keeping version strings in sync

It's so easy to update the __version__ string in your module and forget that you also have to do it in setup.py, or vice versa. Much better to make sure they're always in sync.

I found several versions of that using system("grep..."), but I decided to write my own that doesn't depend on system(). (Yes, I should do the same thing with that CleanCommand, I know.)

def get_version():
    '''Read the pytopo module versions from pytopo/__init__.py'''
    with open("pytopo/__init__.py") as fp:
        for line in fp:
            line = line.strip()
            if line.startswith("__version__"):
                parts = line.split("=")
                if len(parts) > 1:
                    # Strip whitespace and the quotes around the version string:
                    return parts[1].strip().strip('"').strip("'")

Then in setup():

      version=get_version(),

Much better! Now you only have to update __version__ inside your module and setup.py will automatically use it.
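
(This assumes the module defines its version as a simple assignment near the top of pytopo/__init__.py, something like __version__ = "1.5" -- the number there is just an example.)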

Using your README for a package long description

setup has a long_description for the package, but you probably already have some sort of README in your package. You can use it for your long description this way:

# Utility function to read the README file.
# Used for the long_description.
def read(fname):
    return open(os.path.join(os.path.dirname(__file__), fname)).read()

Then, in the setup() arguments:

      long_description=read('README'),

[ 10:15 Dec 22, 2016    More programming | permalink to this entry | comments ]

Sat, 17 Dec 2016

Distributing Python Packages Part II: Submitting to PyPI

In Part I, I discussed writing a setup.py to make a package you can submit to PyPI. Today I'll talk about better ways of testing the package, and how to submit it so other people can install it.

Testing in a VirtualEnv

You've verified that your package installs. But you still need to test it and make sure it works in a clean environment, without all your developer settings.

The best way to test is to set up a "virtual environment", where you can install your test packages without messing up your regular runtime environment. I shied away from virtualenvs for a long time, but they're actually very easy to set up:

virtualenv venv
source venv/bin/activate

That creates a directory named venv under the current directory, which it will use to install packages. Then you can pip install packagename or pip install /path/to/packagename-version.tar.gz

Except -- hold on! Nothing in Python packaging is that easy. It turns out there are a lot of packages that won't install inside a virtualenv, and one of them is PyGTK, the library I use for my user interfaces. Attempting to install pygtk inside a venv gets:

********************************************************************
* Building PyGTK using distutils is only supported on windows. *
* To build PyGTK in a supported way, read the INSTALL file.    *
********************************************************************

Windows only? Seriously? PyGTK works fine on both Linux and Mac; it's packaged on every Linux distribution, and on Mac it's packaged with GIMP. But for some reason, whoever maintains the PyPI PyGTK packages hasn't bothered to make it work on anything but Windows, and PyGTK seems to be mostly an orphaned project so that's not likely to change.

(There's a package called ruamel.venvgtk that's supposed to work around this, but it didn't make any difference for me.)

The solution is to let the virtualenv use your system-installed packages, so it can find GTK and other non-PyPI packages there:

virtualenv --system-site-packages venv
source venv/bin/activate

I also found that if I had a ~/.local directory (where packages normally go if I use pip install --user packagename), sometimes pip would install to .local instead of the venv. I never did track down why this happened some times and not others, but when it happened, a temporary mv ~/.local ~/old.local fixed it.

Test your Python package in the venv until everything works. When you're finished with your venv, you can run deactivate and then remove it with rm -rf venv.

Tag it on GitHub

Is your project ready to publish?

If your project is hosted on GitHub, you can have pypi download it automatically. In your setup.py, set

download_url='https://github.com/user/package/tarball/tagname',

Check that in. Then make a tag and push it:

git tag 0.1 -m "Name for this tag"
git push --tags origin master

Try to make your tag match the version you've set in setup.py and in your module.

Push it to pypitest

Register a new account and password on both pypitest and pypi.

Then create a ~/.pypirc that looks like this:

[distutils]
index-servers =
  pypi
  pypitest

[pypi]
repository=https://pypi.python.org/pypi
username=YOUR_USERNAME
password=YOUR_PASSWORD

[pypitest]
repository=https://testpypi.python.org/pypi
username=YOUR_USERNAME
password=YOUR_PASSWORD

Yes, those passwords are in cleartext. Incredibly, there doesn't seem to be a way to store an encrypted password or even have it prompt you. There are tons of complaints about that all over the web but nobody seems to have a solution. You can specify a password on the command line, but that's not much better. So use a password you don't use anywhere else and don't mind too much if someone guesses.

Update: Apparently there's a newer method called twine that solves the password encryption problem. Read about it here: Uploading your project to PyPI. You should probably use twine instead of the setup.py commands discussed in the next paragraph.
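
For reference, the twine version of the upload step looks roughly like this (a sketch I haven't tested against this package; twine reads the same ~/.pypirc repository sections):

python setup.py sdist
twine upload -r pypitest dist/*
twine upload -r pypi dist/*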

Now register your project and upload it:

python setup.py register -r pypitest
python setup.py sdist upload -r pypitest

Wait a few minutes: it takes pypitest a little while before new packages become available. Then go to your venv (to be safe, maybe delete the old venv and create a new one, or at least pip uninstall) and try installing:

pip install -i https://testpypi.python.org/pypi YourPackageName

If you get "No matching distribution found for packagename", wait a few minutes then try again.

If it all works, then you're ready to submit to the real pypi:

python setup.py register -r pypi
python setup.py sdist upload -r pypi

Congratulations! If you've gone through all these steps, you've uploaded a package to pypi. Pat yourself on the back and go tell everybody they can pip install your package.

Some useful reading

Some pages I found useful:

A great tutorial except that it forgets to mention signing up for an account: Python Packaging with GitHub

Another good tutorial: First time with PyPI

Allowed PyPI classifiers -- the categories your project fits into. Unfortunately there aren't very many of those, so you'll probably be stuck with 'Topic :: Utilities' and not much else.

Python Packages and You: not a tutorial, but a lot of good advice on style and designing good packages.

[ 16:19 Dec 17, 2016    More programming | permalink to this entry | comments ]

Sun, 11 Dec 2016

Distributing Python Packages Part I: Creating a Python Package

I write lots of Python scripts that I think would be useful to other people, but I've put off learning how to submit to the Python Package Index, PyPI, so that my packages can be installed using pip install.

Now that I've finally done it, I see why I put it off for so long. Unlike programming in Python, packaging is a huge, poorly documented hassle, and it took me days to get a working package. Maybe some of the hints here will help other struggling Pythonistas.

Create a setup.py

The setup.py file is the file that describes the files in your project and other installation information. If you've never created a setup.py before, Submitting a Python package with GitHub and PyPI has a decent example, and you can find lots more good examples with a web search for "setup.py", so I'll skip the basics and just mention some of the parts that weren't straightforward.

Distutils vs. Setuptools

However, there's one confusing point that no one seems to mention. setup.py examples all rely on a predefined function called setup, but some examples start with

from distutils.core import setup
while others start with
from setuptools import setup

In other words, there are two different versions of setup! What's the difference? I still have no idea. The setuptools version seems to be a bit more advanced, and I found that when using distutils.core, I'd sometimes get weird errors when trying to follow suggestions I found on the web. So I ended up using the setuptools version.

But I didn't initially have setuptools installed (it's not part of the standard Python distribution), so I installed it from the Debian package:

apt-get install python-setuptools python-wheel

The python-wheel package isn't strictly needed, but I found I got assorted warnings from pip install later in the process ("Cannot build wheel") unless I installed it, so I recommend you install it from the start.

Including scripts

setup.py has a scripts option to include scripts that are part of your package:

    scripts=['script1', 'script2'],

But when I tried to use it, I had all sorts of problems, starting with scripts not actually being included in the source distribution. There isn't much support for using scripts -- it turns out you're actually supposed to use something called console_scripts, which is more elaborate.

First, you can't have a separate script file, or even a __main__ inside an existing class file. You must have a function, typically called main(), so you'll typically have this:

def main():
    # do your script stuff

if __name__ == "__main__":
    main()

Then add something like this to your setup.py:

      entry_points={
          'console_scripts': [
              'script1=yourpackage.filename:main',
              'script2=yourpackage.filename2:main'
          ]
      },

There's a secret undocumented alternative that a few people use for scripts with graphical user interfaces: use 'gui_scripts' rather than 'console_scripts'. It seems to work when I try it, but the fact that it's not documented and none of the Python experts even seem to know about it scared me off, and I stuck with 'console_scripts'.

Including data files

One of my packages, pytopo, has a couple of files it needs to install, like an icon image. setup.py has a provision for that:

      data_files=[('/usr/share/pixmaps',      ["resources/appname.png"]),
                  ('/usr/share/applications', ["resources/appname.desktop"]),
                  ('/usr/share/appname',      ["resources/pin.png"]),
                 ],

Great -- except it doesn't work. None of the files actually gets added to the source distribution.

One solution people mention to a "files not getting added" problem is to create an explicit MANIFEST file listing all files that need to be in the distribution. Normally, setup generates the MANIFEST automatically, but apparently it isn't smart enough to notice data_files and include those in its generated MANIFEST.

I tried creating a MANIFEST listing all the .py files plus the various resources -- but it didn't make any difference. My MANIFEST was ignored.

The solution turned out to be creating a MANIFEST.in file, which is used to generate a MANIFEST. It's easier than creating the MANIFEST itself: you don't have to list every file, just patterns that describe them:

include setup.py
include packagename/*.py
include resources/*
If you have any scripts that don't use the extension .py, don't forget to include them as well. This may have been why scripts= didn't work for me earlier, but by the time I found out about MANIFEST.in I had already switched to using console_scripts.

Testing setup.py

Once you have a setup.py, use it to generate a source distribution with:

python setup.py sdist
(You can also use bdist to generate a binary distribution, but you'll probably only need that if you're compiling C as part of your package. Source dists are apparently enough for pure Python packages.)

Your package will end up in dist/packagename-version.tar.gz so you can use tar tf dist/packagename-version.tar.gz to verify what files are in it. Work on your setup.py until you don't get any errors or warnings and the list of files looks right.
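
To pull those pieces together, here's roughly what a skeleton setup.py using the options discussed above might look like (all the names and paths are placeholders):

from setuptools import setup

setup(name="packagename",
      version="0.1",
      description="One-line description of the package",
      author="Your Name",
      packages=["packagename"],
      entry_points={
          'console_scripts': [
              'script1=packagename.filename:main',
          ]
      },
      data_files=[('/usr/share/appname', ["resources/pin.png"])],
     )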

Congratulations -- you've made a Python package! I'll post a followup article in a day or two about more ways of testing, and how to submit your working package to PyPI.

Update: Part II is up: Distributing Python Packages Part II: Submitting to PyPI.

[ 12:54 Dec 11, 2016    More programming | permalink to this entry | comments ]

Wed, 05 Oct 2016

Play notes, chords and arbitrary waveforms from Python

Reading Stephen Wolfram's latest discussion of teaching computational thinking (which, though I mostly agree with it, is more an extended ad for Wolfram Programming Lab than a discussion of what computational thinking is and why we should teach it), I found myself musing over ideas for future computer classes for Los Alamos Makers. Students, and especially kids, like to see something other than words on a screen. Graphics and games are good, or robotics when possible ... but another fun project a novice programmer can appreciate is music.

I found myself curious what you could do with Python, since I hadn't played much with Python sound generation libraries. I did discover a while ago that Python is rather bad at playing audio files, though I did eventually manage to write a music player script that works quite well. What about generating tones and chords?

A web search revealed that this is another thing Python is bad at. I found lots of people asking about chord generation, and a handful of half-baked ideas that relied on long-obsolete packages or external programs. But none of it actually worked, at least without requiring Windows or relying on larger packages like fluidsynth (which looked worth exploring some day when I have more time).

Play an arbitrary waveform with Pygame and NumPy

But I did find one example based on a long-obsolete Python package called Numeric which, when rewritten to use NumPy, actually played a sound. You can take a NumPy array and play it using a pygame.sndarray object this way:

import pygame, pygame.sndarray

# The mixer has to be initialized before making sounds, and its settings
# have to match the samples we'll generate: 44100 Hz, signed 16-bit, mono.
pygame.mixer.pre_init(44100, -16, 1)
pygame.init()

def play_for(sample_wave, ms):
    """Play the given NumPy array, as a sound, for ms milliseconds."""
    sound = pygame.sndarray.make_sound(sample_wave)
    sound.play(-1)
    pygame.time.delay(ms)
    sound.stop()

Then you just need to calculate the waveform you want to play. NumPy can generate sine waves on its own, while scipy.signal can generate square and sawtooth waves. Like this:

import numpy
import scipy.signal

sample_rate = 44100

def sine_wave(hz, peak, n_samples=sample_rate):
    """Compute N samples of a sine wave with given frequency and peak amplitude.
       Defaults to one second.
    """
    length = sample_rate / float(hz)
    omega = numpy.pi * 2 / length
    xvalues = numpy.arange(int(length)) * omega
    onecycle = peak * numpy.sin(xvalues)
    return numpy.resize(onecycle, (n_samples,)).astype(numpy.int16)

def square_wave(hz, peak, duty_cycle=.5, n_samples=sample_rate):
    """Compute N samples of a sine wave with given frequency and peak amplitude.
       Defaults to one second.
    """
    t = numpy.linspace(0, 1, 500 * 440/hz, endpoint=False)
    wave = scipy.signal.square(2 * numpy.pi * 5 * t, duty=duty_cycle)
    wave = numpy.resize(wave, (n_samples,))
    return (peak / 2 * wave.astype(numpy.int16))

# Play A (440Hz) for 1 second as a sine wave:
play_for(sine_wave(440, 4096), 1000)

# Play A-440 for 1 second as a square wave:
play_for(square_wave(440, 4096), 1000)

Playing chords

That's all very well, but it's still a single tone, not a chord.

To generate a chord of two notes, you can add the waveforms for the two notes. For instance, 440Hz is concert A, and the A one octave above it is double the frequency, or 880 Hz. If you wanted to play a chord consisting of those two As, you could do it like this:

play_for(sum([sine_wave(440, 4096), sine_wave(880, 4096)]), 1000)

Simple octaves aren't very interesting to listen to. What you want is chords like major and minor triads and so forth. If you google for chord ratios Google helpfully gives you a few of them right off, then links to a page with a table of ratios for some common chords.

For instance, the major triad ratios are listed as 4:5:6. What does that mean? It means that for a C-E-G triad (the first C chord you learn in piano), the E's frequency is 5/4 of the C's frequency, and the G is 6/4 of the C.
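
Building the same kind of triad on concert A, for example, gives 440 Hz, 550 Hz (5/4 of 440) and 660 Hz (6/4 of 440).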

You can pass that list, [4, 5, 6], to a function that will calculate the right ratios to produce the set of waveforms you need to add to get your chord:

def make_chord(hz, ratios):
    """Make a chord based on a list of frequency ratios."""
    sampling = 4096
    chord = sine_wave(hz, sampling)
    for r in ratios[1:]:
        chord = sum([chord, sine_wave(hz * r / ratios[0], sampling)])
    return chord

def major_triad(hz):
    return make_chord(hz, [4, 5, 6])

play_for(major_triad(440), 1000)
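
Other chords work the same way once you know their ratios. For instance, a minor triad's ratios are 10:12:15, so you could add something like this (my addition, following the same pattern):

def minor_triad(hz):
    return make_chord(hz, [10, 12, 15])

play_for(minor_triad(440), 1000)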

Even better, you can pass in the waveform you want to use when you're adding instruments together:

def make_chord(hz, ratios, waveform=None):
    """Make a chord based on a list of frequency ratios
       using a given waveform (defaults to a sine wave).
    """
    sampling = 4096
    if not waveform:
        waveform = sine_wave
    chord = waveform(hz, sampling)
    for r in ratios[1:]:
        chord = sum([chord, waveform(hz * r / ratios[0], sampling)])
    return chord

def major_triad(hz, waveform=None):
    return make_chord(hz, [4, 5, 6], waveform)

play_for(major_triad(440, square_wave), 1000)

There are still some problems. For instance, sawtooth_wave() works fine individually or for pairs of notes, but triads of sawtooths don't play correctly. I'm guessing something about the sampling rate is making their overtones cancel out part of the sawtooth wave. Triangle waves (in scipy.signal, that's a sawtooth wave with rising ramp width of 0.5) don't seem to work right even for single tones. I'm sure these are solvable, perhaps by fiddling with the sampling rate. I'll probably need to add graphics so I can look at the waveform for debugging purposes.
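
The full script also defines sawtooth_wave(), which I won't reproduce exactly here; a generator along the same lines as square_wave() might look roughly like this (my sketch, not the code from the script -- note that the samples are scaled before converting to int16, since sawtooth samples aren't just +/-1):

def sawtooth_wave(hz, peak, rising_ramp_width=1, n_samples=sample_rate):
    """Compute N samples of a sawtooth wave with given frequency and peak amplitude.
       rising_ramp_width of 1 is a rising ramp, .5 is a triangle wave.
    """
    t = numpy.linspace(0, 1, 500 * 440 / hz, endpoint=False)
    wave = scipy.signal.sawtooth(2 * numpy.pi * 5 * t, width=rising_ramp_width)
    wave = numpy.resize(wave, (n_samples,))
    return (peak / 2 * wave).astype(numpy.int16)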

In any case, it was a fun morning hack. Most chords work pretty well, and it's nice to know how to play any waveform I can generate.

The full script is here: play_chord.py on GitHub.

[ 11:29 Oct 05, 2016    More programming | permalink to this entry | comments ]

Sat, 06 Aug 2016

Adding a Back button in Python Webkit-GTK

I have a little browser script in Python, called quickbrowse, based on Python-Webkit-GTK. I use it for things like quickly calling up an anonymous window with full javascript and cookies, for when I hit a page that doesn't work with Firefox and privacy blocking; and as a quick solution for calling up HTML conversions of doc and pdf email attachments.

Python-webkit comes with a simple browser as an example -- on Debian it's installed in /usr/share/doc/python-webkit/examples/browser.py. But it's very minimal, and lacks important basic features like command-line arguments. One of those basic features I've been meaning to add is Back and Forward buttons.

Should be easy, right? Of course webkit has a go_back() method, so I just have to add a button and call that, right? Ha. It turned out to be a lot more difficult than I expected, and although I found a fair number of pages asking about it, I didn't find many working examples. So here's how to do it.

Add a toolbar button

In the WebToolbar class (derived from gtk.Toolbar), in __init__(), after initializing the parent class and before creating the location text entry (assuming you want your buttons to the left of the location bar), create the two buttons:

        backButton = gtk.ToolButton(gtk.STOCK_GO_BACK)
        backButton.connect("clicked", self.back_cb)
        self.insert(backButton, -1)
        backButton.show()

        forwardButton = gtk.ToolButton(gtk.STOCK_GO_FORWARD)
        forwardButton.connect("clicked", self.forward_cb)
        self.insert(forwardButton, -1)
        forwardButton.show()

Now create those callbacks you just referenced:

    def back_cb(self, w):
        self.emit("go-back-requested")

    def forward_cb(self, w):
        self.emit("go-forward-requested")

That's right, you can't just call go_back on the web view, because GtkToolbar doesn't know anything about the window containing it. All it can do is pass signals up the chain.

But wait -- it can't even pass signals unless you define them. There's a __gsignals__ object defined at the beginning of the class that needs all its signals spelled out. In this case, what you need is

       "go-back-requested": (gobject.SIGNAL_RUN_FIRST,
                              gobject.TYPE_NONE, ()),
       "go-forward-requested": (gobject.SIGNAL_RUN_FIRST,
                              gobject.TYPE_NONE, ()),
Now these signals will bubble up to the window containing the toolbar.

Handle the signals in the containing window

So now you have to handle those signals in the window. In WebBrowserWindow (derived from gtk.Window), in __init__ after creating the toolbar:

        toolbar.connect("go-back-requested", self.go_back_requested_cb,
                        self.content_tabs)
        toolbar.connect("go-forward-requested", self.go_forward_requested_cb,
                        self.content_tabs)

And then of course you have to define those callbacks:

def go_back_requested_cb (self, widget, content_pane):
    # Oops! What goes here?
def go_forward_requested_cb (self, widget, content_pane):
    # Oops! What goes here?

But whoops! What do we put there? It turns out that WebBrowserWindow has no better idea than WebToolbar did of where its content is or how to tell it to go back or forward. What it does have is a ContentPane (derived from gtk.Notebook), which is basically just a container with no exposed methods that have anything to do with web browsing.

Get the BrowserView for the current tab

Fortunately we can fix that. In ContentPane, you can get the current page (meaning the current browser tab, in this case); and each page has a child, which turns out to be a BrowserView. So you can add this function to ContentPane to help other classes get the current BrowserView:

    def current_view(self):
        return self.get_nth_page(self.get_current_page()).get_child()

And now, using that, we can define those callbacks in WebBrowserWindow:

def go_back_requested_cb (self, widget, content_pane):
    content_pane.current_view().go_back()
def go_forward_requested_cb (self, widget, content_pane):
    content_pane.current_view().go_forward()

Whew! That's a lot of steps for something I thought was going to be just adding two buttons and two callbacks.

[ 16:45 Aug 06, 2016    More programming | permalink to this entry | comments ]

Tue, 05 Apr 2016

Modifying a git repo so you can pull without a password

There's been a discussion in the GIMP community about setting up git repos to host contributed assets like scripts, plug-ins and brushes, to replace the long-stagnant GIMP Plug-in Repository. One of the suggestions involves having lots of tiny git repos rather than one that holds all the assets.

That got me to thinking about one annoyance I always have when setting up a new git repository on github: the repository is initially configured with an ssh URL, so I can push to it; but that means I can't pull from the repo without typing my ssh password (more accurately, the password to my ssh key).

Fortunately, there's a way to fix that: a git configuration can have one url for pulling source, and a different pushurl for pushing changes.

These are defined in the file .git/config inside each repository. So edit that file and take a look at the [remote "origin"] section.

For instance, in the GIMP source repositories, hosted on git.gnome.org, instead of the default of url = ssh://git.gnome.org/git/gimp I can set

pushurl = ssh://git.gnome.org/git/gimp
url = git://git.gnome.org/gimp
(disclaimer: I'm not sure this is still correct; my gnome git access stopped working -- I think it was during the Heartbleed security fire drill, or one of those -- and never got fixed.)

For GitHub the syntax is a little different. When I initially set up a repository, the url comes out something like url = git@github.com:username/reponame.git (sometimes the git@ part isn't included), and the password-free pull URL is something you can get from github's website. So you'll end up with something like this:

pushurl = git@github.com:username/reponame.git
url = https://github.com/username/reponame.git
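
By the way, you don't have to edit .git/config by hand: git remote set-url can make the same change (username and reponame are placeholders, of course):

git remote set-url origin https://github.com/username/reponame.git
git remote set-url --push origin git@github.com:username/reponame.git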

Automating it

That's helpful, and I've made that change on all of my repos. But I just forked another repo on github, and as I went to edit .git/config I remembered what a pain this had been to do en masse on all my repos; and how it would be a much bigger pain to do it on a gazillion tiny GIMP asset repos if they end up going with that model and I ever want to help with the development. It's just the thing that should be scriptable.

However, the rules for what constitutes a valid git passwordless pull URL, and what constitutes a valid ssh writable URL, seem to encompass a lot of territory. So the quickie Python script I whipped up to modify .git/config doesn't claim to handle everything; it only handles the URLs I've encountered personally on Gnome and GitHub. Still, that should be useful if I ever have to add multiple repos at once. The script: repo-pullpush (yes, I know it's a terrible name) on GitHub.

[ 12:28 Apr 05, 2016    More programming | permalink to this entry | comments ]

Fri, 27 Nov 2015

Getting around make clean or make distclean aclocal failures

Keeping up with source trees for open source projects, it often happens that you pull the latest source, type make, and get an error like this (edited for brevity):

$ make
cd . && /bin/sh ./missing --run aclocal-1.14
missing: line 52: aclocal-1.14: command not found
WARNING: `aclocal-1.14' is missing on your system. You should only need it if you modified `acinclude.m4' or `configure.ac'. You might want to install the `Automake' and `Perl' packages. Grab them from any GNU archive site.

What's happening is that make is set up to run ./autogen.sh (similar to running ./configure except it does some other stuff tailored to people who build from the most current source tree) automatically if anything has changed in the tree. But if the version of aclocal has changed since the last time you ran autogen.sh or configure, then running configure with the same arguments won't work.

Often, running a make distclean, to clean out all local configuration in your tree and start from scratch, will fix the problem. A simpler make clean might even be enough. But when you try it, you get the same aclocal error.

Whoops! make clean runs make, which triggers the rule that configure has to run before make, which fails.

It would be nice if the make rules were smart enough to notice this and not require configure or autogen if the make target is something simple like clean or distclean. Alas, in most projects, they aren't.

But it turns out that even if you can't run autogen.sh with your usual arguments -- e.g. ./autogen.sh --prefix=/usr/local/gimp-git -- running ./autogen.sh by itself with no extra arguments will often fix the problem.

This happens to me often enough with the GIMP source tree that I made a shell alias for it:

alias distclean="./autogen.sh && ./configure && make clean"

Saving your configure arguments

Of course, this wipes out any arguments you've previously passed to autogen and configure. So assuming this succeeds, your very next action should be to run autogen again with the arguments you actually want to use, e.g.:

./autogen.sh --prefix=/usr/local/gimp-git

Before you ran the distclean, you could get those arguments by looking at the first few lines of config.log. But after you've run distclean, config.log is gone -- what if you forgot to save the arguments first? Or what if you just forget that you need to re-run autogen.sh again after your distclean?

To guard against that, I wrote a somewhat more complicated shell function to use instead of the simple alias I listed above.

The first trick is to get the arguments you previously passed to configure. You can parse them out of config.log:

$ egrep '^  \$ ./configure' config.log
  $ ./configure --prefix=/usr/local/gimp-git --enable-foo --disable-bar

Adding a bit of sed to strip off the beginning of the command, you could save the previously used arguments like this:

    args=$(egrep '^  \$ ./configure' config.log | sed 's_^  \$ ./configure __')

(There's a better place for getting those arguments, config.status -- but parsing them from there is a bit more complicated, so I'll follow up with a separate article on that, chock-full of zsh goodness.)

So here's the distclean shell function, written for zsh:

distclean() {
    setopt localoptions errreturn

    args=$(egrep '^  \$ ./configure' config.log | sed 's_^  \$ ./configure __')
    echo "Saved args:" $args
    ./autogen.sh
    ./configure
    make clean

    echo
    echo "==========================="
    echo "Running ./autogen.sh $args"
    sleep 3
    ./autogen.sh $args
}

The setopt localoptions errreturn at the beginning is a zsh-ism that tells the shell to exit if there's an error. You don't want to forge ahead and run configure and make clean if your autogen.sh didn't work right. errreturn does much the same thing as the && between the commands in the simpler shell alias above, but with cleaner syntax.

If you're using bash, you could string all the commands together on one line instead, with && between them, something like this:

./autogen.sh && ./configure && make clean && ./autogen.sh $args

Or perhaps some bash user will tell me of a better way.

[ 13:33 Nov 27, 2015    More programming | permalink to this entry | comments ]