Shallow Thoughts

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Sat, 14 Jan 2017

Plotting election (and other county-level) data with Python Basemap

After my arduous search for open 2016 election data by county, as a first test I wanted one of those red-blue-purple charts of how Democratic or Republican each county's vote was.

I used the Basemap package for plotting. It used to be part of matplotlib, but it's been split off into its own toolkit, grouped under mpl_toolkits: on Debian, it's available as python-mpltoolkits.basemap, or you can find Basemap on GitHub.

It's easiest to start with the fillstates.py example that shows how to draw a US map with different states colored differently. You'll need the three shapefiles (because of ESRI's silly shapefile format): st99_d00.dbf, st99_d00.shp and st99_d00.shx, available in the same examples directory.

Of course, to plot counties, you need county shapefiles as well. The US Census has county shapefiles at several different resolutions (I used the 500k version). Then you can plot state and county outlines like this:

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt

def draw_us_map():
    # Set the lower left and upper right limits of the bounding box:
    lllon = -119
    urlon = -64
    lllat = 22.0
    urlat = 50.5
    # and calculate a centerpoint, needed for the projection:
    centerlon = float(lllon + urlon) / 2.0
    centerlat = float(lllat + urlat) / 2.0

    m = Basemap(resolution='i',  # crude, low, intermediate, high, full
                llcrnrlon = lllon, urcrnrlon = urlon,
                lon_0 = centerlon,
                llcrnrlat = lllat, urcrnrlat = urlat,
                lat_0 = centerlat,
                projection='tmerc')

    # Read state boundaries.
    shp_info = m.readshapefile('st99_d00', 'states',
                               drawbounds=True, color='lightgrey')

    # Read county boundaries
    shp_info = m.readshapefile('cb_2015_us_county_500k',
                               'counties',
                               drawbounds=True)

if __name__ == "__main__":
    draw_us_map()
    plt.title('US Counties')
    # Get rid of some of the extraneous whitespace matplotlib loves to use.
    plt.tight_layout(pad=0, w_pad=0, h_pad=0)
    plt.show()
[Simple map of US county borders]

Accessing the state and county data after reading shapefiles

Great. Now that we've plotted all the states and counties, how do we get a list of them, so that when I read out "Santa Clara, CA" from the data I'm trying to plot, I know which map object to color?

After calling readshapefile('st99_d00', 'states'), m has two new members, both lists: m.states and m.states_info.

m.states_info[] is a list of dicts mirroring what was in the shapefile. For the Census state list, the useful keys are NAME, AREA, and PERIMETER. There's also STATE, which is an integer (not restricted to 1 through 50), but I'll get to that.

If you want the shape for, say, California, iterate through m.states_info[] looking for the entry where m.states_info[i]["NAME"] == "California". Note the index i: the corresponding shape coordinates will be in m.states[i] (in Basemap map coordinates, not latitude/longitude).
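
For example, here's a minimal sketch of that lookup (a hypothetical helper, not from the final script; remember that a state can appear more than once because of islands):

    # Collect every shape index belonging to California:
    cal_indices = [i for i, info in enumerate(m.states_info)
                   if info["NAME"] == "California"]
    # m.states[i] for each of those indices holds one polygon's coordinates.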

Correlating states and counties in Census shapefiles

County data is similar, with county names in m.counties_info[i]["NAME"]. Remember that STATE integer? Each county has a STATEFP, m.counties_info[i]["STATEFP"], which matches some state's m.states_info[i]["STATE"].

But doing that search every time would be slow. So right after calling readshapefile for the states, I make a table of states. Empirically, STATE in the state list goes up to 72. Why 72? That turns out to be the FIPS code for Puerto Rico, which is included in the Census shapefile.

    MAXSTATEFP = 73
    states = [None] * MAXSTATEFP
    for state in m.states_info:
        statefp = int(state["STATE"])
        # Many states have multiple entries in m.states (because of islands).
        # Only add it once.
        if not states[statefp]:
            states[statefp] = state["NAME"]

That'll make it easy to look up a county's state name quickly when we're looping through all the counties.

Calculating colors for each county

Time to figure out the colors from the Deleetdk election results CSV file. Reading lines from the CSV file into a dictionary is superficially easy enough:

    import csv   # at the top of the script, with the other imports

    fp = open("tidy_data.csv")
    reader = csv.DictReader(fp)

    # Make a dictionary of all "county, state" and their colors.
    county_colors = {}
    for county in reader:
        # What color is this county?
        pop = float(county["votes"])   # total votes cast in the county
        blue = float(county["results.clintonh"])/pop
        red = float(county["results.trumpd"])/pop
        county_colors["%s, %s" % (county["name"], county["State"])] \
            = (red, 0, blue)

But in practice, that wasn't good enough, because the county names in the Deleetdk data didn't always match the official Census county names.

Fuzzy matches

For instance, the CSV file had no results for Alaska or Puerto Rico, so I had to skip those. Non-ASCII characters were a problem: "Doña Ana" county in the census data was "Dona Ana" in the CSV. I had to strip off " County", " Borough" and similar terms: "St Louis" in the census data was "St. Louis County" in the CSV. Some names were capitalized differently, like PLYMOUTH vs. Plymouth, or Lac Qui Parle vs. Lac qui Parle. And some names were just different, like "Jeff Davis" vs. "Jefferson Davis".
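
Some of those differences could be handled by normalizing the names before comparing. Here's a sketch of the kind of cleanup involved (a hypothetical helper, not the exact code from my script, and it assumes Python 3 strings):

import unicodedata

def normalize_name(name):
    # Decompose accented characters, then drop the combining marks,
    # so "Doña Ana" becomes "Dona Ana":
    name = unicodedata.normalize('NFKD', name)
    name = ''.join(c for c in name if not unicodedata.combining(c))
    # Strip trailing descriptors like " County" or " Borough":
    for suffix in (" County", " Borough", " Parish"):
        if name.endswith(suffix):
            name = name[:-len(suffix)]
    return name.lower()

But normalization alone can't catch everything, like "Jeff Davis" vs. "Jefferson Davis".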

To get around that I used SequenceMatcher to look for fuzzy matches when I couldn't find an exact match:

from difflib import SequenceMatcher

def fuzzy_find(s, slist):
    '''Try to find a fuzzy match for s in slist.
    '''
    best_ratio = -1
    best_match = None

    ls = s.lower()
    for ss in slist:
        r = SequenceMatcher(None, ls, ss.lower()).ratio()
        if r > best_ratio:
            best_ratio = r
            best_match = ss
    if best_ratio > .75:
        return best_match
    return None
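
For instance, a made-up call (not from the final script):

    match = fuzzy_find("Doña Ana, New Mexico", county_colors.keys())

should return the CSV's "Dona Ana, New Mexico" key, since the two strings differ by only one character, giving a ratio well above the 0.75 cutoff.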

Correlate the county names from the two datasets

It's finally time to loop through the counties in the map to color and plot them.

Remember STATE vs. STATEFP? It turns out there are a few counties in the census county shapefile with a STATEFP that doesn't match any STATE in the state shapefile. Mostly they're in the Virgin Islands and I don't have election data for them anyway, so I skipped them for now. I also skipped Puerto Rico and Alaska (no results in the election data) and counties that had no corresponding state: I'll omit that code here, but you can see it in the final script, linked at the end.

    # Assumes the script has already imported Polygon and set up an axes:
    #     from matplotlib.patches import Polygon
    #     ax = plt.gca()    # (or however the figure's axes were created)
    for i, county in enumerate(m.counties_info):
        countyname = county["NAME"]
        try:
            statename = states[int(county["STATEFP"])]
        except IndexError:
            print countyname, "has out-of-index statefp of", county["STATEFP"]
            continue

        countystate = "%s, %s" % (countyname, statename)
        try:
            ccolor = county_colors[countystate]
        except KeyError:
            # No exact match; try for a fuzzy match
            fuzzyname = fuzzy_find(countystate, county_colors.keys())
            if fuzzyname:
                ccolor = county_colors[fuzzyname]
                county_colors[countystate] = ccolor
            else:
                print "No match for", countystate
                continue

        countyseg = m.counties[i]
        poly = Polygon(countyseg, facecolor=ccolor)  # edgecolor="white"
        ax.add_patch(poly)

Moving Hawaii

Finally, although the CSV didn't have results for Alaska, it did have Hawaii. To display it, you can move it when creating the patches:

    countyseg = m.counties[i]
    if statename == 'Hawaii':
        countyseg = [(x + 5750000, y - 1400000) for x, y in countyseg]
    poly = Polygon(countyseg, facecolor=ccolor)
    ax.add_patch(poly)

The offsets are in map coordinates and are empirical; I fiddled with them until Hawaii showed up at a reasonable place.

[Blue-red-purple 2016 election map]

Well, that was a much longer article than I intended. Turns out it takes a fair amount of code to correlate several datasets and turn them into a map. But a lot of the work will be applicable to other datasets.

Full script on GitHub: Blue-red map using Census county shapefile


Thu, 12 Jan 2017

Getting Election Data, and Why Open Data is Important

Back in 2012, I got interested in fiddling around with election data as a way to learn about data analysis in Python. So I went searching for results data on the presidential election. And got a surprise: it wasn't available anywhere in the US. After many hours of searching, the only source I ever found was at the UK newspaper, The Guardian.

Surely in 2016, we're better off, right? But when I went looking, I found otherwise. There's still no official source for US election results data; there isn't even a source as reliable as The Guardian this time.

You might think Data.gov would be the place to go for official election results, but no: searching for "2016 election" on Data.gov yields nothing remotely useful.

The Federal Election Commission has an election results page, but it only goes up to 2014 and only includes the Senate and House, not presidential elections. Archives.gov has popular vote totals for the 2012 election but not the current one. Maybe in four years, they'll have some data.

After striking out on official US government sites, I searched the web. I found a few sources, none of them even remotely official.

Early on I found Simon Rogers, How to Download County-Level Results Data, which leads to GitHub user tonmcg's County Level Election Results 12-16. It's a comparison of Democratic vs. Republican votes in the 2012 and 2016 elections (I assume that means votes for that party's presidential candidate, though the field names don't make that entirely clear), with no information on third-party candidates.

KidPixo's Presidential Election USA 2016 on GitHub is a little better: the fields make it clear that it's recording votes for Trump and Clinton, but still no third party information. It's also scraped from the New York Times, but it includes the scraping code, so you can check it and have some confidence in the source of the data.

Kaggle claims to have election data, but you can't download their datasets or even see what they have without signing up for an account. Ben Hamner has some publicly available Kaggle data on GitHub, but only for the primary. I also found several companies selling election data, and several universities that had datasets available for researchers with accounts at those universities.

The most complete dataset I found, and the only open one that included third party candidates, was through OpenDataSoft. Like the other two, this data is scraped from the NYT. It has data for all the minor party candidates as well as the majors, plus lots of demographic data for each county in the lower 48, plus Hawaii, but not the territories, and the election data for all the Alaska counties is missing.

You can get it either from a GitHub repo, Deleetdk's USA.county.data (look in inst/ext/tidy_data.csv), or, if you want a larger version with geographic shape data included, by clicking through several other OpenDataSoft pages to an export page, USA 2016 Presidential Election by county, where you can download CSV, JSON, GeoJSON and other formats.

The OpenDataSoft data file was pretty useful, though it had gaps (for instance, there's no data for Alaska). I was able to make my own red-blue-purple plot of county voting results (I'll write separately about how to do that with python-basemap), and to play around with statistics.

Implications of the lack of open data

But the point my search really brought home: By the time I finally found a workable dataset, I was so sick of the search, and so relieved to find anything at all, that I'd stopped being picky about where the data came from. I had long since given up on finding anything from a known source, like a government site or even a newspaper, and was just looking for data, any data.

And that's not good. It means that a lot of the people doing statistics on elections are using data from unverified sources, probably copied from someone else who claimed to have scraped it, using unknown code, from some post-election web page that likely no longer exists. Is it accurate? There's no way of knowing.

What if someone wanted to spread misinformation? There's a hunger for data, particularly on something as important as a US Presidential election. Looking at Google's suggested results and "Searches related to" made it clear that it wasn't just me: a lot of people are searching for this information and not finding it through official sources.

If I were a foreign power wanting to spread disinformation, providing easily available data files -- to fill the gap left by the US Government's refusal to do so -- would be a great way to mislead people. I could put anything I wanted in those files: there's no way of checking them against official results, since there are no official results. Just make sure the totals add up to what people expect to see. You could easily set up an official-looking site and put made-up data there, and it would look a lot more real than all the datasets scraped from the NYT.

If our government -- or newspapers, or anyone else -- really wanted to combat "fake news", they should take open data seriously. They should make datasets for important issues like the presidential election publicly available, as soon as possible after the election -- not four years later, when nobody but historians cares any more. Without that, we're leaving ourselves open to fake news and fake data.


Sun, 08 Jan 2017

Snowy Winter Days, and an Elk Visit

[Snowy view of the Rio Grande from Overlook]

The snowy days here have been so pretty, the snow contrasting with the darkness of the piñons and junipers and the black basalt. The light fluffy crystals sparkle in a rainbow of colors when they catch the sunlight at the right angle, but I've been unable to catch that effect in a photo.

We've had some unusual holiday visitors, too, culminating in this morning's visit from a huge bull elk.

[bull elk in the yard] Dave came down to make coffee and saw the elk in the garden right next to the window. But by the time I saw him, he was farther out in the yard. And my DSLR batteries were dead, so I grabbed the point-and-shoot and got what I could through the window.

Fortunately for my photography the elk wasn't going anywhere in any hurry. He has an injured leg, and was limping badly. He slowly made his way down the hill and into the neighbors' yard. I hope he returns. Even with a limp that bad, an elk that size has no predators in White Rock, so as long as he stays off the nearby San Ildefonso reservation (where hunting is allowed) and manages to find enough food, he should be all right. I'm tempted to buy some hay to leave out for him.

[Sunset light on the Sangre de Cristos] Some of the sunsets have been pretty nice, too.

A few more photos.


Using virtualenv to replace the broken pip install --user

Python's installation tool, pip, has some problems on Debian.

The obvious way to use pip is as root: sudo pip install packagename. If you hang out in Python groups at all, you'll quickly find that this is strongly frowned upon. It can lead to your pip-installed packages intermingling with the ones installed by Debian's apt-get, possibly causing problems during apt system updates.

The second most obvious way, as you'll see if you read pip's man page, is pip install --user packagename. This installs the package with only user permissions, not root, under a directory called ~/.local. Python automatically includes ~/.local's site-packages in its module search path, and you can add ~/.local/bin to your PATH, so this makes everything transparent.
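
For example, adding a line like this to your shell startup file puts user-installed commands on your PATH:

export PATH=$HOME/.local/bin:$PATH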

Or so I thought until recently, when I discovered that pip install --user ignores system-installed packages when it's calculating its dependencies, so you could end up with a bunch of incompatible versions of packages installed. Plus it takes forever to re-download and re-install dependencies you already had.

Pip has a clear page describing how pip --user is supposed to work, and that isn't what it's doing. So I filed pip bug 4222; but since pip has 687 open bugs filed against it, I'm not terrifically hopeful of that getting fixed any time soon. So I needed a workaround.

Use virtualenv instead of --user

Fortunately, it turned out that pip install works correctly in a virtualenv if you include the --system-site-packages option. I had thought virtualenvs were for testing, but quite a few people on #python said they used virtualenvs all the time, as part of their normal runtime environments. (Maybe due to pip's deficiencies?) I had heard people speak deprecatingly of --user in favor of virtualenvs but was never clear why; maybe this is why.

So, what I needed was to set up a virtualenv that I can keep around all the time and use by default every time I log in. I called it ~/.pythonenv when I created it:

virtualenv --system-site-packages $HOME/.pythonenv

Normally, the next thing you do after creating a virtualenv is to source a script called bin/activate inside the venv. That sets up your PATH, PYTHONPATH and a bunch of other variables so the venv will be used in all the right ways. But activate also changes your prompt, which I didn't want in my normal runtime environment. So I stuck this in my .zlogin file:

VIRTUAL_ENV_DISABLE_PROMPT=1 source $HOME/.pythonenv/bin/activate

Now I'll activate the venv once, when I log in (and once in every xterm window, since I set XTerm*loginShell: true in my .Xdefaults). I see my normal prompt, I can use the normal Debian-installed Python packages, and I can install additional PyPI packages with pip install packagename (no --user, no sudo).
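
To verify the venv is active, something like this (output paths illustrative) should point at the venv rather than the system Python:

which python
python -c "import sys; print(sys.prefix)"

Both should mention $HOME/.pythonenv.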


Wed, 04 Jan 2017

Firefox "Reader Mode" and NoScript

A couple of days ago I blogged about using Firefox's "Delete Node" to make web pages more readable. In a subsequent Twitter discussion someone pointed out that if the goal is to make a web page's content clearer, Firefox's relatively new "Reader Mode" might be a better way.

I knew about Reader Mode but hadn't used it. It only shows up on some pages, as a little "open book" icon to the right of the URLbar, just left of the Refresh/Stop button. It did show up on the Pogue Yahoo article; but when I clicked it, I just got a big blank page with an icon of a circle with a horizontal dash: no text.

It turns out that to see Reader Mode content while running NoScript, you must explicitly allow JavaScript from about:reader.

There are some reasons it's not automatically whitelisted: see discussions in bug 1158071 and bug 1166455 -- so enable it at your own risk. But it's nice to be able to use Reader Mode, and I'm glad the Twitter discussion spurred me to figure out why it wasn't working.


Mon, 02 Jan 2017

Firefox's "Delete Node" eliminates pesky content-hiding banners

It's trendy among web designers today -- the kind who care more about showing ads than about the people reading their pages -- to use fixed banner elements that hide part of the page. In other words, you have a header, some content, and maybe a footer; and when you scroll the content to get to the next page, the header and footer stay in place, meaning that you can only read the few lines sandwiched in between them. But at least you can see the name of the site no matter how far you scroll down in the article! Wouldn't want to forget the site name!

Worse, many of these sites don't scroll properly. If you Page Down, the content moves a full page up, which means that the top of the new page is now hidden under that fixed banner and you have to scroll back up a few lines to continue reading where you left off. David Pogue wrote about that problem recently and it got a lot of play when Slashdot picked it up: These 18 big websites fail the space-bar scrolling test.

It's a little too bad he concentrated on the spacebar. Certainly it's good to point out that hitting the spacebar scrolls down -- I was flabbergasted to read the Slashdot discussion and discover that lots of people didn't already know that, since it's been my most common way of paging since browsers were invented. (Shift-space does a Page Up.) But the Slashdot discussion then veered off into a chorus of "I've never used the spacebar to scroll so why should anyone else care?", when the issue has nothing to do with the spacebar: the issue is that Page Down doesn't work right, whichever key you use to trigger that page down.

But never mind that. Fixed headers that don't scroll are bad even if the content scrolls the right amount, because it wastes precious vertical screen space on useless cruft you don't need. And I'm here to tell you that you can get rid of those annoying fixed headers, at least in Firefox.

[Article with intrusive Yahoo headers]

Let's take Pogue's article itself, since Yahoo is a perfect example of annoying content that covers the page and doesn't go away. First there's that enormous header -- the bottom row of menus ("Tech Home" and so forth) disappear once you scroll, but the rest stay there forever. Worse, there's that annoying popup on the bottom right ("Privacy | Terms" etc.) which blocks content, and although Yahoo! scrolls the right amount to account for the header, it doesn't account for that privacy bar, which continues to block most of the last line of every page.

The first step is to call up the DOM Inspector. Right-click on the thing you want to get rid of and choose Inspect Element:

[Right-click menu with Inspect Element]


That brings up the DOM Inspector window, which looks like this (click on the image for a full-sized view):

[DOM Inspector]

The upper left area shows the hierarchical structure of the web page.

Don't Panic! You don't have to know HTML or understand any of this for this technique to work.

Hover your mouse over the items in the hierarchy. Notice that as you hover, different parts of the web page are highlighted in translucent blue.

Generally, whatever element you started on will be a small part of the header you're trying to eliminate. Move up one line, to the element's parent; you may see that a bigger part of the header is highlighted. Move up again, and keep moving up, one line at a time, until the whole header is highlighted, as in the screenshot. There's also a dark grey window telling you something about the HTML, if you're interested; if you're not, don't worry about it.

Eventually you'll move up too far, and some other part of the page, or the whole page, will be highlighted. You need to find the element that makes the whole header blue, but nothing else.

Once you've found that element, right-click on it to get a context menu, and look for Delete Node (near the bottom of the menu). Clicking on that will delete the header from the page.

Repeat for any other part of the page you want to remove, like that annoying bar at the bottom right. And you're left with a nice, readable page, which will scroll properly and let you read every line, and will show you more text per page so you don't have to scroll as often.

[Article after removing the Yahoo banners]

It's a useful trick. You can also use Inspect/Delete Node for many of those popups that cover the screen telling you "subscribe to our content!" It's especially handy if you like to browse with NoScript, so you can't dismiss those popups by clicking on an X. So happy reading!

Addendum on Spacebars

By the way, in case you weren't aware that the spacebar did a page down, here's another tip that might come in useful: the spacebar also advances to the next slide in just about every presentation program, from PowerPoint to Libre Office to most PDF viewers. I'm amazed at how often I've seen presenters squinting with a flashlight at the keyboard trying to find the right-arrow or down-arrow or page-down or whatever key they're looking for. These are all ways of advancing to the next slide, but they're all much harder to find than that great big spacebar at the bottom of the keyboard.


Sun, 25 Dec 2016

Photographing Farolitos (and other night scenes)

Excellent Xmas to all! We're having a white Xmas here.

Dave and I have been discussing how "Merry Christmas" isn't alliterative like "Happy Holidays". We had trouble coming up with a good C or K adjective to go with Christmas, but then we hit on the answer: Have an Excellent Xmas! It also has the advantage of inclusivity: not everyone celebrates the birth of Christ, but Xmas is a secular holiday of lights, family and gifts, open to people of all belief systems.

Meanwhile: I spent a couple of nights recently learning how to photograph Xmas lights and farolitos.

Farolitos, a New Mexico Christmas tradition, are paper bags, weighted down with sand, with a candle inside. Sounds modest, but put a row of them alongside a roadway or along the top of a typical New Mexican adobe or faux-dobe and you have a beautiful display of lights.

They're also known as luminarias in southern New Mexico, but Northern New Mexicans insist that a luminaria is a bonfire, and the little paper bag lanterns should be called farolitos. They're pretty, whatever you call them.

Locally, residents of several streets in Los Alamos and White Rock set out farolitos along their roadsides for a few nights around Christmas, and the county cooperates by turning off streetlights on those streets. The display on Los Pueblos in Los Alamos is a zoo, a slow exhaust-choked parade of cars that reminds me of the Griffith Park light show in LA. But here in White Rock the farolito displays are a lot less crowded, and this year I wanted to try photographing them.

Canon bugs affecting night photography

I have a little past experience with night photography. I went through a brief astrophotography phase in my teens (in the pre-digital era, so I was using film and occasionally glass plates). But I haven't done much night photography for years.

That's partly because I've had problems taking night shots with my current digital SLR camera, a Rebel XSi (known outside the US as a Canon 450D). It's old and modest as DSLRs go, but I've resisted upgrading since I don't really need more features.

Except maybe when it comes to night photography. I've tried shooting star trails, lightning shots and other nocturnal time exposures, and keep hitting a snag: the camera refuses to take a photo. I'll be in Manual mode, with my aperture and shutter speed set, with the lens in Manual Focus mode with Image Stabilization turned off. Plug in the remote shutter release, push the button ... and nothing happens except a lot of motorized lens whirring noises. Which shouldn't be happening -- in MF and non-IS mode the lens should be just sitting there inert, not whirring its motors. I couldn't seem to find a way to convince it that the MF switch meant that, yes, I wanted to focus manually.

It seemed to be primarily a problem with the EF-S 18-55mm kit lens; the camera will usually condescend to take a night photo with my other two lenses. I wondered if the MF switch might be broken, but then I noticed that in some modes the camera explicitly told me I was in manual focus mode.

I was almost to the point of ordering another lens just for night shots when I finally hit upon the right search terms and found, if not the reason it's happening, at least an excellent workaround.

Back Button Focus

I'm so sad that I went so many years without knowing about Back Button Focus. It's well hidden in the menus, under Custom Functions #10.

Normally, the shutter button does a bunch of things. When you press it halfway, the camera both autofocuses (sadly, even in manual focus mode) and calculates exposure settings.

But there's a custom function that lets you separate the focus and exposure calculations. In the Custom Functions menu option #10 (the number and exact text will be different on different Canon models, but apparently most or all Canon DSLRs have this somewhere), the heading says: Shutter/AE Lock Button. Following that is a list of four obscure-looking options, roughly (quoting from memory, so the exact wording may vary):

    AF/AE lock
    AE lock/AF
    AF/AF lock, no AE lock
    AE/AF, no AE lock

The text before the slash indicates what the shutter button, pressed halfway, will do in that mode; the text after the slash is what happens when you press the * or AE lock button on the upper right of the camera back (the same button you use to zoom out when reviewing pictures on the LCD screen).

The first option is the default: press the shutter button halfway to activate autofocus; the AE lock button calculates and locks exposure settings.

The second option is the revelation: pressing the shutter button halfway will calculate exposure settings, but does nothing for focus. To focus, press the * or AE button, after which focus will be locked. Pressing the shutter button won't refocus. This mode is called "Back button focus" all over the web, but not in the manual.

Back button focus is useful in all sorts of cases. For instance, if you want to autofocus once then keep the same focus for subsequent shots, it gives you a way of doing that. It also solves my night focus problem: even with the bug (whether it's in the lens or the camera) that the lens tries to autofocus even in manual focus mode, in this mode, pressing the shutter won't trigger that. The camera assumes it's in focus and goes ahead and takes the picture.

Incidentally, the other two modes in that menu apply to AI SERVO mode when you're letting the focus change constantly as it follows a moving subject. The third mode makes the * button lock focus and stop adjusting it; the fourth lets you toggle focus-adjusting on and off.

Live View Focusing

There's one other thing that's crucial for night shots: live view focusing. Since you can't use autofocus in low light, you have to do the focusing yourself. But most DSLRs' focusing screens aren't good enough that you can look through the viewfinder and get a reliable focus on a star or even a string of holiday lights or farolitos.

Instead, press the SET button (the one in the middle of the right/left/up/down buttons) to activate Live View (you may have to enable it in the menus first). The mirror locks up and a preview of what the camera is seeing appears on the LCD. Use the zoom button (the one to the right of that */AE lock button) to zoom in; there are two levels of zoom in addition to the un-zoomed view. You can use the right/left/up/down buttons to control which part of the field the zoomed view will show. Zoom all the way in (two clicks of the + button) to fine-tune your manual focus. Press SET again to exit live view.

It's not as good as a fine-grained focusing screen, but at least it gets you close. Consider using relatively small apertures, like f/8, since that will give you more latitude for focus errors. You'll be doing time exposures on a tripod anyway, so a narrow aperture just means your exposures have to be a little longer than they otherwise would have been.

After all that, my Xmas Eve farolitos photos turned out mediocre. We had a storm blowing in, so a lot of the candles had blown out. (In the photo below you can see how the light string on the left is blurred, because the tree was blowing around so much during the 30-second exposure.) But I had fun, and maybe I'll go out and try again tonight.


An excellent Xmas to you all!


Thu, 22 Dec 2016

Tips on Developing Python Projects for PyPI

I wrote two recent articles on Python packaging: Distributing Python Packages Part I: Creating a Python Package and Distributing Python Packages Part II: Submitting to PyPI. I was able to get a couple of my programs packaged and submitted.

Ongoing Development and Testing

But then I realized all was not quite right. I could install new releases of my package -- but I couldn't run it from the source directory any more. How could I test changes without needing to rebuild the package for every little change I made?

Fortunately, it turned out to be fairly easy. Set PYTHONPATH to a directory that includes all the modules you normally want to test. For example, inside my bin directory I have a python directory where I can symlink any development modules I might need:

mkdir ~/bin/python
ln -s ~/src/metapho/metapho ~/bin/python/

Then add the directory at the beginning of PYTHONPATH:

export PYTHONPATH=$HOME/bin/python

With that, I could test from the development directory again, without needing to rebuild and install a package every time.
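
To double-check which copy of a module Python picks up (using the metapho symlink from the example above), something like:

python -c "import metapho; print(metapho.__file__)"

should print a path under ~/bin/python rather than the system-wide location.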

Cleaning up files used in building

Building a package leaves some extra files and directories around, and git status will whine at you since they're not version controlled. Of course, you could gitignore them, but it's better to clean them up after you no longer need them.

To do that, you can add a clean command to setup.py.

import os
from setuptools import Command

class CleanCommand(Command):
    """Custom clean command to tidy up the project root."""
    user_options = []
    def initialize_options(self):
        pass
    def finalize_options(self):
        pass
    def run(self):
        os.system('rm -vrf ./build ./dist ./*.pyc ./*.tgz ./*.egg-info ./docs/sphinxdoc/_build')

(Obviously, that includes file types beyond what you need for just cleaning up after package building. Adjust the list as needed.)

Then in the setup() function, add these lines:

      cmdclass={
          'clean': CleanCommand,
      }

Now you can type

python setup.py clean
and it will remove all the extra files.

Keeping version strings in sync

It's so easy to update the __version__ string in your module and forget that you also have to do it in setup.py, or vice versa. Much better to make sure they're always in sync.

I found several versions of that using system("grep..."), but I decided to write my own that doesn't depend on system(). (Yes, I should do the same thing with that CleanCommand, I know.)

def get_version():
    '''Read the pytopo module version from pytopo/__init__.py'''
    with open("pytopo/__init__.py") as fp:
        for line in fp:
            line = line.strip()
            if line.startswith("__version__"):
                parts = line.split("=")
                if len(parts) > 1:
                    # Strip whitespace plus the quotes around the string:
                    return parts[1].strip().strip('"').strip("'")

Then in setup():

      version=get_version(),

Much better! Now you only have to update __version__ inside your module and setup.py will automatically use it.

Using your README for a package long description

setup() takes a long_description argument for the package, but you probably already have some sort of README in your package. You can use it for your long description this way:

import os

# Utility function to read the README file.
# Used for the long_description.
def read(fname):
    return open(os.path.join(os.path.dirname(__file__), fname)).read()

Then in setup():

      long_description=read('README'),
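
Putting it all together, the skeleton of a setup.py using all three tricks might look something like this (a sketch with hypothetical metadata; get_version(), read() and CleanCommand are the pieces defined above):

from setuptools import setup

setup(
    name="pytopo",                       # hypothetical metadata
    description="A map viewer",          # hypothetical
    version=get_version(),
    long_description=read('README'),
    cmdclass={'clean': CleanCommand},
)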
