Shallow Thoughts

Akkana's Musings on Open Source, Science, and Nature.

Sat, 06 Feb 2010

Making "Citizen Science" compelling

I had the opportunity to participate in a focus group on NASA's new "citizen science" project, called Moon Zoo, with a bunch of fellow lunatics: amateur astronomers and lunar enthusiasts.

Moon Zoo sounds really interesting. Ordinary people will analyze high-resolution photos of the lunar surface: find out how many boulders and craters are there. I hope it will also include more details like crater type and size, rilles and so forth, though that wasn't mentioned. These are all tasks that are easy for a human and hard for a computer: perfect for crowdsourcing. Think Galaxy Zoo for the moon. The resulting data will be used for planning future lunar missions as well as for general lunar science.

It sounds like a great project and I'm excited about it. But I'm not going to write about Moon Zoo today -- it doesn't exist yet (current estimate is mid-March), though there is a preliminary PDF. Instead, I want to talk about some of the great ideas that came out of the focus group.

The primary question: How do we get people -- both amateur astronomers and the general public, people of all ages -- interested in contributing to a citizen science project like Moon Zoo?

Here are some of the key ideas:

Make the data public

This was the most important point, echoed by a lot of participants. Some people felt that many of the existing "citizen science" projects convey the attitude "We want something from you, but we're not going to give you anything in return." If you use crowdsourcing to create a dataset, make it available to the crowd.

Opening the data has a lot of advantages:

Projects like Wikipedia and Open Street Map, as well as Linux and the rest of the open source movement, show how much an open data model can inspire contributions.

Give credit to individuals and teams

People cited the example of SETI@Home, where teams of contributors can compete to see who's contributed the most. Show rankings for both individuals and groups, so they can track their progress and maybe get a bit competitive with other groups. Highlight groups and individuals who contribute a lot -- maybe even make it a formal competition and offer inexpensive prizes like T-shirts or mugs.

A teenaged panel member had the great suggestion of making buttons that said "I'm a Moon Zookeeper." Little rewards like that don't cost much but can really motivate people.

Offer an offline version

They wanted to hear ideas for publicizing Moon Zoo to groups like our local astronomy clubs.

I mentioned that I've often wanted to spread the word about Galaxy Zoo, but it's entirely a web-based application and when I give talks to clubs or school groups, web access is never an option. (Ironically, the person leading the focus group had planned to demonstrate Galaxy Zoo to us but couldn't get connected to the wi-fi at the Lawrence Hall of Science.)

Projects are so much easier to evangelize if you can download an offline demo.

And not just a demo, either. There should be a way to download a real version, including a small data set. Imagine if you could grab a Moon Zoo pack and do a little classifying whenever you got a few spare minutes -- on the airplane or train, or in a hotel room while traveling.

Important note: this does not mean you should write a separate Windows app for people to download. Keep it HTML, Javascript and cross platform so everyone can run it. Then let people download a local copy of the same web app they run on your site.

Make sure it works on phones and game consoles

Lots of people use smartphones more than they use a desktop computer these days. Make sure the app runs on all the popular smartphones. And lots of kids have access to handheld web-enabled game consoles: you can reach a whole new set of kids by supporting these platforms.

Offer levels of accomplishment, like a game

Lots of people are competitive by nature, and like to feel they're getting better at what they're doing. Play to that: let users advance as they get more experienced, and give them the option of doing harder projects. "I'm up to level 7 in Moon Zoo!"

Use social networking

Facebook. Twitter. Nuff said.

Don't keep results a secret

Quite a few scientific publications have arisen out of Galaxy Zoo -- yet although most of us were familiar with Galaxy Zoo, few of us knew that. Why so secretive? They should be trumpeting achievements like that.

How many times have you volunteered for a survey or study, then wondered for years afterward how the results came out? Researchers never contact the volunteers when the paper is finally published. It's frustrating and demotivating; it makes you not want to volunteer again. Lots of us sign up because we're curious about the science -- but that means we're also curious about the results.

With citizen science projects, this is particularly easy. Set up a mailing list or forum (or both) to discuss results and announce when papers are published. Set up a Twitter account and a Facebook group to announce new papers to anyone who wants to follow. This is the age of Web 2.0, folks -- there's no excuse for not communicating.

I don't know if NASA will listen to our ideas. But I hope they do. Moon Zoo promises to be a terrific project ... and the more of these principles they follow, the more dedicated volunteers they'll get, and that will make the project even better.

[ 19:25 Feb 06, 2010    More science/astro | permalink to this entry ]

Tue, 02 Feb 2010

Configuring git colors

I spent a morning wrestling with git after writing a minor GIMP fix that I wanted to check in. Deceptively simple ideas, like "Check the git log to see the expected format of check-in messages", turned out to be easier said than done.

Part of the problem was git's default colors: colors calculated to be invisible to anyone using a terminal with dark text on a light background. And that sent me down the perilous path of git configuration.

git-config does have a manual page. But it lacks detail: you can't get from there to knowing what to change so that the first line of commits in git log doesn't show up yellow.

But that's okay, thought I: all I need to do is list the default settings, then change anything that's a light color like yellow to a darker color. Easy, right?

Well, no. It turns out there's no way to get the default settings -- because they aren't part of git's config; they're hardwired into the C code.

But you can find most of them with a search for GIT_COLOR in the source. The most useful lines are the ones in diff.c, builtin-branch.c and wt-status.c.

gitconfig

The next step is to translate those C lines to git preferences, something you can put in a .gitconfig. Here's a list of all the colors mentioned in the man page, and their default values -- I used "normal" for grep and interactive where I wasn't sure of the defaults.

[color "diff"]
	plain = normal
	meta = bold
	frag = cyan
	old = red
	new = green
	commit = yellow
	whitespace = normal red
[color "branch"]
	current = green
	local = normal
	remote = red
	plain = normal
[color "status"]
	header = normal
	added = green
	updated = green
	changed = red
	untracked = red
	nobranch = red
[color "grep"]
	match = normal
[color "interactive"]
	prompt = normal
	header = normal
	help = normal
	error = normal

The syntax and colors are fairly clearly explained in the manual: allowable colors are normal, black, red, green, yellow, blue, magenta, cyan and white. After the foreground color, you can optionally list a background color. You can also list an attribute, chosen from bold, dim, ul, blink and reverse -- only one at a time, no combining of attributes.

So if you really wanted to, you could say something like

[color "status"]
	header = normal blink
	added = magenta yellow
	updated = green reverse
	changed = red bold
	untracked = blue white
	nobranch = red white bold

Minimal changes for light backgrounds

What's the minimum you need to get everything readable? On the light grey background I use, I needed to change the yellow, cyan and green entries:

[color "diff"]
	frag = cyan
	new = green
	commit = yellow
[color "branch"]
	current = green
[color "status"]
	updated = green

Disclaimer: I haven't tested all these settings -- because I haven't yet figured out where all of them apply. That's another area where the manual is a bit short on detail ...

[ 22:26 Feb 02, 2010    More programming | permalink to this entry ]

Thu, 28 Jan 2010

On Linux Planet: a simple Poker game in Python-Qt

[Poker game in py-qt] I've written in the past about Python GUI programming using the GTK and Tk toolkits, and several KDE fans felt that I was slighting the much nicer looking Qt.

So my latest article on Linux Planet, Make Pretty GUI Apps Fast with Python-Qt, shows how to develop a little poker game using the python-qt toolkit.

I didn't want to dwell on it in the article (and didn't have space anyway), but PyQt turned out to be a bit of a pain. There's no official documentation -- or at least nothing that's obviously official -- and a lot of the examples on Google are out of date because of API changes. None of the tutorial examples explain much, and they never demonstrate the practical features I'd want to use in a real app. It was surprisingly hard to come up with an application idea that worked well, looked good and was still easy to explain.

And don't get me started on this whole "Slots and signals are revolutionarily different even though they look just like the callbacks every other toolkit has used for the last three decades" meme. I'm sure there is a subtle technical difference -- but if there's a difference that matters to the average UI programmer, their documentation sure doesn't make it clear.

All that aside, PyQt (and Qt in general) does produce very pretty apps and is worth trying for that reason.

[spade] [diamond] [club] [heart]

The suit images in the article are adapted from some suits I found on Wikimedia Commons (the "Naipe" set). I wanted them to look more 3-dimensional, so I applied my blobipy GIMP script as well as scaling and resizing them. I really liked those shiny-looking Tango heart and spade emblems (also on the Wikimedia Commons page) but I couldn't find a diamond or club to match.

The poker program I wrote has menus and a second round of dealing, where you can mark off the cards you want to keep. I couldn't fit all that in a 700-word article, but the complete program is available here: qpoker.py or you can get it in a tarball along with the suit images at qpoker.tar.gz.

[ 09:53 Jan 28, 2010    More programming | permalink to this entry ]

Mon, 25 Jan 2010

How to print text files in Ubuntu Karmic

Ever since I upgraded to Ubuntu 9.10 "Karmic Koala", printing text files has been a problem. They print out with normal line height, but in a super-wide font so I only get about 48 ugly characters per line.

Various people have reported the problem -- for instance, bug 447961 and this post -- but no one seemed to have an answer.

I don't have an answer either, but I do have a workaround. The problem is that Ubuntu is scaling incorrectly. When it thinks it's putting 10 characters per inch (cpi) on a line, it's actually using a font that only fits 6 characters. But if you tell it to fit 17 characters per inch, that comes out pretty close to the 10cpi that's supposed to be the default:

lpr -o cpi=17 filename

As long as you have to specify the cpi, try different settings for it. cpi=20 gives a nice crisp looking font with about 11.8 characters per inch. If needed, you can adjust line spacing with lpi=NN as well.

Update: The ever-vigilant Till Kamppeter has tracked the problem down to the font used by texttopdf for lp/lpr printing. Interesting details in bug 447961.

[ 15:36 Jan 25, 2010    More linux | permalink to this entry ]

Wed, 20 Jan 2010

Newt nookie!

[Newt nookie at Lake Ranch] Last weekend, on a tip posted on a local birding list, we hiked up to the little pond at Lake Ranch, above Sanborn County Park, where a major California newt orgy is in progress.

There were thousands of newts throughout the lake, but especially by the dam, where they were mating and laying eggs.

I had never realized how much the male newts' appearance differs from the females' -- or possibly, it doesn't except at this time of year. Most of the year, when we see newts they look like these females, with orange-red skin and lizard-like feet. But here the males look very different: larger, darker, often patterned with stripes or spots, with huge flipper-like feet and greatly flattened tails.

Most of the females were gravid with eggs already. The males seem to be able to tell when a female has already been fertilized, but only from up close: they'll pursue a female to a few inches away, then turn back if she's recently mated.

We saw some multi-newt orgies, with two or three males nosing each other to get access to a female; but mostly we saw pairs clasped in long-lasting embraces. We watched a few pairs for five or ten minutes.
[California newt laying her egg sac]

Some of the females laid their grape-sized egg sacs near where they mated, by the dam; but upstream, closer to the Black Rd end of the pond, we found a nursery where the pond floor was just covered with egg sacs. Is it safer for the eggs here, away from the newt festivities? Or is the temperature or oxygen content different?

Photos are a bit challenging. There's a lot of reflection off the surface of the water. The raw photos are just a sea of murky green, but a little contrast boosting in GIMP, and sometimes a bit of layer mode/layer mask work, brings out a lot more detail than I expected.

There were a few frogs singing, too. We couldn't see the frogs, but we did see a few schools of what might have been tadpoles (or else tiny fish). We also saw one huge tadpole, with a head like a squashed ping-pong ball. I hope the bullfrogs from Walden West pond haven't migrated up to Lake Ranch. It's fun to watch them at Walden West, but bullfrogs could wreak havoc on the pond's other wildlife. (Can bullfrogs eat newts? Most animals can't -- newts have poisonous skins. But we've never seen any newts at Walden West.)

If you go to see the newts, watch your step on the trails. After egg-laying, the females apparently leave the pond and go wandering cross-country. (Where do the males go?) We saw at least three females heading down the steep trail toward Sanborn, and a couple more on the flat trail above the lake that heads toward Black Rd. They move slowly and purposefully, and can't scurry out of your way to keep from getting stepped on. So be careful, and enjoy the show!

Newt nookie photos here.

[ 11:06 Jan 20, 2010    More nature | permalink to this entry ]

Sun, 17 Jan 2010

Displaying images from Javascript file inputs

(despite Firefox's attempts to prevent that)

My Linux Planet article last week was on printing pretty calendars. But I hit one bug in Photo Calendar. It had an HTML file chooser for picking an image ... and when I chose an image and clicked Select to use it, it got the pathname wrong every time.

I poked into the code (Photo Calendar's code turned out to be exceptionally clean and well documented) and found that it was expecting to get the pathname from the file input element's value attribute. But input.File.value was just returning the filename, foo.jpg, instead of the full pathname, /home/user/Images/yosemite/foo.jpg. So when the app tried to make it into a file:/// URL, it ended up pointing to the wrong place.

It turned out the cause was a security change in Firefox 3. The issue: it's considered a security hole to expose full pathnames on your computer to Javascript code coming from someone else's server. The Javascript could give bad guys access to information about the directory structures on your disk. That's a perfectly reasonable concern, and it makes sense to consider it as a security hole.

The problem is that this happens even when you're running a local app on your local disk. Programs written in any other language and toolkit -- a Python program using pygtk, say, or a C++ Qt program -- have access to the directories on your disk, but you can't use Javascript inside Firefox to do the same thing. The only way to make an exception seems to be an elaborate procedure requiring the user to change settings in about:config. Not too helpful.

Perhaps this is even reasonable, given how common cross-site scripting bugs have been in browsers lately -- maybe running a local script really is a security risk if you have other tabs active. But it leaves us with the problem of what to do about apps that need to do things like choose a local image file, then display it.

And it turns out there is a way: a data URL. Take the entire contents of the file (ouch) and create a URL out of those contents, then set the src attribute of the image to that.

Of course, that makes for a long, horrifying, unreadable URL -- but the user never has to see that part. I suspect it's also horribly memory intensive -- the image has to be loaded into memory anyway, to display it, but is Firefox also translating all of that to a URL-legal syntax? Obviously, any real app using this technique had better keep an eye on memory consumption. But meanwhile, it fixes Photo Calendar's file button.
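To make the idea concrete, here is a rough sketch of what "a URL made out of the file's contents" looks like. The function names are made up for illustration, and the sketch assumes base64 encoding, which is the usual way to pack binary data into a data URL. Under that assumption the URL is roughly a third larger than the file itself, since base64 emits four characters for every three bytes.

```javascript
// Sketch: build a data: URL from raw bytes.
// bytesToDataURL and base64Length are illustrative names, not part of
// Photo Calendar or of any browser API.
function bytesToDataURL(bytes, mimeType) {
  // btoa() wants a "binary string": one character per byte.
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return "data:" + mimeType + ";base64," + btoa(binary);
}

// Base64 emits 4 characters for every 3 input bytes, rounded up.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// The two bytes of "Hi" as a text/plain data URL:
console.log(bytesToDataURL(new Uint8Array([72, 105]), "text/plain"));
// prints "data:text/plain;base64,SGk="

// Encoded size of a 3 MB photo, in characters:
console.log(base64Length(3 * 1024 * 1024));
// prints 4194304
```

So a 3 MB image turns into a URL of about 4 million characters, which gives a feel for the memory cost.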

Here's what the code looks like:

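  // Assumes the page has <img id="pic"> and a form named "input"
  // containing a file input named "File".
  var img = document.getElementById("pic");
  var fileinput = document.input.File;
  if (img && fileinput && fileinput.files.length > 0)
    img.src = fileinput.files[0].getAsDataURL();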

Here's a working minimal demo of using getAsDataURL() with a file input.
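One caveat: getAsDataURL() was a Firefox-specific method, and later Firefox releases dropped it. The standard FileReader API does the same job, asynchronously. Here is a minimal sketch of that route, not Photo Calendar's actual code; showChosenImage and the "picker" element id are assumptions made for the example.

```javascript
// Sketch: the same idea using the standard FileReader API.
// showChosenImage is an illustrative name; fileInput is an
// <input type="file"> element and img is an <img> element.
function showChosenImage(fileInput, img) {
  var file = fileInput.files && fileInput.files[0];
  if (!file) {
    return false;                 // nothing selected yet
  }
  var reader = new FileReader();
  // readAsDataURL() is asynchronous: the data URL is available in
  // reader.result once the load event fires.
  reader.onload = function () {
    img.src = reader.result;
  };
  reader.readAsDataURL(file);
  return true;
}

// In a page, this could be wired to the file input's change event.
// The element ids "picker" and "pic" are assumptions for this sketch.
if (typeof document !== "undefined") {
  var picker = document.getElementById("picker");
  if (picker) {
    picker.addEventListener("change", function () {
      showChosenImage(picker, document.getElementById("pic"));
    });
  }
}
```

Because the read is asynchronous, the image appears a moment after the file is chosen rather than on the same line of code that asked for it.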

[ 13:57 Jan 17, 2010    More programming | permalink to this entry ]

Thu, 14 Jan 2010

Print Beautiful Custom Calendars in Linux With Photo Calendar

Didn't get the calendar you wanted for Christmas this year? Print your own, with your choice of photos and holidays. My Linux Planet Photo Calendar article shows how.

[ 16:53 Jan 14, 2010    More writing | permalink to this entry ]

Wed, 13 Jan 2010

Taming Emacs' text mode wrapping and indenting

To wrap long lines, or not to wrap? It's always a dilemma. Automatic wrapping is great when you're hammering away typing lots of text. But it's infuriating when you're trying to format something yourself and the editor decides it wants to line-wrap a little too early.

Although of course you can set the wrapping width, Emacs has a tendency to wrap early -- especially when you hit return. All too often, I'll be typing away at a long line, get to the end of the sentence and paragraph with the last word on the same line with the rest -- then realize that as soon as I hit return, Emacs is going to move that last word to a line by itself. Drives me nuts!

And the solution turns out to be so simple. The Return key, "\C-m", was bound to the (newline) function (you can find out by typing M-x, then describe-key, then hitting Return). Apparently (newline) re-wraps the current line before inserting a line break. But I just wanted it to insert a line break.

No problem -- just bind "\C-m" to (insert "\n").

But there's a second way, too, if you don't want to rebind: there's a magic internal emacs table you can change.

;; 10 is the newline character; nil stops it from triggering auto-fill.
(set-char-table-range auto-fill-chars 10 nil)

But wait -- there's one other thing I want to fix in text mode.

Automatic indent is another one of those features that's very convenient ... except when it's not.

If I have some text like:

First point:
  - subpoint a
  - subpoint b

then it's handy if, when I hit Return after subpoint a, emacs indents to the right level for subpoint b. But what happens when I get to the end of that list?

First point:
  - subpoint a
  - subpoint b

Second point:
  - subpoint c

When I hit Return after subpoint b, Emacs quite reasonably indents two spaces. If I immediately type another Return, Emacs sensibly deletes the two spaces it just inserted, opens a new line -- but then it indents that new line another two spaces.

After a blank line, I always want to start at the beginning, not indented at all.

Here's how to fix that. Define a function that will be called whenever you hit return in text mode. That function tests whether the caret comes immediately after a blank line, or at the beginning of the file. It indents except in those two cases; and in neither case does it re-wrap the current line.

;; In text mode, I don't want it auto-indenting for the first
;; line in the file, or lines following blank lines.
;; Everywhere else is okay.
(defun newline-and-text-indent ()
  "Insert a newline, then indent the next line sensibly for text"
  (interactive)
  (cond
   ;; Beginning of buffer, or beginning of an existing line, don't indent:
   ((or (bobp) (bolp)) (newline))

   ;; If we're on a whitespace-only line,
   ((and (eolp)
         (save-excursion (re-search-backward "^\\(\\s \\)*$"
                                             (line-beginning-position) t)))
    ;; ... delete the whitespace (without touching the kill ring),
    ;; then add another newline:
    (delete-region (line-beginning-position) (point))
    (newline))

   ;; Else (not on whitespace-only) insert a newline,
   ;; then add the appropriate indent:
   (t (insert "\n")
      (indent-according-to-mode))
   ))

Then tell emacs to call that function when it sees the Return key in text mode:

(defun text-indent-hook ()
  (local-set-key "\C-m" 'newline-and-text-indent))
;; add-hook keeps any functions already on the hook; setq would replace them.
(add-hook 'text-mode-hook 'text-indent-hook)

Finally, this is great for HTML mode too, if you get irritated at not being able to put an <a href="longurl"> all on one line:

(defun html-hook ()
  (local-set-key "\C-m" (lambda () (interactive) (insert "\n"))))
;; html-mode is derived from sgml-mode, so this hook runs there too.
(add-hook 'sgml-mode-hook 'html-hook)

[ 10:29 Jan 13, 2010    More linux/editors | permalink to this entry ]