Shallow Thoughts : : Jan
Akkana's Musings on Open Source Computing and Technology, Science, and Nature.
Fri, 27 Jan 2012
When SCALE approved my talk proposal,
with Linux and Devices, I had a challenge: I needed some good,
visual Arduino demos that would work in front of an audience.
In particular, I wanted something that moved. A little toy truck?
A moving penguin? A rotating sunflower? I fiddled with this and that,
not fully satisfied with anything. And then suddenly I realized what I needed.
Something cool. Something BIG.
Something I'd been wanting an excuse to buy anyway.
An Air Swimmers Shark.
I'd seen these things on video, but never in person. They're available
all over, even on Amazon, so I put in an order there and got a shark
in a few days.
These things are ridiculous and cool. It's huge, about 5 feet long,
and filled with helium. It takes maybe half an hour to assemble.
It has a small motor to beat the tail, an infrared transmitter,
and a weighted receiver that moves back and forth on a track to tilt
the fish up or down as it swims.
Once it's assembled, you can get it filled with helium at a party
store (which costs $3 to $6 depending on where you go).
Once the shark is filled, you add clay as ballast until the shark is
neutrally buoyant, neither rising nor sinking. It's quite sensitive:
you'll find yourself needing to add or remove pea-sized chunks of clay
as the temperature in the room changes, but being a little over- or
under-ballasted doesn't hurt it much.
With its tail beating, the shark really does look like it's swimming
through the air.
My shark is named Bruce, after the mechanical shark used for the movie
"Jaws". My Bruce, I'm happy to say, has been much more tractable than
his famously intemperate namesake.
Okay, now how do we turn this ridiculous-but-cool thing into an Arduino
Hacking the transmitter
There were two possible approaches. First, mount an Arduino directly
on the shark, and let it be totally self-directed. Second, patch the
Arduino into the shark's transmitter and control it from Linux.
I chose the second option, for several reasons. First, I was fairly
sure it would be easier, and less invasive (the shark would still be
usable with manual control). I also liked the idea of keeping the
transmitter as a manual override, in case my control program didn't
work right. Finally, I liked the idea of keeping a Linux machine in the
loop -- the shark would actually be controlled by Linux, not just by
So the first thing I did was take the transmitter apart (4 Philips screws).
Inside are 4 pushbuttons, for right, left, up, and down, and the circuit
board is nice and simple. Whew -- this might be doable!
Four more screws and I had access to the back of the board, which was
equally simple. Now I could get my voltmeter on the contacts while I
It turned out the contacts (indicated with arrows on the photo) on the
downstream side of each switch were normally high (4.5 volts -- the transmitter
uses 3 AAA batteries). When I pushed the button, the contact went to ground.
Okay, so what I needed was some way for the Arduino to ground those
contacts at will.
First I needed to solder some wires to the contacts.
(How do you tell which side of the pushbutton is the one you need to solder?
Well, one side changes voltage when you press the button, and the other
side stays constant. The one that changes is the one you need to connect
to the Arduino, so the Arduino can change it too.)
I figured I needed 6 wires: ground, power, and one for each switch.
(It turned out I didn't need the power wire, but I figured it didn't
hurt to include it just in case.)
I wanted to have a nice small connector on the side of the transmitter,
but I couldn't find any 6-pin connectors that didn't look big and bulky,
so I gave up and decided I'd just let my ribbon cable dangle from the
transmitter. If I got a nice multi-colored one, maybe it would look festive.
I couldn't find any 6-conductor ribbon cable, so I got a
wider one and separated 6 wires from the rest. Then I soldered the
six wires to the appropriate places (marked by arrows in the photo).
On the other end, I tinned the six wires with solder so I could plug
the stranded wires into my breadboard.
Simulating button presses
I've done enough reading to know of three ways to simulate a button press.
You can put a relay between the two contacts of the switch; you can do
the same thing, but with an optocoupler (opto-isolator) instead of a relay;
or you can do some magic with a transistor. I was fuzzy on the
transistor magic bit, so a relay sounded easiest.
I played around with a relay and a spare switch and convinced myself I
knew how to wire them up. Then it was off to my local parts store to
buy four matched relays small enough to fit on my little mini breadboard.
There followed a several-day fiasco wherein I bought lots of relays
that turned out not to be suitable, and got increasingly frustrated at
how large and clunky all the available relays were.
There are smaller ones, but I couldn't get them to work.
And I learned that relays mostly come without documentation on which
pin does which, so there's a lot of experimenting with each new type.
Frustrated, I tried some optocouplers I'd bought on a whim last year.
No dice ... couldn't get them to work either.
Desperate, I turned to IRC, #arduino on Freenode. The folks there are
mostly electronics wizards, and I'm sure my questions must have seemed
very dumb, but they were patient with me, and pointed me toward
a very simple circuit,
that was just what I needed. (They also suggested
collector article, but I found that less clear.)
It took me some
experimenting with a transistor, an LED and a couple of resistors
(I blew out a couple of transistors before I realized I had the R2 resistor
in the wrong place) but eventually I got it working, and felt confident
enough to try it with the real shark transmitter. The key was to simplify
the circuit so it had no extra parts, then once it was working, add more
parts to build it up to what I needed.
At left, the circuit I ended up with. For each button, I have one
transistor and one resistor (I don't need the second resistor from the
LED4dummies circuit, since that was just to keep the LED from burning out).
At right is the circuit assembled on a mini-breadboard on top of the
Note that the ends of the ribbon cable are plugged in to a spare header I had
lying around; a header makes a passable connector, so I can plug it in
fairly easily right before a talk. The green and blue wires in the back
are going to Arduino digital output pins 3 through 6 (leaving 0 and 1
for serial I/O). The red wires go from the transistors back to the
ribbon cable wires that go to the shark's transmitter buttons.
The software side
Now I could make the shark think I'd pressed a button on its transmitter.
How do I control that from Linux?
On the Arduino side, I wrote a simple program that reads and parses
commands coming over the USB cable. So from the computer, I might send
a line like
L 300, and the Arduino would "press" the Left
button for 300 milliseconds. I had already written something like this
for a couple of other Arduino programs. That program is
On the Linux side, first I needed something that established a serial
connection and sent commands to the Arduino. I wrote a Python class
That let me send commands from the Python console to test the shark.
Then I needed something more visual, something I could show during a talk.
In particular, the shark doesn't swim unless someone's pressing left,
right, left, right buttons over and over. Of course I wanted the computer
to handle that part.
So I wrote a little Python-GTK application that keeps the shark swimming,
and lets me drag a mouse around to adjust its left/right up/down direction:
Purely by coincidence, the week before SCALE, Scott Adams introduced a
Dilbert, Jan 11 2012.
Nice timing for the debut of my own roboshark!
Sadly, I don't have any photos or video of the shark in action.
But if you're a LWN subscriber, there's an article on my talk
with a couple of great pictures:
Robots rampage (in a
friendly way) at SCALE 10X.
And you can see my slides and notes at
[ 14:02 Jan 27, 2012
More hardware |
permalink to this entry |
Tue, 17 Jan 2012
I've long wanted a way of converting my HTML presentation slides to PDF.
Mostly because conference organizers tend to freak out at slides in any
format other than Open/Libre Office, Powerpoint or PDF. Too many times,
I've submitted a tarball of my slides and discovered they weren't even
listed on the conference site. (I ask you, in what way is a tarball
more difficult to deal with than an .odp file?) Slide-sharing websites
also have a limited set of formats they'll accept.
A year or so ago, I added screenshot capability to my webkit-based
do "screenshots", but I really needed PDF, not images.
Now, creating PDF from HTML shouldn't be that hard. Every browser has
a print function that can print to a PDF file. So why is it so hard
to create PDF from HTML in any kind of scriptable way?
After much searching and experimenting,
I finally found a Python code snippet that worked:
to PDF using PyGTK4 Webkit from Alex Dong. It uses Python-Qt,
not GTK, so I can't integrate it into my Preso app, but that's okay --
a separate tool is just as good.
(I struggled to write an equivalent in PyGTK, but gave up due to the
complete lack of documentation of Python-Webkit-GTK, and not much
more for gtk.PrintOperation(). QWebView's documentation may not be
as complete as I'd like, but at least there is some.)
Printing from QtWebView to QPrinter
Here are the important things I learned about QWebView from fiddling
around with Alex's code to adapt it to what I needed, which is
printing a list of pages to sequentially numbered files:
- To print, you need to wait until the page has finished loading,
so connect a function to SIGNAL("loadFinished(bool)"),
- That loadFinished function remains registered, so as you load new
pages, it will be called each time. So you can load() the next URL
as the last step in your loadFinished callback.
- If you get confused about callbacks and connect more than one of
them, bad things happen, and only the last page gets printed, or
QApplication.exit() doesn't exit at all.
Things I learned about QPrinter():
- All the examples I found online set the page size with lines like
QPrinter.setPageSize(QPrinter.A4) or setPaperSize(QPrinter.A4)
(setPageSize is apparently deprecated in favor of setPaperSize); but
- If you want to set a specific size, you can do that with a line like
QPrinter.setPaperSize(QSizeF(1024, 768), QPrinter.DevicePixel)
The second argument (DevicePixel) is a
from this list.
- That line gives you the right aspect ratio. But if you think
"DevicePixels" means the size will correspond to pixels in your browser
window (just because the documentation says so), you're sadly mistaken.
- If you want a PDF page that actually corresponds to the size of your
browser window, you can get it by calling QWebView.setZoomFactor(z)
You'll have to experiment to find the right value of z;
I found I needed about 1.24 if I wanted to capture my
full 1366x768 slides, or exactly 2.0 if I wanted to restrict the
saved PDF to only the 1024x758 part that shows up in the projector.
Anyway, it's a little hacky with that empirical zoom factor ... but it works!
The program is here:
convert HTML to PDF using Qt Webkit.
And it does produce reasonable PDF, with the text properly vectorized,
not just raster screenshots of each page.
Printing the slides in the right order
Terrific -- now I can feed a list of slides to qhtmlprint and get a
bunch of PDF files back. How do I print the right slides?
one per line.
grep .html navigate.js, I get a list like this:
To pass that to qhtmlprint, I only need to remove the commented-out lines
(the ones with //) and strip off the quotes and commas. I can do that
all in one command with a grep and sed pipeline:
qhtmlprint ` fgrep .html navigate.js | grep -v // | sed -e 's/",/"/' -e 's/"//g' `
And voiaà! I have a bunch of fileNNN.pdf files.
Creating a multi-page slide deck
Okay, great! Now how do I stick those files all together into one
slide deck I can submit to conference organizers?
That part's easy -- Ghostscript can do it.
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=slidedeck.pdf -dBATCH file*.pdf
And now slidedeck.pdf contains my whole presentation, ready to go.
[ 12:16 Jan 17, 2012
More programming |
permalink to this entry |
Thu, 12 Jan 2012
When I give talks that need slides, I've been using my
I uploaded it in 2007 -- then left it there, without many updates.
But meanwhile, I've been giving lots of presentations, tweaking the code,
tweaking the CSS to make it display better. And every now and then I get
reminded that a few other people besides me are using this stuff.
For instance, around a year ago, I gave a talk where nearly all the
slides were just images. Silly to have to make a separate HTML file
to go with each image. Why not just have one file, img.html, that
can show different images? So I wrote some code that lets you go to
a URL like img.html?pix/whizzyphoto.jpg, and it will display
it properly, and the Next and Previous slide links will still work.
Of course, I tweak this software mainly when I have a talk coming up.
I've been working lately on my SCALE talk, coming up on January 22:
with Linux and Devices (be ready for some fun Arduino demos!)
Sometimes when I overload on talk preparation, I procrastinate
by hacking the software instead of the content of the actual talk.
So I've added some nice changes just in the past few weeks.
For instance, the speaker notes that remind me of where I am in
the talk and what's coming next. I didn't have any way to add notes on
image slides. But I need them on those slides, too -- so I added that.
Then I decided it was silly not to have some sort of automatic
reminder of what the next slide was. Why should I have to
put it in the speaker notes by hand? So that went in too.
And now I've done the less fun part -- collecting it all together and
documenting the new additions. So if you're using my HTML/JS slide
kit -- or if you think you might be interested in something like that
as an alternative to Powerpoint or Libre Office Presenter -- check
out the presentation I have explaining the package, including the
You can find it here:
[ 21:08 Jan 12, 2012
More speaking |
permalink to this entry |
Sun, 08 Jan 2012
I've been having (mis)adventures learning about Python's various
options for parsing HTML.
Up until now, I've avoided doing any HTMl parsing
in my RSS reader FeedMe.
I use regular expressions to find the places where content starts and
ends, and to screen out content like advertising, and to rewrite links.
Using regexps on HTML is generally considered to be a no-no, but it
didn't seem worth parsing the whole document just for those modest goals.
But I've long wanted to add support for downloading images, so you
could view the downloaded pages with their embedded images if you so chose.
That means not only identifying img tags and extracting their
attributes, but also rewriting the img tag afterward to point to the
locally stored image. It was time to learn how to parse HTML.
Since I'm forever seeing people flamed on the #python IRC channel for
using regexps on HTML, I figured real HTML parsing must be straightforward.
A quick web search led me to
HTMLParser class. It comes with a nice example for how to use it:
define a class that inherits from HTMLParser, then define
some functions it can call for things like handle_starttag and
handle_endtag; then call self.feed(). Something like this:
from HTMLParser import HTMLParser
def fetch_url(self, url) :
request = urllib2.Request(url)
response = urllib2.urlopen(request)
link = response.geturl()
html = response.read()
self.feed(html) # feed() starts the HTMLParser parsing
def handle_starttag(self, tag, attrs):
if tag == 'img' :
# attrs is a list of tuples, (attribute, value)
srcindex = self.has_attr('src', attrs)
if srcindex < 0 :
return # img with no src tag? skip it
src = attrs[srcindex]
# Make relative URLs absolute
src = self.make_absolute(src)
attrs[srcindex] = (attrs[srcindex], src)
print '<' + tag
for attr in attrs :
print ' ' + attr
if len(attr) > 1 and type(attr) == 'str' :
# make sure attr doesn't have any embedded double-quotes
val = attr.replace('"', '\"')
print '="' + val + '"')
def handle_endtag(self, tag):
self.outfile.write('</' + tag.encode(self.encoding) + '>\n')
Easy, right? Of course there are a lot more details, but the
basics are simple.
I coded it up and it didn't take long to get it downloading images
and changing img tags to point to them. Woohoo!
The bad news about HTMLParser
Except ... after using it a few days, I was hitting some weird errors.
In particular, this one:
HTMLParser.HTMLParseError: bad end tag: ''
It comes from sites that have illegal content. For instance, stories
technically illegal html -- but lots of sites do it, so protesting
that it's technically illegal doesn't help if you're trying to read a
Some discussions said setting
self.CDATA_CONTENT_ELEMENTS = () would help, but it didn't.
HTMLParser's code is in Python, not C. So I took a look at where the
errors are generated, thinking maybe I could override them.
It was easy enough to redefine
parse_endtag() to make it not throw
an error (I had to duplicate some internal strings too). But then I
hit another error, so I redefined
And then I hit another error. I'm sure you see where this was going.
Pretty soon I had over 100 lines of duplicated code, and I was still
getting errors and needed to redefine even more functions.
This clearly wasn't the way to go.
I'd been trying to avoid adding dependencies to additional Python packages,
but if you want to parse real-world HTML, you have to.
There are two main options: Beautiful Soup and lxml.html.
Beautiful Soup is popular for large projects, but the consensus seems
to be that lxml.html is more error-tolerant and lighter weight.
Indeed, lxml.html is much more forgiving. You can't handle start and
end tags as they pass through, like you can with HTMLParser. Instead
you parse the HTML into an in-memory tree, like this:
tree = lxml.html.fromstring(html)
How do you iterate over the tree? lxml.html is a good parser, but it
has rather poor documentation, so it took some struggling to figure out
what was inside the tree and how to iterate over it.
You can visit every element in the tree with
for e in tree.iter() :
But that's not terribly useful if you need to know which
tags are inside which other tags. Instead, define a function that iterates
over the top level elements and calls itself recursively on each child.
The top of the tree itself is an element -- typically the
<html></html> -- and each element has .tag and .attrib.
If it contains text inside it (like a <p> tag), it also has
.text. So to make something that works similarly to HTMLParser:
def crawl_tree(tree) :
if tree.text :
for node in tree :
But wait -- we're not quite all there. You need to handle two
First, comment tags are special: their tag attribute,
instead of being a string, is <built-in function Comment>
so you have to handle that specially and not assume that tag
is text that you can print or test against.
Second, what about cases like
<p>Here is some <i>italicised</i> text.</p>
? in this case, you have the p tag, and its text is
"Here is some ".
Then the p has a child, the i tag, with text of "italicised".
But what about the rest of the string, " text."?
That's called a tail -- and it's the tail of the adjacent i tag it follows,
not the parent p tag that contains it. Confusing!
So our function becomes:
def crawl_tree(tree) :
if type(tree.tag) is str :
if tree.text :
for node in tree :
if tree.tail :
See how it works? If it's a comment (tree.tag isn't a string),
we'll skip everything -- except the tail. Even a comment
might have a tail:
<p>Here is some <!-- this is a comment --> text we want to show.</p>
so even if we're skipping comment we need its tail.
I'm sure I'll find other gotchas I've missed, so I'm not releasing
this version of feedme until it's had a lot more testing. But it
looks like lxml.html is a reliable way to parse real-world pages.
It even has a lot of convenience functions like link rewriting
that you can use without iterating the tree at all. Definitely worth
[ 15:04 Jan 08, 2012
More programming |
permalink to this entry |
Tue, 03 Jan 2012
Like most Linux users, I use virtual desktops. Normally my browser
window is on a desktop of its own.
Naturally, it often happens that I encounter a link I'd like to visit
while I'm on a desktop where the browser isn't visible. From some apps,
I can click on the link and have it show up. But sometimes, the link is
just text, and I have to select it, change to the browser desktop,
paste the link into firefox, then change desktops again to do something
else while the link loads.
So I set up a way to load whatever's in the X selection in firefox no
matter what desktop I'm on.
In most browsers, including firefox, you can tell your existing
browser window to open a new link from the command line:
firefox http://example.com/ opens that link in your
existing browser window if you already have one up, rather than
starting another browser. So the trick is to get the text you've selected.
At first, I used a program called xclip. You can run this command:
firefox `xclip -o` to open the selection. That worked
okay at first -- until I hit my first URL in weechat that was so long
that it was wrapped to the next line. It turns out xclip does odd things
with multi-line output; depending on whether it thinks the output is
a terminal or not, it may replace the newline with a space, or delete
whatever follows the newline. In any case, I couldn't find a way to
make it work reliably when pasted into firefox.
After futzing with xclip for a little too long, trying to reverse-engineer
its undocumented newline behavior, I decided it would be easier just to
write my own X clipboard app in Python. I already knew how to do that,
and it's super easy once you know the trick:
primary = gtk.clipboard_get(gtk.gdk.SELECTION_PRIMARY)
if primary.wait_is_text_available() :
That just prints it directly, including any newlines or spaces.
But as long as I was writing my own app, why not handle that too?
It's not entirely necessary on Firefox: on Linux, Firefox has some
special code to deal with pasting multi-line URLs, so you can copy
a URL that spans multiple lines, middleclick in the content area and
things will work. On other platforms, that's disabled, and some Linux
distros disable it as well; you can enable it by going to
about:config and searching for
then setting the preference
editor.singlelinepaste.pasteNewlines to 2.
However, it was easy enough to make my Python clipboard app do the
right thing so it would work in any browser. I used Python's re
(regular expressions) module:
primary = gtk.clipboard_get(gtk.gdk.SELECTION_PRIMARY)
if not primary.wait_is_text_available() :
s = primary.wait_for_text()
# eliminate newlines, and any spaces immediately following a newline:
print re.sub(r'[\r\n]+ *', '', s)
That seemed to work fine, even on long URLs pasted from weechat
with newlines and spaces, like that looked like
All that was left was binding it so I could access it from anywhere.
Of course, that varies depending on your desktop/window manager.
In Openbox, I added two items to my desktop menu in menu.xml:
<item label="open selection in Firefox">
<action name="Execute"><execute>sh -c 'firefox `xclip -o`'</execute></action>
<item label="open selection in new tab">
<action name="Execute"><execute>sh -c 'firefox -new-tab `xclip -o`'</execute></action>
I also added some code in rc.xml inside
<context name="Desktop">, so I can middle-click
or control-middle-click on the desktop to open a link in the browser:
<mousebind button="Middle" action="Press">
<execute>sh -c 'firefox `pyclip`'</execute>
<mousebind button="C-Middle" action="Press">
<execute>sh -c -new-tab 'firefox `pyclip`'</execute>
I set this up maybe two hours ago and I've probably used it ten or
fifteen times already. This is something I should have done long ago!
[ 22:37 Jan 03, 2012
More linux |
permalink to this entry |