Shallow Thoughts : : Jan

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Fri, 27 Jan 2012

Arduino Air Swimmers Shark

[Air swimmers shark]

When SCALE approved my talk proposal, Fun with Linux and Devices, I had a challenge: I needed some good, visual Arduino demos that would work in front of an audience.

In particular, I wanted something that moved. A little toy truck? A moving penguin? A rotating sunflower? I fiddled with this and that, not fully satisfied with anything. And then suddenly I realized what I needed. Something cool. Something BIG. Something I'd been wanting an excuse to buy anyway.

An Air Swimmers Shark.

I'd seen these things on video, but never in person. They're available all over, even on Amazon, so I put in an order there and got a shark in a few days.

These things are ridiculous and cool. It's huge, about 5 feet long, and filled with helium. It takes maybe half an hour to assemble. It has a small motor to beat the tail, an infrared transmitter, and a weighted receiver that moves back and forth on a track to tilt the fish up or down as it swims.

Once it's assembled, you can get it filled with helium at a party store (which costs $3 to $6 depending on where you go). Once the shark is filled, you add clay as ballast until the shark is neutrally buoyant, neither rising nor sinking. It's quite sensitive: you'll find yourself needing to add or remove pea-sized chunks of clay as the temperature in the room changes, but being a little over- or under-ballasted doesn't hurt it much. With its tail beating, the shark really does look like it's swimming through the air.

My shark is named Bruce, after the mechanical shark used for the movie "Jaws". My Bruce, I'm happy to say, has been much more tractable than his famously intemperate namesake.

Okay, now how do we turn this ridiculous-but-cool thing into an Arduino project?

Hacking the transmitter

[Air Swimmers IR transmitter, front]

There were two possible approaches. First, mount an Arduino directly on the shark, and let it be totally self-directed. Second, patch the Arduino into the shark's transmitter and control it from Linux. I chose the second option, for several reasons. First, I was fairly sure it would be easier, and less invasive (the shark would still be usable with manual control). I also liked the idea of keeping the transmitter as a manual override, in case my control program didn't work right. Finally, I liked the idea of keeping a Linux machine in the loop -- the shark would actually be controlled by Linux, not just by the Arduino.

So the first thing I did was take the transmitter apart (4 Philips screws). Inside are 4 pushbuttons, for right, left, up, and down, and the circuit board is nice and simple. Whew -- this might be doable!
[Air Swimmers IR transmitter, back]

Four more screws and I had access to the back of the board, which was equally simple. Now I could get my voltmeter on the contacts while I pushed buttons.

It turned out the contacts (indicated with arrows on the photo) on the downstream side of each switch were normally high (4.5 volts -- the transmitter uses 3 AAA batteries). When I pushed the button, the contact went to ground. Okay, so what I needed was some way for the Arduino to ground those contacts at will.

First I needed to solder some wires to the contacts. (How do you tell which side of the pushbutton is the one you need to solder? Well, one side changes voltage when you press the button, and the other side stays constant. The one that changes is the one you need to connect to the Arduino, so the Arduino can change it too.)

I figured I needed 6 wires: ground, power, and one for each switch. (It turned out I didn't need the power wire, but I figured it didn't hurt to include it just in case.) I wanted to have a nice small connector on the side of the transmitter, but I couldn't find any 6-pin connectors that didn't look big and bulky, so I gave up and decided I'd just let my ribbon cable dangle from the transmitter. If I got a nice multi-colored one, maybe it would look festive.

I couldn't find any 6-conductor ribbon cable, so I got a wider one and separated 6 wires from the rest. Then I soldered the six wires to the appropriate places (marked by arrows in the photo). On the other end, I tinned the six wires with solder so I could plug the stranded wires into my breadboard.

Simulating button presses

I've done enough reading to know of three ways to simulate a button press. You can put a relay between the two contacts of the switch; you can do the same thing, but with an optocoupler (opto-isolator) instead of a relay; or you can do some magic with a transistor. I was fuzzy on the transistor magic bit, so a relay sounded easiest.

I played around with a relay and a spare switch and convinced myself I knew how to wire them up. Then it was off to my local parts store to buy four matched relays small enough to fit on my little mini breadboard.

There followed a several-day fiasco wherein I bought lots of relays that turned out not to be suitable, and got increasingly frustrated at how large and clunky all the available relays were. There are smaller ones, but I couldn't get them to work. And I learned that relays mostly come without documentation on which pin does which, so there's a lot of experimenting with each new type.

Frustrated, I tried some optocouplers I'd bought on a whim last year. No dice ... couldn't get them to work either. [testing my transistor circuit]

Desperate, I turned to IRC, #arduino on Freenode. The folks there are mostly electronics wizards, and I'm sure my questions must have seemed very dumb, but they were patient with me, and pointed me toward a very simple circuit, LED4dummies, that was just what I needed. (They also suggested Wikipedia's Open collector article, but I found that less clear.)

It took me some experimenting with a transistor, an LED and a couple of resistors (I blew out a couple of transistors before I realized I had the R2 resistor in the wrong place) but eventually I got it working, and felt confident enough to try it with the real shark transmitter. The key was to simplify the circuit so it had no extra parts, then once it was working, add more parts to build it up to what I needed.
[Circuit for the Arduino-controlled Air Swimmers Shark]

At left, the circuit I ended up with. For each button, I have one transistor and one resistor (I don't need the second resistor from the LED4dummies circuit, since that was just to keep the LED from burning out). [testing my transistor circuit]

At right is the circuit assembled on a mini-breadboard on top of the proto-shield. Note that the ends of the ribbon cable are plugged in to a spare header I had lying around; a header makes a passable connector, so I can plug it in fairly easily right before a talk. The green and blue wires in the back are going to Arduino digital output pins 3 through 6 (leaving 0 and 1 for serial I/O). The red wires go from the transistors back to the ribbon cable wires that go to the shark's transmitter buttons.

The software side

Now I could make the shark think I'd pressed a button on its transmitter. How do I control that from Linux?

On the Arduino side, I wrote a simple program that reads and parses commands coming over the USB cable. So from the computer, I might send a line like L 300, and the Arduino would "press" the Left button for 300 milliseconds. I had already written something like this for a couple of other Arduino programs. That program is shark.pde.

On the Linux side, first I needed something that established a serial connection and sent commands to the Arduino. I wrote a Python class for that, shark.py. That let me send commands from the Python console to test the shark.

Then I needed something more visual, something I could show during a talk. In particular, the shark doesn't swim unless someone's pressing left, right, left, right buttons over and over. Of course I wanted the computer to handle that part.

So I wrote a little Python-GTK application that keeps the shark swimming, and lets me drag a mouse around to adjust its left/right up/down direction: sharkwindow.

Purely by coincidence, the week before SCALE, Scott Adams introduced a roboshark character: Dilbert, Jan 11 2012. Nice timing for the debut of my own roboshark!

Sadly, I don't have any photos or video of the shark in action. But if you're a LWN subscriber, there's an article on my talk with a couple of great pictures: Robots rampage (in a friendly way) at SCALE 10X. And you can see my slides and notes at Arduino notes.

Tags: , , , ,
[ 14:02 Jan 27, 2012    More hardware | permalink to this entry | ]

Tue, 17 Jan 2012

Converting HTML pages into PDF

I've long wanted a way of converting my HTML presentation slides to PDF. Mostly because conference organizers tend to freak out at slides in any format other than Open/Libre Office, Powerpoint or PDF. Too many times, I've submitted a tarball of my slides and discovered they weren't even listed on the conference site. (I ask you, in what way is a tarball more difficult to deal with than an .odp file?) Slide-sharing websites also have a limited set of formats they'll accept.

A year or so ago, I added screenshot capability to my webkit-based presentation program, Preso, do "screenshots", but I really needed PDF, not images.

Now, creating PDF from HTML shouldn't be that hard. Every browser has a print function that can print to a PDF file. So why is it so hard to create PDF from HTML in any kind of scriptable way?

After much searching and experimenting, I finally found a Python code snippet that worked: XHTML to PDF using PyGTK4 Webkit from Alex Dong. It uses Python-Qt, not GTK, so I can't integrate it into my Preso app, but that's okay -- a separate tool is just as good.

(I struggled to write an equivalent in PyGTK, but gave up due to the complete lack of documentation of Python-Webkit-GTK, and not much more for gtk.PrintOperation(). QWebView's documentation may not be as complete as I'd like, but at least there is some.)

Printing from QtWebView to QPrinter

Here are the important things I learned about QWebView from fiddling around with Alex's code to adapt it to what I needed, which is printing a list of pages to sequentially numbered files:

Things I learned about QPrinter():

Anyway, it's a little hacky with that empirical zoom factor ... but it works! The program is here: qhtmlprint: convert HTML to PDF using Qt Webkit.

And it does produce reasonable PDF, with the text properly vectorized, not just raster screenshots of each page.

Printing the slides in the right order

Terrific -- now I can feed a list of slides to qhtmlprint and get a bunch of PDF files back. How do I print the right slides?

My slides are listed in order in an array inside a Javascript file, one per line. If I grep .html navigate.js, I get a list like this:

    "arduino.html",
    "img.html?pix/arduinos/arduino-clones.jpg",
    "getting_started.html",
    "img.html?pix/projects/led.jpg",
    //"blink.html",
    "arduino-ide.html",

To pass that to qhtmlprint, I only need to remove the commented-out lines (the ones with //) and strip off the quotes and commas. I can do that all in one command with a grep and sed pipeline:

qhtmlprint ` fgrep .html navigate.js  | grep -v // | sed -e 's/",/"/' -e 's/"//g' `

And voiaà! I have a bunch of fileNNN.pdf files.

Creating a multi-page slide deck

Okay, great! Now how do I stick those files all together into one slide deck I can submit to conference organizers?

That part's easy -- Ghostscript can do it.

gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=slidedeck.pdf -dBATCH file*.pdf

And now slidedeck.pdf contains my whole presentation, ready to go.

Tags: , ,
[ 12:16 Jan 17, 2012    More programming | permalink to this entry | ]

Thu, 12 Jan 2012

HTML and Javascript Presentations

When I give talks that need slides, I've been using my Slide Presentations in HTML and JavaScript for many years. I uploaded it in 2007 -- then left it there, without many updates.

But meanwhile, I've been giving lots of presentations, tweaking the code, tweaking the CSS to make it display better. And every now and then I get reminded that a few other people besides me are using this stuff.

For instance, around a year ago, I gave a talk where nearly all the slides were just images. Silly to have to make a separate HTML file to go with each image. Why not just have one file, img.html, that can show different images? So I wrote some code that lets you go to a URL like img.html?pix/whizzyphoto.jpg, and it will display it properly, and the Next and Previous slide links will still work.

Of course, I tweak this software mainly when I have a talk coming up. I've been working lately on my SCALE talk, coming up on January 22: Fun with Linux and Devices (be ready for some fun Arduino demos!) Sometimes when I overload on talk preparation, I procrastinate by hacking the software instead of the content of the actual talk. So I've added some nice changes just in the past few weeks.

For instance, the speaker notes that remind me of where I am in the talk and what's coming next. I didn't have any way to add notes on image slides. But I need them on those slides, too -- so I added that.

Then I decided it was silly not to have some sort of automatic reminder of what the next slide was. Why should I have to put it in the speaker notes by hand? So that went in too.

And now I've done the less fun part -- collecting it all together and documenting the new additions. So if you're using my HTML/JS slide kit -- or if you think you might be interested in something like that as an alternative to Powerpoint or Libre Office Presenter -- check out the presentation I have explaining the package, including the new features.

You can find it here: Slide Presentations in HTML and JavaScript

Tags: , , , , ,
[ 21:08 Jan 12, 2012    More speaking | permalink to this entry | ]

Sun, 08 Jan 2012

Parsing HTML in Python

I've been having (mis)adventures learning about Python's various options for parsing HTML.

Up until now, I've avoided doing any HTMl parsing in my RSS reader FeedMe. I use regular expressions to find the places where content starts and ends, and to screen out content like advertising, and to rewrite links. Using regexps on HTML is generally considered to be a no-no, but it didn't seem worth parsing the whole document just for those modest goals.

But I've long wanted to add support for downloading images, so you could view the downloaded pages with their embedded images if you so chose. That means not only identifying img tags and extracting their src attributes, but also rewriting the img tag afterward to point to the locally stored image. It was time to learn how to parse HTML.

Since I'm forever seeing people flamed on the #python IRC channel for using regexps on HTML, I figured real HTML parsing must be straightforward. A quick web search led me to Python's built-in HTMLParser class. It comes with a nice example for how to use it: define a class that inherits from HTMLParser, then define some functions it can call for things like handle_starttag and handle_endtag; then call self.feed(). Something like this:

from HTMLParser import HTMLParser

class MyFancyHTMLParser(HTMLParser):
  def fetch_url(self, url) :
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    link = response.geturl()
    html = response.read()
    response.close()
    self.feed(html)   # feed() starts the HTMLParser parsing

  def handle_starttag(self, tag, attrs):
    if tag == 'img' :
      # attrs is a list of tuples, (attribute, value)
      srcindex = self.has_attr('src', attrs)
      if srcindex < 0 :
        return   # img with no src tag? skip it
      src = attrs[srcindex][1]
      # Make relative URLs absolute
      src = self.make_absolute(src)
      attrs[srcindex] = (attrs[srcindex][0], src)

    print '<' + tag
    for attr in attrs :
      print ' ' + attr[0]
      if len(attr) > 1 and type(attr[1]) == 'str' :
        # make sure attr[1] doesn't have any embedded double-quotes
        val = attr[1].replace('"', '\"')
        print '="' + val + '"')
    print '>'

  def handle_endtag(self, tag):
    self.outfile.write('</' + tag.encode(self.encoding) + '>\n')

Easy, right? Of course there are a lot more details, but the basics are simple.

I coded it up and it didn't take long to get it downloading images and changing img tags to point to them. Woohoo! Whee!

The bad news about HTMLParser

Except ... after using it a few days, I was hitting some weird errors. In particular, this one:
HTMLParser.HTMLParseError: bad end tag: ''

It comes from sites that have illegal content. For instance, stories on Slate.com include Javascript lines like this one inside <script></script> tags:
document.write("<script type='text/javascript' src='whatever'></scr" + "ipt>");

This is technically illegal html -- but lots of sites do it, so protesting that it's technically illegal doesn't help if you're trying to read a real-world site.

Some discussions said setting self.CDATA_CONTENT_ELEMENTS = () would help, but it didn't.

HTMLParser's code is in Python, not C. So I took a look at where the errors are generated, thinking maybe I could override them. It was easy enough to redefine parse_endtag() to make it not throw an error (I had to duplicate some internal strings too). But then I hit another error, so I redefined unknown_decl() and _scan_name(). And then I hit another error. I'm sure you see where this was going. Pretty soon I had over 100 lines of duplicated code, and I was still getting errors and needed to redefine even more functions. This clearly wasn't the way to go.

Using lxml.html

I'd been trying to avoid adding dependencies to additional Python packages, but if you want to parse real-world HTML, you have to. There are two main options: Beautiful Soup and lxml.html. Beautiful Soup is popular for large projects, but the consensus seems to be that lxml.html is more error-tolerant and lighter weight.

Indeed, lxml.html is much more forgiving. You can't handle start and end tags as they pass through, like you can with HTMLParser. Instead you parse the HTML into an in-memory tree, like this:

  tree = lxml.html.fromstring(html)

How do you iterate over the tree? lxml.html is a good parser, but it has rather poor documentation, so it took some struggling to figure out what was inside the tree and how to iterate over it.

You can visit every element in the tree with

for e in tree.iter() :
  print e.tag

But that's not terribly useful if you need to know which tags are inside which other tags. Instead, define a function that iterates over the top level elements and calls itself recursively on each child.

The top of the tree itself is an element -- typically the <html></html> -- and each element has .tag and .attrib. If it contains text inside it (like a <p> tag), it also has .text. So to make something that works similarly to HTMLParser:

def crawl_tree(tree) :
  handle_starttag(tree.tag, tree.attrib)
  if tree.text :
    handle_data(tree.text)
  for node in tree :
    crawl_tree(node)
  handle_endtag(tree.tag)

But wait -- we're not quite all there. You need to handle two undocumented cases.

First, comment tags are special: their tag attribute, instead of being a string, is <built-in function Comment> so you have to handle that specially and not assume that tag is text that you can print or test against.

Second, what about cases like <p>Here is some <i>italicised</i> text.</p> ? in this case, you have the p tag, and its text is "Here is some ". Then the p has a child, the i tag, with text of "italicised". But what about the rest of the string, " text."?

That's called a tail -- and it's the tail of the adjacent i tag it follows, not the parent p tag that contains it. Confusing!

So our function becomes:

def crawl_tree(tree) :
  if type(tree.tag) is str :
    handle_starttag(tree.tag, tree.attrib)
    if tree.text :
      handle_data(tree.text)
    for node in tree :
      crawl_tree(node)
    handle_endtag(tree.tag)
  if tree.tail :
    handle_data(tree.tail)

See how it works? If it's a comment (tree.tag isn't a string), we'll skip everything -- except the tail. Even a comment might have a tail:
<p>Here is some <!-- this is a comment --> text we want to show.</p>
so even if we're skipping comment we need its tail.

I'm sure I'll find other gotchas I've missed, so I'm not releasing this version of feedme until it's had a lot more testing. But it looks like lxml.html is a reliable way to parse real-world pages. It even has a lot of convenience functions like link rewriting that you can use without iterating the tree at all. Definitely worth a look!

Tags: , ,
[ 15:04 Jan 08, 2012    More programming | permalink to this entry | ]

Tue, 03 Jan 2012

Open the X selection in a browser window, from any desktop

Like most Linux users, I use virtual desktops. Normally my browser window is on a desktop of its own.

Naturally, it often happens that I encounter a link I'd like to visit while I'm on a desktop where the browser isn't visible. From some apps, I can click on the link and have it show up. But sometimes, the link is just text, and I have to select it, change to the browser desktop, paste the link into firefox, then change desktops again to do something else while the link loads.

So I set up a way to load whatever's in the X selection in firefox no matter what desktop I'm on.

In most browsers, including firefox, you can tell your existing browser window to open a new link from the command line: firefox http://example.com/ opens that link in your existing browser window if you already have one up, rather than starting another browser. So the trick is to get the text you've selected.

At first, I used a program called xclip. You can run this command: firefox `xclip -o` to open the selection. That worked okay at first -- until I hit my first URL in weechat that was so long that it was wrapped to the next line. It turns out xclip does odd things with multi-line output; depending on whether it thinks the output is a terminal or not, it may replace the newline with a space, or delete whatever follows the newline. In any case, I couldn't find a way to make it work reliably when pasted into firefox.

After futzing with xclip for a little too long, trying to reverse-engineer its undocumented newline behavior, I decided it would be easier just to write my own X clipboard app in Python. I already knew how to do that, and it's super easy once you know the trick:

mport gtk
primary = gtk.clipboard_get(gtk.gdk.SELECTION_PRIMARY)
if primary.wait_is_text_available() :
    print primary.wait_for_text()

That just prints it directly, including any newlines or spaces. But as long as I was writing my own app, why not handle that too?

It's not entirely necessary on Firefox: on Linux, Firefox has some special code to deal with pasting multi-line URLs, so you can copy a URL that spans multiple lines, middleclick in the content area and things will work. On other platforms, that's disabled, and some Linux distros disable it as well; you can enable it by going to about:config and searching for single, then setting the preference editor.singlelinepaste.pasteNewlines to 2.

However, it was easy enough to make my Python clipboard app do the right thing so it would work in any browser. I used Python's re (regular expressions) module:

#!/usr/bin/env python

import gtk
import re

primary = gtk.clipboard_get(gtk.gdk.SELECTION_PRIMARY)

if not primary.wait_is_text_available() :
    sys.exit(0)
s = primary.wait_for_text()

# eliminate newlines, and any spaces immediately following a newline:
print re.sub(r'[\r\n]+ *', '', s)

That seemed to work fine, even on long URLs pasted from weechat with newlines and spaces, like that looked like

http://example.com/long-
    url.html

All that was left was binding it so I could access it from anywhere. Of course, that varies depending on your desktop/window manager. In Openbox, I added two items to my desktop menu in menu.xml:

  <item label="open selection in Firefox">
    <action name="Execute"><execute>sh -c 'firefox `xclip -o`'</execute></action>
  </item>
  <item label="open selection in new tab">
    <action name="Execute"><execute>sh -c 'firefox -new-tab `xclip -o`'</execute></action>
  </item>

I also added some code in rc.xml inside <context name="Desktop">, so I can middle-click or control-middle-click on the desktop to open a link in the browser:

      <mousebind button="Middle" action="Press">
        <action name="Execute">
          <execute>sh -c 'firefox `pyclip`'</execute>
        </action>
      </mousebind>
      <mousebind button="C-Middle" action="Press">
        <action name="Execute">
          <execute>sh -c -new-tab 'firefox `pyclip`'</execute>
        </action>
      </mousebind>

I set this up maybe two hours ago and I've probably used it ten or fifteen times already. This is something I should have done long ago!

Tags: , , ,
[ 22:37 Jan 03, 2012    More linux | permalink to this entry | ]