Shallow Thoughts : : programming

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Sat, 04 May 2024

Creating an Image with Wrapped Text using the Python Imaging Library

I stumbled onto the page for this year's Asimov's Magazine Readers' Award Finalists. They offer all the stories right there -- but only as PDF. I prefer reading fiction on my ebook reader (a Kobo Clara with 6" screen), away from the computer. I spend too much time sitting at the computer as it is. But trying to read a PDF on a 6" screen is just painful.

The open-source ebook program Calibre has a command-line program called ebook-convert that can convert some PDFs to epub. It did an okay job in this case, except that the PDFs had the wrong author name (they all have the same author, so I'm guessing it's the name of the person who prepared the PDFs for Asimov's) and the wrong title information (or maybe just no title). ebook-convert then compounded that error by generating a cover image for each work with the wrong title and author.

I went through the files and fixed each one's title and author metadata using my epubtag.py Python script. But what about the cover images? I wasn't eager to spend the time GIMPing up a cover image by hand for each of the stories.

Read more ...

[ 13:52 May 04, 2024    More programming | permalink to this entry | ]

Sun, 10 Mar 2024

How Common is Easter in March?

"Easter is March 31 this year," my husband said. "I think that's rare, having Easter in March."

"I guess so," I said.

"There's a Cray somewhere that they use to calculate the date each year," he joked.

And of course that made me want to find out if Easters in March really are rare.
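No Cray is needed for a first look. Here is a minimal sketch, assuming Western (Gregorian) Easter and the widely published "anonymous Gregorian" computus (Meeus/Jones/Butcher). The function names and the year range are illustrative choices, not necessarily what the full article uses.

```python
from datetime import date


def easter_date(year):
    """Return the date of Western (Gregorian) Easter for the given year.

    Uses the anonymous Gregorian algorithm (Meeus/Jones/Butcher).
    """
    a = year % 19                       # position in the 19-year Metonic cycle
    b, c = divmod(year, 100)            # century and year within century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30  # related to the age of the moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day0 = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day0 + 1)


def march_easter_fraction(first_year, last_year):
    """Fraction of years in [first_year, last_year] with Easter in March."""
    years = range(first_year, last_year + 1)
    in_march = sum(1 for y in years if easter_date(y).month == 3)
    return in_march / len(years)


if __name__ == "__main__":
    print(easter_date(2024))    # 2024-03-31
    # The 1900-2099 range is an arbitrary choice for illustration.
    print(f"{march_easter_fraction(1900, 2099):.1%} of Easters fall in March")
```

Running it over a longer or shorter span of years is just a matter of changing the two arguments to `march_easter_fraction`.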

Read more ...

[ 12:49 Mar 10, 2024    More programming | permalink to this entry | ]

Fri, 22 Sep 2023

Drag-and-Drop in Python Qt6

I had a need for a window to which I could drag and drop URLs.

I don't use drag-and-drop much, since I prefer using the command line rather than a file manager and icon-studded desktop. Usually when I need some little utility and can't immediately find what I need, I whip up a little Python script.

This time, it wasn't so easy. Python has a GUI problem (as does open source in general): there are quite a few options, like Tkinter, Qt, GTK, wxWidgets and assorted others, and they all have different strengths and (especially) weaknesses.

Drag-and-drop turns out to be something none of them do very well.

Read more ...

[ 18:45 Sep 22, 2023    More programming | permalink to this entry | ]

Thu, 07 Sep 2023

Los Alamos Voting Data on a Folium Choropleth Map

Somebody in a group I'm in has commented more than once that White Rock is a hotbed of Republicanism whereas Los Alamos leans Democratic. (For outsiders, our tiny county has two geographically distinct towns in it, with separate zip codes, though officially they're both part of Los Alamos township, which covers all of Los Alamos County. White Rock is about half the size of Los Alamos.)

After I'd heard her say it a couple times, I got curious. Was it true? I asked her for a reference, but she didn't have one. I decided to find out.

Read more ...

[ 11:58 Sep 07, 2023    More programming | permalink to this entry | ]

Thu, 23 Jun 2022

Clicking through a Translucent Image Window

[transparent image viewer overlaid on top of topo map]

Five years ago, I wrote about Clicking through a translucent window: using X11 input shapes and how I used a translucent image window that allows click-through, positioned on top of PyTopo, to trace an image of an old map and create tracks or waypoints.

But the transimageviewer.py app that I wrote then was based on GTK2, which is now obsolete and has been removed from most Linux distro repositories. So when I found myself wanting GIS to help investigate a growing trail controversy in Pueblo Canyon, I discovered I didn't have a usable click-through image viewer.

Read more ...

[ 19:08 Jun 23, 2022    More programming | permalink to this entry | ]

Fri, 03 Dec 2021

Importing Cookies from a Firefox Profile in Python

I wrote at length about my explorations into selenium to fetch stories from the New York Times (as a subscriber). But I mentioned in Part III that there was a much easier way to fetch those stories, as long as the stories didn't need JavaScript.

That way is to use normal file fetching (using urllib or requests), but with a CookieJar object containing the cookies from a Firefox session where I'd logged in.
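As a rough illustration of the idea, here is a minimal sketch using only the standard library. It assumes the profile's cookies live in a `cookies.sqlite` file with a `moz_cookies` table containing `host`, `path`, `isSecure`, `expiry`, `name` and `value` columns; that layout is an assumption and may differ between Firefox versions. The function names and the `host_filter` parameter are hypothetical, not taken from the full article.

```python
import sqlite3
import urllib.request
from http.cookiejar import Cookie, CookieJar


def cookiejar_from_firefox(cookies_db, host_filter=None):
    """Build a CookieJar from a Firefox cookies.sqlite file.

    ASSUMPTION: the moz_cookies table has host, path, isSecure, expiry,
    name and value columns. Check your own profile's schema first.
    """
    jar = CookieJar()
    conn = sqlite3.connect(cookies_db)
    try:
        rows = conn.execute(
            "SELECT host, path, isSecure, expiry, name, value "
            "FROM moz_cookies")
        for host, path, secure, expiry, name, value in rows:
            # Optionally keep only cookies for one site.
            if host_filter and host_filter not in host:
                continue
            jar.set_cookie(Cookie(
                version=0, name=name, value=value,
                port=None, port_specified=False,
                domain=host, domain_specified=True,
                domain_initial_dot=host.startswith("."),
                path=path, path_specified=True,
                secure=bool(secure), expires=expiry, discard=False,
                comment=None, comment_url=None, rest={}))
    finally:
        conn.close()
    return jar


def opener_with_cookies(jar):
    """Return a urllib opener that sends the jar's cookies with requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
```

A script would then call something like `opener_with_cookies(jar).open(url)`. Firefox may keep the database locked while it is running, so working from a copy of `cookies.sqlite` is probably the safer route.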

Read more ...

[ 12:22 Dec 03, 2021    More programming | permalink to this entry | ]

Sat, 20 Nov 2021

Wikipedia: All Roads Lead to ... Philosophy?

At a recent LUG meeting, we were talking about various uses for web scraping, and someone brought up a Wikipedia game: start on any page, click on the first real link, then repeat on the page that comes up. The claim is that this chain always gets to Wikipedia's page on Philosophy.

We tried a few rounds, and sure enough, every page we tried did eventually get to Philosophy, usually via languages, which goes to communication, then discipline, action, intention, mental, thought, idea, and finally philosophy.

It's a perfect game for a discussion of scraping. It should be an easy exercise to write a scraper to do this, right?
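The loop of the game itself can be sketched without any scraping at all. In this minimal sketch, finding the "first real link" on a page is left as a pluggable function, since that is presumably where the real work lies. The names `follow_chain`, `first_link` and `TOY_LINKS` are hypothetical, and the toy dictionary simply encodes the chain described above rather than fetching anything from Wikipedia.

```python
def follow_chain(start, first_link, target="philosophy", max_steps=100):
    """Follow first links from start until target, a loop, or a dead end.

    first_link is any callable mapping a page name to the next page name
    (or None if there is no usable link). Returns (path, outcome), where
    outcome is "reached", "loop", "dead end" or "gave up".
    """
    path = [start]
    seen = {start}
    page = start
    for _ in range(max_steps):
        if page == target:
            return path, "reached"
        nxt = first_link(page)
        if nxt is None:
            return path, "dead end"
        if nxt in seen:
            # We've been here before: the chain will cycle forever.
            return path + [nxt], "loop"
        path.append(nxt)
        seen.add(nxt)
        page = nxt
    return path, ("reached" if page == target else "gave up")


# Toy stand-in for a real scraper, using the chain from the text above.
TOY_LINKS = {
    "languages": "communication",
    "communication": "discipline",
    "discipline": "action",
    "action": "intention",
    "intention": "mental",
    "mental": "thought",
    "thought": "idea",
    "idea": "philosophy",
}

if __name__ == "__main__":
    path, outcome = follow_chain("languages", TOY_LINKS.get)
    print(outcome, " -> ".join(path))
```

Tracking the pages already seen matters because a real chain can cycle between pages without ever reaching the target.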

Read more ...

[ 19:31 Nov 20, 2021    More programming | permalink to this entry | ]

Thu, 11 Nov 2021

Selenium: Handling Timeouts and Errors

This is Part III of my selenium exploration trying to fetch stories from the NY Times (as a subscriber).

At the end of Part II, selenium was running on a server with the minimal number of X and GTK libraries installed.

But now that it can run unattended, there's another problem: there are all kinds of ways this can fail, and your script needs to handle those errors somehow.

Before diving in, I should mention that for my original goal, fetching stories from the NY Times as a subscriber, it turned out I didn't need selenium after all. Since handling selenium errors turned out to be so brittle (as I'll describe in this article), I'm now using requests combined with a Python CookieJar. I'll write about that in a future article. Meanwhile ...

Handling Errors and Timeouts

Timeouts are a particular problem with selenium, because there doesn't seem to be any reliable way to change them so the selenium script doesn't hang for ridiculously long periods.

Read more ...

[ 12:07 Nov 11, 2021    More programming | permalink to this entry | ]