Shallow Thoughts : : programming
Akkana's Musings on Open Source Computing and Technology, Science, and Nature.
Fri, 22 Sep 2023
I had a need for a window to which I could drag and drop URLs.
I don't use drag-and-drop much, since I prefer using the commandline
rather than a file manager and icon-studded desktop.
Usually when I need some little utility and can't immediately find
what I need, I whip up a little Python script.
This time, it wasn't so easy. Python has a GUI problem (as does open
source in general): there are quite a few options, like Tkinter, Qt, GTK,
wxWidgets and assorted others, and they all have different strengths and
(especially) weaknesses.
Drag-and-drop turns out to be something none of them do very well.
Tags: programming, python, qt
18:45 Sep 22, 2023
Thu, 07 Sep 2023
Somebody in a group I'm in has commented more than once that White
Rock is a hotbed of Republicanism whereas Los Alamos leans Democratic.
(For outsiders, our tiny county has two geographically distinct towns
in it, with separate zip codes, though officially they're both part of
Los Alamos township, which covers all of Los Alamos County.
White Rock is about half the size of Los Alamos.)
After I'd heard her say it a couple times, I got curious. Was it true?
I asked her for a reference, but she didn't have one. I decided to
find out.
Tags: mapping, GIS, programming, python
11:58 Sep 07, 2023
Thu, 23 Jun 2022
Five years ago, I wrote about
"Clicking through a translucent window: using X11 input shapes"
and how I used a translucent image window that allows click-through,
positioned on top of PyTopo, to trace an image of an old map and
create tracks or waypoints.
But the transimageviewer.py app that I wrote then was based on
GTK2, which is now obsolete and has been removed from most Linux
distro repositories. So when I found myself wanting GIS to help
investigate a growing trail controversy in Pueblo Canyon,
I discovered I didn't have a usable click-through image viewer.
Tags: programming, X11, python, mapping, GIS
19:08 Jun 23, 2022
Fri, 03 Dec 2021
I wrote at length about my explorations into selenium
to fetch stories from the New York Times (as a subscriber).
But I mentioned in Part III that there was a much easier way
to fetch those stories, as long as the stories didn't need JavaScript.
That way is to use normal file fetching (using urllib or requests),
but with a CookieJar object containing the cookies from a Firefox
session where I'd logged in.
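As a rough illustration of the idea (a sketch, not necessarily the code from the full article), here is one way to build a CookieJar from a copy of Firefox's cookies.sqlite. The function name load_firefox_cookies and the host filter are made up for this example. It assumes the moz_cookies table has host, path, isSecure, expiry, name and value columns, and that expiry holds seconds since the epoch; check the schema of your own Firefox version. It's safest to work on a copy of the file, since Firefox may keep the original locked while it's running.

```python
import sqlite3
from http.cookiejar import Cookie, CookieJar


def load_firefox_cookies(sqlite_path, host_suffix):
    """Build a CookieJar from a copy of Firefox's cookies.sqlite.

    Hypothetical helper: assumes moz_cookies has the columns
    host, path, isSecure, expiry, name and value.
    """
    jar = CookieJar()
    conn = sqlite3.connect(sqlite_path)
    try:
        # Only pull cookies whose host ends with the site we care about.
        rows = conn.execute(
            "SELECT host, path, isSecure, expiry, name, value "
            "FROM moz_cookies WHERE host LIKE ?",
            ("%" + host_suffix,),
        ).fetchall()
    finally:
        conn.close()

    for host, path, is_secure, expiry, name, value in rows:
        jar.set_cookie(Cookie(
            version=0, name=name, value=value,
            port=None, port_specified=False,
            domain=host, domain_specified=True,
            domain_initial_dot=host.startswith("."),
            path=path, path_specified=True,
            secure=bool(is_secure), expires=expiry, discard=False,
            comment=None, comment_url=None, rest={},
        ))
    return jar
```

The resulting jar can be passed to requests as requests.get(url, cookies=jar), or handed to urllib via urllib.request.HTTPCookieProcessor(jar).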
Tags: programming, python, cookies, firefox, scraping
12:22 Dec 03, 2021
Sat, 20 Nov 2021
At a recent LUG meeting, we were talking about various uses for web
scraping, and someone brought up a Wikipedia game: start on any page,
click on the first real link, then repeat on the page that comes up.
The claim is that this chain always gets to Wikipedia's page on
Philosophy.
We tried a few rounds, and sure enough, every page we tried did
eventually get to Philosophy, usually via languages, which goes to
communication, goes to discipline, action, intention, mental, thought,
idea, philosophy.
It's a perfect game for a discussion of scraping. It should be an easy
exercise to write a scraper to do this, right?
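A naive first attempt might look something like the sketch below. To stay self-contained it uses the standard library's html.parser rather than BeautifulSoup, and it takes a fetch function as a parameter so the page-downloading part can be swapped out; the names FirstLinkParser, first_link and follow_chain are invented for this example. It treats the "first real link" as the first /wiki/ link inside a paragraph that doesn't point to a namespaced page like File: or Help:. That is only an approximation: the usual rules of the game also skip links inside parentheses or italics, and this sketch makes no attempt at that.

```python
from html.parser import HTMLParser


class FirstLinkParser(HTMLParser):
    """Record the first /wiki/ article link that appears inside a <p>."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = 0
        self.first_link = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph += 1
        elif tag == "a" and self.in_paragraph and self.first_link is None:
            href = dict(attrs).get("href") or ""
            # Skip namespaced pages such as /wiki/File:... or /wiki/Help:...
            if href.startswith("/wiki/") and ":" not in href:
                self.first_link = href

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            self.in_paragraph -= 1


def first_link(html):
    """Return the first qualifying link in an HTML string, or None."""
    parser = FirstLinkParser()
    parser.feed(html)
    return parser.first_link


def follow_chain(start, fetch, target="/wiki/Philosophy", max_hops=100):
    """Follow first links from start until target, a dead end, or a loop.

    fetch is any callable that maps a /wiki/... path to that page's HTML.
    """
    chain = [start]
    while chain[-1] != target and len(chain) <= max_hops:
        nxt = first_link(fetch(chain[-1]))
        if nxt is None or nxt in chain:
            break
        chain.append(nxt)
    return chain
```

With a real fetch function that downloads https://en.wikipedia.org plus the path, follow_chain("/wiki/Linux", fetch) would return the list of pages visited. The parenthesis and italics rules are where "easy" starts to get complicated.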
Tags: programming, python, scraping, beautiful soup, wikipedia
19:31 Nov 20, 2021
Thu, 11 Nov 2021
This is part 3 of my selenium exploration trying to fetch stories
from the NY Times (as a subscriber).
At the end of Part II, selenium was running on a server with the
minimal number of X and GTK libraries installed.
But now that it can run unattended, there's another problem:
there are all kinds of ways this can fail,
and your script needs to handle those errors somehow.
Before diving in, I should mention that for my original goal,
fetching stories from the NY Times as a subscriber,
it turned out I didn't need selenium after all.
Since handling selenium errors turned out to be so brittle
(as I'll describe in this article), I'm now using requests
combined with a Python CookieJar. I'll write about that in a
future article. Meanwhile ...
Handling Errors and Timeouts
Timeouts are a particular problem with selenium,
because there doesn't seem to be any reliable way to change them
so the selenium script doesn't hang for ridiculously long periods.
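One general workaround for any script that might hang (offered here as a sketch under my own assumptions, not as the approach the full article takes) is to enforce a hard deadline from outside: run the fetcher as a child process and give up on it if it takes too long. The helper name run_with_deadline is made up for this example.

```python
import subprocess


def run_with_deadline(argv, seconds):
    """Run argv as a child process; return its stdout, or None on timeout."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and waits for the child before raising,
        # so the direct child is not left hanging.
        return None
    return result.stdout
```

A cron job could then call something like run_with_deadline(["python3", "fetch_stories.py"], 300), where fetch_stories.py is a hypothetical script name. One caveat: only the direct child is killed, so a browser process that the child started may need separate cleanup.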
Tags: programming, python, scraping, selenium
12:07 Nov 11, 2021
Sun, 07 Nov 2021
This is part 2 of my selenium exploration trying to fetch stories
from the NY Times (as a subscriber).
When we left off, I was learning the basics of selenium
in order to fetch stories (as a subscriber)
from the New York Times. Fetching stories was working properly,
and all that remained was to put it in an automated script, then
move it to a server where it could run automatically without my
desktop machine needing to be on.
Unfortunately, that turned out to be the hardest part of the problem.
Tags: programming, python, scraping, selenium
12:18 Nov 07, 2021
Tue, 02 Nov 2021
This is part 1 of my selenium exploration.
At the New Mexico GNU & Linux User Group,
currently meeting virtually on Jitsi, someone expressed interest in scraping
websites. Since I do quite a bit of scraping, I offered to give
a tutorial on scraping with the Python module BeautifulSoup.
"What about selenium?" he asked. Sorry, I said, I've never needed
selenium enough to figure it out.
But then a week later, I found I did have a need.
Tags: programming, python, scraping, selenium
19:58 Nov 02, 2021