Shallow Thoughts : : web

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Sat, 29 Jan 2022

View or Reset All Your Firefox Zoom Settings

Firefox's zoom settings are useful. You can zoom in on a page with Ctrl-+ (actually Ctrl-+ on a US-English keyboard), or out with Ctrl--.

Useful, that is, until you start noticing that lots of pages you visit have weirdly large or small font sizes, and it turns out that Firefox is remembering a Zoom setting you used on that site half a year ago on a different monitor.

Whenever you zoom, Firefox remembers that site, and uses that zoom setting any time you go to that site forevermore (unless you zoom back out).

Now that I'm using the same laptop in different modes — sometimes plugged into a monitor, sometimes using its own screen — that has become a problem.

Read more ...

Tags: , , ,
[ 18:04 Jan 29, 2022    More tech/web | permalink to this entry | ]

Sun, 12 Dec 2021

Battling Signup Spam on the Bill Tracker

I've spent a lot of the past week battling Russian spammers on the New Mexico Bill Tracker.

The New Mexico legislature just began a special session to define the new voting districts, which happens every 10 years after the census. When new legislative sessions start, the BillTracker usually needs some hand-holding to make sure it's tracking the new session. (I've been working on code to make it notice new sessions automatically, but it's not fully working yet). So when the session started, I checked the log files...

and found them full of Russian spam.

Specifically, what was happening was that a bot was going to my new user registration page and creating new accounts where the username was a paragraph of Cyrillic spam.

Read more ...

Tags: , , , , , ,
[ 18:50 Dec 12, 2021    More tech/web | permalink to this entry | ]

Mon, 15 Nov 2021

Removing Bad Autocompletes from Firefox's Location Bar

A priest, a minister, and a rabbit walk into a bar.
The bartender asks the rabbit what he'll have to drink.
"How should I know?" says the rabbit. "I'm only here because of autocomplete."

Firefox folks like to call the location bar/URL bar the "awesomebar" because of the suggestions it makes. Sometimes, those suggestions are pretty great; there are a lot of sites I don't bother to bookmark because I know they will show up as the first suggestion.

Other times, the "awesomebar" not so awesome. It gets stuck on some site I never use, and there's seemingly no way to make Firefox forget that site.

Read more ...

Tags: , ,
[ 16:54 Nov 15, 2021    More tech/web | permalink to this entry | ]

Fri, 06 Aug 2021

Firefox Broken on Local Domains

I maintain quite a few domains, both domains I own and domains belonging to various nonprofits I belong to. For testing these websites, I make virtual domains in apache, choosing an alias for each site. For instance, for the LWVNM website, the apache site file has

<VirtualHost *:80>
    ServerName lwvlocal
and my host table, /etc/hosts, has
127.0.0.1    localhost lwvlocal
(The localhost line in my host table has entries for all the various virtual hosts I use, not just this one).

That all used to work fine. If I wanted to test a new page on the LWVNM website, I'd go to Firefox's urlbar and type something like lwvlocal/newpage.html and it would show me the new page, which I could work on until it was time to push it to the web server.

A month or so ago, a new update to Firefox broke that.

Read more ...

Tags: ,
[ 13:34 Aug 06, 2021    More tech/web | permalink to this entry | ]

Mon, 19 Jul 2021

Hack to fix Firefox 90.0's new invisible menu

Yesterday I started up my browser and discovered that I had no menu.

[Invisible Firefox menubar]

I understand that Mozilla wants us not to use the menu ... because why would anyone want to use any of Firefox's zillions of useful features that aren't available through the hamburger menu? ... but they've always made it possible to show a menubar if you really want one. Right-click in the area where the tabs are, and there's an option for Menu Bar that you can turn on.

And that option was still there, and the space above the tabs where it should show up was still taking up space ... there just weren't any menu buttons to click on.

Except they were. I tried clicking near the left edge and a familiar File menu popped up. Aha! The menubar is still there; it's just invisible. (In the screenshot above, if you look hard you can actually see the menu items, barely; in the theme I was actually using, which got uninstalled while I flailed around trying to fix the problem, they were much less visible.)

Read more ...

Tags:
[ 18:46 Jul 19, 2021    More tech/web | permalink to this entry | ]

Sat, 08 Aug 2020

U is for Unreliable UI (or: Why Firefox's "Do this automatically this from now on" checkbox is so flaky, and how to work around it)

It's been a frustration with Firefox for years. You click on a link and get the "What should Firefox do with this file?" dialog, even though it's a file type you view all the time -- PDF, say, or JPEG. You click "View in browser" or "Save file" or whatever ... then you check the "Do this automatically for files like this from now on" checkbox, thinking, I'm sure I checked this last time.

Then a few minutes later, you go to a file of the exact same time, and you get the dialog again. That damn checkbox is like the button on street crossings or elevators: a no-op to make you think you're doing something.

I never tried to get to the bottom of why this happens with some PDFs and not others, some JPGs but not others. But Los Alamos puts their government meetings on a site called Legistar. Legistar does everything as PDF -- and those PDFs all trigger this Firefox bug, prompting for a download rather than displaying in Firefox's PDF viewer.

Read more ...

Tags: , ,
[ 16:38 Aug 08, 2020    More tech/web | permalink to this entry | ]

Sat, 08 Feb 2020

Displaying Quotes on a Kiosk -- and Javascript Memory Leaks

The LWV had a 100th anniversary celebration earlier this week. In New Mexico, that included a big celebration at the Roundhouse. One of our members has collected a series of fun facts that she calls "100-Year Minutes". You can see them at lwvnm.org. She asked me if it would be possible to have them displayed somehow during our display at the Roundhouse.

Of course! I said. "Easy, no problem!" I said.

Famous last words.

There are two parts: first, display randomly (or sequentially) chosen quotes with large text in a fullscreen window. Second, set up a computer (the obvious choice is a Raspberry Pi) run the kiosk automatically. This article only covers the first part; I'll write about the Raspberry Pi setup separately.

A Simple Plaintext Kiosk Python Script

When I said "easy" and "no problem", I was imagining writing a little Python program: get text, scale it to the screen, loop. I figured the only hard part would be the scaling. the quotes aren't all the same length, but I want them to be easy to read, so I wanted each quote displayed in the largest font that would let the quote fill the screen.

Indeed, for plaintext it was easy. Using GTK3 in Python, first you set up a PangoCairo layout (Cairo is the way you draw in GTK3, Pango is the font/text rendering library, and a layout is Pango's term for a bunch of text to be rendered). Start with a really big font size, ask PangoCairo how large the layout would render, and if it's so big that it doesn't fit in the available space, reduce the font size and try again. It's not super elegant, but it's easy and it's fast enough. It only took an hour or two for a working script, which you can see at quotekiosk.py.

But some of the quotes had minor HTML formatting. GtkWebkit was orphaned several years ago and was never available for Python 3; the only Python 3 option I know of for displaying HTML is Qt5's QtWebEngine, which is essentially a fully functioning browser window.

Which meant that it seeming made more sense to write the whole kiosk as a web page, with the resizing code in JavaScript. I say "seemingly"; it didn't turn out that way.

JavaScript: Resizing Text to Fit Available Space

The hard part about using JavaScript was the text resizing, since I couldn't use my PangoCairo resizing code.

Much web searching found lots of solutions that resize a single line to fit the width of the screen, plus a lot of hand-waving suggestions that didn't work. I finally found a working solution in a StackOverflow thread: Fit text perfectly inside a div (height and width) without affecting the size of the div. The only one of the three solutions there that actually worked was the jQuery one. It basically does the same thing my original Python script did: check element.scrollHeight and if it overflows, reduce the font size and try again.

I used the jquery version for a little while, but eventually rewrote it to pure javascript so I wouldn't have to keep copying jquery-min.js around.

JS Timers on Slow Machines

There are two types of timers in Javascript: setTimeout, which schedules something to run once N seconds from now, and setInterval, which schedules something to run repeatedly every N seconds. At first I thought I wanted setInterval, since I want the kiosk to keep running, changing its quote every so often.

I coded that, and it worked okay on my laptop, but failed miserably on the Raspberry Pi Zero W. The Pi, even with a lightweight browser like gpreso (let alone chromium), takes so long to load a page and go through the resize-and-check-height loop that by the time it has finally displayed, it's about ready for the timer to fire again. And because it takes longer to scale a big quote than a small one, the longest quotes give you the shortest time to read them.

So I switched to setTimeout instead. Choose a quote (since JavaScript makes it hard to read local files, I used Python to read all the quotes in, turn them into a JSON list and write them out to a file that I included in my JavaScript code), set the text color to the background color so you can't see all the hacky resizing, run the resize loop, set the color back to the foreground color, and only then call setTimeout again:

function newquote() {
    // ... resizing and other slow stuff here

    setTimeout(newquote, 30000);
}

// Display the first page:
newquote();

That worked much better on the Raspberry Pi Zero W, so I added code to resize images in a similar fashion, and added some fancy CSS fade effects that it turned out the Pi was too slow to run, but it looks nice on a modern x86 machine. The full working kiosk code is quotekioska>).

Memory Leaks in JavaScript's innerHTML

I ran it for several hours on my development machine and it looked great. But when I copied it to the Pi, even after I turned off the fades (which looked jerky and terrible on the slow processor), it only ran for ten or fifteen minutes, then crashed. Every time. I tried it in several browsers, but they all crashed after running a while.

The obvious culprit, since it ran fine for a while then crashed, was a memory leak. The next step was to make a minimal test case.

I'm using innerHTML to change the kiosk content, because it's the only way I know of to parse and insert a snippet of HTML that may or may not contain paragraphs and other nodes. This little test page was enough to show the effect:

<h1>innerHTML Leak</h1>

<p id="thecontent">
</p>

<script type="text/javascript">
var i = 0;
function changeContent() {
    var s = "Now we're at number " + i;
    document.getElementById("thecontent").innerHTML = s;
    i += 1;

    setTimeout(changeContent, 2000);
}

changeContent();
</script>

Chromium has a nice performance recording tool that can show you memory leaks. (Firefox doesn't seem to have an equivalent, alas.)

[Chrome performance graph showing innerHTML node leak] To test a leak, go to More Tools > Developer Tools and choose the Performance tab. Load your test page, then click the Record button. Run it for a while, like a couple of minutes, then stop it and you'll see a graph like this (click on the image for a full-size version).

Both the green line, Nodes, and the blue line, JS Heap, are going up. But if you run it for longer, say, ten minutes, the garbage collector eventually runs and the JS Heap line drops back down. The Nodes line never does: the node count just continues going up and up and up no matter how long you run it.

So it looks like that's the culprit: setting innerHTML adds a new node (or several) each time you call it, and those nodes are never garbage collected. No wonder it couldn't run for long on the poor Raspberry Pi Zero with 512Gb RAM (the Pi 3 with 1Gb didn't fare much better).

It's weird that all browsers would have the same memory leak; maybe something about the definition of innerHTML causes it. I'm not enough of a Javascript expert to know, and the experts I was able to find didn't seem to know anything about either why it happened, or how to work around it.

Python html2text

So I gave up on JavaScript and went back to my original Python text kiosk program. After reading in an HTML snippet, I used the Python html2text module to convert the snippet to text, then displayed it. I added image resizing using GdkPixbuf and I was good to go.

quotekiosk.py ran just fine throughout the centennial party, and no one complained about the formatting not being fancy enough. A happy ending, complete with cake and lemonade. But I'm still curious about that JavaScript leak, and whether there's a way to work around it. Anybody know?

Tags: , , ,
[ 18:48 Feb 08, 2020    More tech/web | permalink to this entry | ]

Sat, 01 Feb 2020

Migrate a sqlite3 Flask App to Postgresql

The New Mexico legislature is in session again, which means the New Mexico Bill Tracker I wrote last year is back in season. But I guess the word has gotten out, because this year, I started seeing a few database errors. Specifically, "sqlite3.OperationalError: database is locked".

It turns out that even read queries on an sqlite3 database inside flask and sqlalchemy can sometimes keep the database open indefinitely. Consider something like:

    userbills = user.get_bills()    # this does a read query

    # Do some slow operations that don't involve the database at all
    for bill in userbills:
        slow_update_involving_web_scraping(bill)

    # Now bills are all updated; add and commit them.
    # Here's where the write operations start.
    for bill in userbills:
        db.session.add(bill)
    db.session.commit()

I knew better than to open a write query that might keep the database open during all those long running operations. But apparently, when using sqlite3, even the initial query of the database to get the user's bill list opens the database and keeps it open ... until when? Can you close it manually, then reopen it when you're ready? Does it help to call db.session.commit() after the read query? No one seems to know, and it's not obvious how to test to find out.

I've suspected for a long time that sqlite was only a temporary solution. While developing the billtracker, I hit quite a few difficulties where the answer turned out to be "well, this would be easy in a real database, but sqlite doesn't support that". I figured I'd eventually migrate to postgresql. But I'm such a database newbie that I'd been putting it off.

And rightly so. It turns out that migrating an existing database from sqlite3 to postgresql isn't something that gets written about much; I really couldn't find any guides on it. Apparently everybody but me just chooses the right database to begin with? Anyway, here are the steps on Debian. Obviously, install postgresql first.

Create a User and a Database

Postgresql has its own notion of users, which you need to create. At least on Debian, the default is that if you create a postgres user named martha, then the Linux user martha on the same machine can access databases that the postgres user martha has access to. This is controlled by the "peer" auth method, which you can read about in the postgresql documentation on pg_hba.conf.

First su to the postgres Linux user and run psql:

$ sudo su - postgres
$ psql

Inside psql, create a postgresql user with the same name as your flask user, and create a database for that user:

CREATE USER myflaskuser WITH PASSWORD 'password';
ALTER ROLE myflaskuser SET client_encoding TO 'utf8';
ALTER ROLE myflaskuser SET default_transaction_isolation TO 'read committed';
ALTER ROLE myflaskuser SET timezone TO 'UTC';

CREATE DATABASE dbname;
GRANT ALL PRIVILEGES ON DATABASE dbname TO myflaskuser;

If you like, you can also create a user yourusername and give it access to the same database, to make debugging easier.

With the database created, the next step is to migrate the old data from the sqlite database.

pgloader (if you have a very recent pgloader)

Using sqlalchemy in my flask app meant that I could use flask db upgrade to create the database schema in any database I chose. It does a lovely job of creating an empty database. Unfortunately, that's no help if you already have an existing database full of user accounts.

Some people suggested exporting data in either SQL or CSV format, then importing it into postgresql. Bad idea. There are many incompatibilities between the two databases: identifiers that work in sqlite but not in postgresql (like "user", which is a reserved word in postgres but a common table name in flask-based apps), capitalization of column names, incompatible date formats, and probably many more.

A program called pgloader takes care of many (but not all) of the incompatibilities. Create a file -- I'll call it migrate.pgloader -- like this:

load database
    from 'latest-sqlite-file.db'
    into postgresql:///new_db_name

with include drop, quote identifiers, create tables, create indexes, reset sequences

set work_mem to '16MB', maintenance_work_mem to '512 MB';

Then, from a Linux user who has access to the database (e.g. the myflaskuser you created earlier), run pgloader migrate.pgloader.

That worked nicely on my Ubuntu 19.10 desktop, which has pgloader 3.6.1. It failed utterly on the server, which is running Debian stable and pgloader 3.3.2. Building the latest pgloader from source didn't work on Debian either; it's based on Common Lisp, and the older CL on Debian dumped me into some kind of assembly debugger when I tried to build pgloader. Rather than build CL from source too, I looked for another option.

On an Older OS: Use pgloader Remotely

Postgresql can take commands from remote machines. So you can configure postgresql to accept remote connections, then run the migration from a machine with a new enough pgloader version.

There are two files to edit. The location of postgresql's configuration directory varies with version, so do a locate pg_hba.conf to find it. In that directory, first edit pg_hba.conf and add these lines to the end to allow net socket connections from IP4 and IP6:

host  all     all   0.0.0.0/0     md5
host  all     all   ::/0          md5

In the same directory, edit postgresql.conf and search for listen_addr. Comment out the localhost line if it's uncommented, and add this to allow connections from anywhere, not just localhost:

listen_addresses = '*'

Then restart the database with

service postgresql restart

Modify the migrate.pgloader file from the previous section so the "into" line looks like

    into postgresql://username:password@host/dbname
The username there is the postgres username, if you made that different from the Unix username. You need to use a password because postgres is no longer using peer auth (see that postgres documentation file I linked earlier).

Assuming this You're done with the remote connection part. If you don't need remote database connections for your app, you can now edit postgresql.conf, comment out that listen_addresses = '*' line, and restart the database again with service postgresql restart. Don't remove the two lines you added in pg_hba.conf; flask apparently needs them.

You're ready for the migration. Make sure you have the latest copy of the server's sqlite database, then, from your desktop, run:

pgloader migrate.pgloader

Migrate Autoincrements to Sequences

But that's not enough. If you're using any integer primary keys that autoincrement -- a pretty common database model -- postgresql doesn't understand that. Instead, it has sequence objects. You need to define a sequence, tie it to a table, and tell postgresql that when it adds a new object to the table, the default value of id is the maximum number in the corresponding sequence. Here's how to do that for the table named "user":

CREATE SEQUENCE user_id_seq OWNED by "user".id;
ALTER TABLE "user" ALTER COLUMN id SET default nextval('user_id_seq');
SELECT setval(pg_get_serial_sequence('user', 'id'), coalesce(max(id)+1,1), false) FROM "user";

Note the quotes around "user" because otherwise user is a postgresql reserved word. Repeat these three lines for every table in your database, except that you don't need the quotes around any table name except user.

Incidentally, I've been told that using autoincrement/sequence primary keys isn't best practice, because it can be a bottleneck if lots of different objects are being created at once. I used it because all the models I was following when I started with flask worked that way, but eventually I plan try to switch to using some other unique primary key.

Update: Turns out there was another problem with the sequences, and it was pretty annoying. I ended up with a bunch of indices with names like "idx_15517_ix_user_email" when they should have been "ix_user_email". The database superficially worked fine, but it havoc ensues if you ever need to do a flask/sqlalchemy/alembic migration, since sqlalchemy doesn't know anything about those indices with the funny numeric names. It's apparently possible to rename indices in postgresql, but it's a tricky operation that has to be done by hand for each index.

Now the database should be ready to test.

Test

Your flask app probably has something like this in config.py:

    SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL') or \
        'sqlite:///' + os.path.join(basedir, 'dbname.db')

If so, you can export DATABSE_URL=postgresql:///dbname and then test it as you usually would. If you normally test on a local machine and not on the server, remember you can tell flask's test server to accept connections from remote machines with flask run --host=0.0.0.0

Database Backups

You're backing up your database, right? That's easier in sqlite where you can just copy the db file.

From the command line, you can back up a postgresql database with: pg_dump dbname > dbname-backup.pg You can do that from Python in a subprocess:

    with open(backup_file, 'w') as fp:
        subprocess.call(["pg_dump", dbname], stdout=fp)

Verify You're Using The New Database

I had some problems with that DATABASE_URL setting; I'd never used it so I didn't realize that it wasn't in the right place and didn't actually work. So I ran through my migration steps, changed DATABASE_URL, thought I was done, and realized later that the app was still running off sqlite3.

It's better to know for sure what your app is running. For instance, you can add a route to routes.py that prints details like that.

You can print app.config["SQLALCHEMY_DATABASE_URI"]. That's enough in theory, but I wanted to know for sure. Turns out str(db.session.get_bind()) will print the connection the flask app's database is actually using. So I added a route that prints both, plus some other information about the running app.

Whew! I was a bit surprised that migrating was as tricky as it was, and that there wasn't more documentation for it. Happy migrations, everyone.

Tags: , , , , ,
[ 12:34 Feb 01, 2020    More tech/web | permalink to this entry | ]