U is for Unreliable UI (or: Why Firefox's "Do this automatically from now on" checkbox is so flaky, and how to work around it)
It's been a frustration with Firefox for years. You click on a link and get the "What should Firefox do with this file?" dialog, even though it's a file type you view all the time -- PDF, say, or JPEG. You click "View in browser" or "Save file" or whatever ... then you check the "Do this automatically for files like this from now on" checkbox, thinking, I'm sure I checked this last time.
Then a few minutes later, you go to a file of the exact same type, and you get the dialog again. That damn checkbox is like the button on street crossings or elevators: a no-op to make you think you're doing something.
I never tried to get to the bottom of why this happens with some PDFs and not others, some JPGs but not others. But Los Alamos puts their government meetings on a site called Legistar. Legistar does everything as PDF -- and those PDFs all trigger this Firefox bug, prompting for a download rather than displaying in Firefox's PDF viewer.
(Aside: another current project of mine is scraping those PDFs so I can have an RSS feed of public meetings, alerting me when new meeting agendas are posted. It's mostly working but not quite ready for public consumption yet.)
Anyway, because of Legistar, I finally got motivated to look into the problem, and since I now had an easily reproducible case, I filed a bug.
I got lucky. Bugzilla user :Gijs noticed the bug and promptly DUPed it to Bug 453455: Make "do this automatically for files like this from now on" work even with "content-disposition: attachment", with a clear explanation of the real problem (thanks!).
What is "content-disposition: attachment" for?
The bug was interesting reading. I was previously unfamiliar with content-disposition: attachment.
It was apparently invented for email (as was MIME itself). If someone sends you an email and attaches mycar.jpg, in the MIME attached to the email, they can specify something like
    Content-Disposition: attachment; filename=mycar.jpg
and then your mailer knows to use that filename when it saves the attachment. Useful.
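To make that concrete, here's a minimal Python sketch of how a program might read that header, using the standard library's email module. The message text is made up for illustration; only the Content-Disposition line comes from the example above.

```python
from email import message_from_string

# A made-up, header-only MIME part for illustration.
raw = (
    "Content-Type: image/jpeg\n"
    "Content-Disposition: attachment; filename=mycar.jpg\n"
    "\n"
)

msg = message_from_string(raw)

# "attachment" or "inline" (or None if the header is missing)
disposition = msg.get_content_disposition()

# The suggested filename from the header's filename= parameter
filename = msg.get_filename()

print(disposition, filename)
```

A mailer would use the second value as the default name when saving the attachment.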
The problem arises when it's used on a website.
Browsers use Content-Disposition: attachment as a declaration that a file should always be saved, never displayed inline, even if it's some format the browser can handle perfectly well, like PDF. Except Firefox modifies that a bit: you can sometimes view it or open it in an external viewer, only in that case, the "Do this every time" checkbox is ignored.
If you hit a link that's doing this to you, you can check the content-disposition of a URL with either of these:
    curl --head [URL]
    wget -S --spider [URL]
I don't know of a way to get that information within Firefox: you can't use "View Page Info" because you can't view the page at all, because of the content-disposition.
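If you'd rather script the check, here's a rough Python equivalent of the curl --head approach. It's only a sketch: the function name is illustrative, and it assumes the server answers a HEAD request with the same headers it would send for a normal GET, which isn't guaranteed.

```python
import urllib.request


def content_disposition(url):
    """Return the Content-Disposition header for url, or None if there is none."""
    # Send a HEAD request, like curl --head, so the file itself isn't downloaded.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Disposition")
```

Calling content_disposition() on one of the problem links should return something starting with "attachment" if this is what's triggering the download dialog.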
If you read through the interminable bug 453455 discussion, you'll see some hand-waving about security issues. What these security issues are is still not clear to me even after reading the whole discussion: if opening a PDF or JPG in Firefox is a security risk, then I'd hope that Firefox would fix that regardless of what content-disposition the website owner might have put on the file. And Firefox's dialog lets you view "attachment" PDFs in Firefox anyway; it just doesn't let you save that setting so Firefox will do the same thing next time you click on a PDF.
There are smart people commenting in the bug, and I'm sure that there's some subtle edge case, probably involving something like a .EXE on Windows, where you might not want to have people opening files automatically. But I don't see how that extends to refusing to let people view JPGs or PDFs in the browser, and reading the bug discussion didn't make it any clearer.
As far as I can tell, it's just a stupid bug that Mozilla has refused to fix for twelve years, and may never fix. They don't seem to be bothered that having a "remember this" checkbox that doesn't remember makes Firefox seem buggy and unreliable (not to mention frustrating to use). Decisions like that are part of why Firefox's user numbers keep plummeting.
Working Around the Bug: How Do You View Those Files?
There are several extensions intended to fix the content-disposition bug, like InlineDisposition Reloaded, Display inline, and InlineDisposition (WebExtensions). None of them have been security reviewed: they're probably fine, but there's no way to be sure without reading the code yourself. And I've gotten leery of installing too many Firefox extensions: Mozilla changes their extension API so often that extensions frequently become obsolete and need to be updated, and chasing new extensions can be a never-ending battle.
Privoxy
There's another option: one bug comment mentioned Privoxy, a highly configurable web proxy meant for enhancing privacy and working around bugs like this. It's sort of like the Pi-hole, except that it runs locally on your Linux machine rather than requiring that you set up a separate Raspberry Pi to act as your proxy. Just set Firefox to use Privoxy following the instructions from the Privoxy Quickstart.
Configuring Privoxy was a bit confusing -- they have a lot of documentation and examples, which is great, but the quickstart is a little too basic, and figuring anything else out requires a lot of reading. It took a while to figure out whether I needed a filter or an action. The Actions page had an example of a built-in action called hide-content-disposition that does exactly what I wanted -- but the documentation warned "This action will probably be removed in the future, use server-header filters instead" without giving a hint on how to do this with server-header filters.
I tried the old, deprecated method first anyway, which worked fine when I added it to /etc/privoxy/user.action:
    { -filter \
      +hide-content-disposition{block} }
    .legistar.com/View\.ashx
and then restarted privoxy with
    sudo systemctl restart privoxy.service
And voila, I could click on one of those PDFs and it opened in a new Firefox tab.
But I wanted to do it the right way, not the deprecated way, and there was no example for how to do that. I struggled with the documentation at first. It took me a while to figure out that filters are for defining rules, which won't be applied until you make an action that specifies a named filter plus a list of domain patterns.
Once I realized that, I could read default.filters and default.rules and the examples started making sense. It looked like there was already a default filter to do what I was after, called less-download-windows. But it turned out that although less-download-windows covers several file types, it doesn't include PDF.
So I had to write a new filter. Filter rules start with the type of filter, then have as many lines as you want to do substitutions in the headers sent from the server. I copied less-download-windows, changed the file type, and removed the second substitution that changed the filename since I didn't see a need for that part, leaving me with (this goes in user.filter):
    SERVER-HEADER-FILTER: pdf-less-download-windows Like less-download-windows but for PDF
    s@^Content-Disposition:.*filename=(["']?).*\.pdf\1.*$@@i
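To check what that substitution will and won't match before involving Privoxy, here's a small Python sketch that applies the same pattern to some sample headers. It assumes Python's re module treats this particular pattern the same way Privoxy's Perl-style substitution does, which should hold for something this simple. The sample headers and the function name are invented for illustration.

```python
import re

# Python rendering of the Privoxy substitution
#   s@^Content-Disposition:.*filename=(["']?).*\.pdf\1.*$@@i
# The trailing "i" becomes re.IGNORECASE; the empty replacement becomes "".
PDF_DISPOSITION = re.compile(
    r"""^Content-Disposition:.*filename=(["']?).*\.pdf\1.*$""",
    re.IGNORECASE,
)


def filter_header(header):
    """Blank out a Content-Disposition header that names a .pdf file."""
    return PDF_DISPOSITION.sub("", header)


# Invented sample headers: two PDFs, one non-PDF, one unrelated header.
samples = [
    'Content-Disposition: attachment; filename="agenda.pdf"',
    "content-disposition: attachment; filename=Agenda.PDF",
    'Content-Disposition: attachment; filename="minutes.zip"',
    "Content-Type: application/pdf",
]

for header in samples:
    print(repr(header), "->", repr(filter_header(header)))
```

The two PDF headers come out as empty strings, while the .zip header and the unrelated Content-Type header pass through unchanged. As far as I can tell, a header that a filter leaves empty is dropped from the response, which is why blanking it is enough to make the browser fall back to its normal handling.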
Then I needed an action to specify when to use that filter. Action rules start with a list of actions you want to apply, space separated inside curly braces: I only had one action, the pdf-less-download-windows server-header-filter I'd just defined. After that is just a list of domains where you want the rule to apply. There's no obvious begin or end to an action section: I guess anything that doesn't include an open curly brace is fair game. This went in user.action:
    { +server-header-filter{pdf-less-download-windows} }
    losalamos.legistar.com/View\.ashx
When debugging rules like this, I recommend editing /etc/privoxy/config and turning on debugging. The comments in that file tell you all the different cases you can debug, but I used these:
    debug     1 # Log the destination for each request Privoxy let through. See also debug 1024.
    debug  1024 # Log the destination for requests Privoxy didn't let through, and the reason why.
    debug  4096 # Startup banner and warnings
    debug  8192 # Non-fatal errors
Then you can tail -f /var/log/privoxy/logfile and see what actions are being applied. When you're happy with your rules, comment out the 8192 rule -- but you might want to leave the others in.
Snooping on Web Requests
Tailing the log file with debugging on was quite instructive, even aside from debugging my filter rules.
For instance, it's amazing how many URLs Firefox visits when I start it up, or when it's just sitting there seemingly doing nothing. It has a portal detection page to make sure there's a network when it starts up (a good thing), but why does it keep going back to detectportal.firefox.com/success.txt over and over when it worked the first time? What are snippets.cdn.mozilla.net:443/ and push.services.mozilla.com:443/? And why does it keep going to spocs.getpocket.com:443/ and getpocket.cdn.mozilla.net:443/ when I supposedly disabled Pocket (extensions.pocket.enabled is False in about:config, which is what Mozilla claims is how to disable Pocket)?
This article is already running long, so I won't go into the details now, but two useful references are Mozilla's How to stop Firefox from making automatic connections and an older, but still useful, article, Silencing Firefox’s Chattiness for Web App Testing.
Meanwhile, why do Google search pages keep periodically checking URLs like play.google.com:443/ whenever I have a Google search tab open, even when the tab isn't visible? Yet another reason to use DuckDuckGo.
It looks like Privoxy is going to show me a lot of interesting activity I didn't know about on my system. It's also able to block ads, social network trackers like Facebook, animated GIFs and lots of other annoyances. I'm already blocking some of those in Firefox, but Privoxy might be a better way. This was definitely an experiment worth running and I'm looking forward to seeing what else it can tell me.
[ 16:38 Aug 08, 2020 ]