Shallow Thoughts

Akkana's Musings on Open Source, Science, and Nature.

Sat, 20 Oct 2007

Firefox, caching, and fast Back/Forward buttons

I remember a few years ago the Mozilla folks were making a lot of noise about the "blazingly fast Back/Forward" that was coming up in the (then) next version of Firefox. The idea was that the layout engine was going to remember how the page was laid out (technically, there would be a "frame cache" as opposed to the normal cache which only remembers the HTML of the page). So when you click the Back button, Firefox would remember everything it knew about that page -- it wouldn't have to parse the HTML again or figure out how to lay out all those tables and images, it would just instantly display what the page looked like last time.

Time passed ... and Back/Forward didn't get faster. In fact, they got a lot slower. The "Blazingly Fast Back" code did get checked in (here's how to enable it) but somehow it never seemed to make any difference.

The problem, it turns out, is that the landing of bug 101832 added code to respect a couple of HTTP Cache-Control header settings, no-store and no-cache. There's also a third cache control header, must-revalidate, which is similar (the difference among the three settings is fairly subtle, and Firefox seems to treat them pretty much the same way).

Translated, that means that web servers, when they send you a page, can send some information along with the page that asks the browser "Please don't keep a local copy of this page -- any time you want it again, go back to the web and get a new copy."

There are pages for which this makes sense. Consider a secure bank site. You log in, you do your banking, you view your balance and other details, you log out and go to lunch ... then someone else comes by and clicks Back on your browser and can now see all those bank pages you were just viewing. That's why banks like to set no-cache headers.

But those are secure pages (https, not http). There are probably reasons for some non-secure pages to use no-cache or no-store ... um ... I can't think of any offhand, but I'm sure there are some.

But for most pages it's just silly. If I click Back, why wouldn't I want to go back to the exact same page I was just looking at? Why would I want to wait for it to reload everything from the server?

The problem is that modern Content Management Systems (CMSes) almost always set one or more of these headers. Consider the Linux.conf.au site. Linx.conf.au is one of the most clueful, geeky conferences around. Yet the software running their site sets

  Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
  Pragma: no-cache
on every page. I'm sure this isn't intentional -- it makes no sense for a bunch of basically static pages showing information about a conference several months away. Drupal, the CMS used by LinuxChix sets Cache-Control: must-revalidate -- again, pointless. All it does is make you afraid to click on links because then if you want to go Back it'll take forever. (I asked some Drupal folks about this and they said it could be changed with drupal_set_header).

(By the way, you can check the http headers on any page with: wget -S -O /dev/null http://... or, if you have curl, curl --head http://...)

Here's an excellent summary of the options in an Opera developer's blog, explaining why the way Firefox handle caching is not only unfriendly to the user, but also wrong according to the specs. (Darn it, reading sensible articles like that make me wish I wasn't so deeply invested in Mozilla technology -- Opera cares so much more about the user experience.)

But, short of a switch to Opera, how could I fix it on my end? Google wasn't any help, but I figured that this must be a reported Mozilla bug, so I turned to Bugzilla and found quite a lot. Here's the scoop. First, the code to respect the cache settings (slowing down Back/Forward) was apparently added in response to bug 101832. People quickly noticed the performance problem, and filed 112564. (This was back in late 2001.) There was a long debate, but in the end, a fix was checked in which allowed no-cache http (non-secure) sites to cache and get a fast Back/Forward. This didn't help no-store and must-revalidate sites, which were still just as slow as ever.

Then a few months later, bug 135289 changed this code around quite a bit. I'm still getting my head around the code involved in the two bugs, but I think this update didn't change the basic rules governing which pages get revalidated.

(Warning: geekage alert for next two paragraphs. Use this fix at your own risk, etc.)

Unfortunately, it looks like the only way to fix this is in the C++ code. For folks not afraid of building Firefox, the code lives in nsDocShell::ShouldDiscardLayoutState and controls the no-cache and no-store directives. In nsDocShell::ShouldDiscardLayoutState (currently lie 8224, but don't count on it), the final line is:

    return (noStore || (noCache && securityInfo));
Change that to
    return ((noStore || noCache) && securityInfo);
and Back/Forward will get instantly faster, while still preserving security for https. (If you don't care about that security issue and want pages to cache no matter what, just replace the whole function with return PR_FALSE; )

The must-validate setting is handled in a completely different place, in nsHttpChannel. However, for some reason, fixing nsDocShell also fixes Drupal pages which set only must-validate. I don't quite understand why yet. More study required. (End geekage.)

Any Mozilla folks are welcome to tell me why I shouldn't be doing this, or if there's a better way (especially if it's possible in a way that would work from an extension or preference). I'd also be interested in from Drupal or other CMS folks defending why so many CMSes destroy the user experience like this. But please first read the Opera article referenced above, so that you understand why I and so many other users have complained about it. I'm happy to share any comments I receive (let me know if you want your comments to be public or not).

Tags: , , , ,
[ 19:32 Oct 20, 2007    More tech/web | permalink to this entry ]