Shallow Thoughts

Akkana's Musings on Open Source, Science, and Nature.

Tue, 29 May 2007

A Culture of Regressions (or, Why I no longer work on Mozilla)

A couple of friends periodically pester me to write about why I stopped contributing to Mozilla after so many years with the project. I've held back, feeling like it's airing dirty laundry in public.

But a discussion over the last week, started by Nelson Bolyard, aired it for me: it was their culture of regressions.

I love Mozilla technology. I'm glad it exists, and I still use it for my everyday browsing. But trying to contribute to Mozilla just got too frustrating. I spent more time chasing down and trying to fix other people's breakages than I did working on anything I wanted to work on.

That might be okay, barely, when you're getting paid for it. But when you're volunteering your own time, who wants to spend it fixing code checked in by some other programmer who just can't be bothered to clean up his own mess?

It's the difference between spending a day cleaning your own house ... and spending every day cleaning other people's houses.

Nelson said it eloquently in this exchange:

(Robert Kaiser writes)
As we are open source, everyone can access and test that code, and find and file the regressions, so that they get fixed over time.

(Boris Zbarsky writes)
That last conclusion doesn't necessarily follow. To get them fixed you need someone fixing them.

(Nelson Bolyard writes)
We're very unlikely to get volunteers to spend large amounts of effort, rewriting formerly working code to get it to work again, after it was broken by someone else's checkin. This demotivates developers and drives them away. They think "why should I keep working on this when others can break my code and I must pay for their mistakes?" and "I worked hard to get that working, and now person X has broken it. Let HIM fix it."

This was exactly how I felt, and it's the reason I quit working on Mozilla.

A little later in the thread, Boris Zbarsky reports that the trunk has been so broken with regressions that it's been unusable for him for weeks or months. (When you have someone as committed and sharp as Boris unable to use your software, you know there's something wrong with your project's culture.) He writes: "For example, on my machine (Linux) about one in three SVG testcases in Bugzilla causes trunk Gecko to hang X ..."

Justin Dolske replies, "Oh, Linux," and asks if it's related to turning on Cairo. Boris replies affirmatively. Just another example where a change was checked in that caused serious regressions keeping at least one important contributor from using the browser on a regular basis; yet it's still there and hasn't been backed out. Of course, it's "only Linux".

David Baron appears to take Nelson's concerns seriously, and suggests criteria for closing the tree and making everyone stop work to track down regressions. As he correctly comments, closing the tree is very serious and inefficient, and should be avoided in all but the most serious cases.

But Nelson repeats the real question:

(Nelson Bolyard writes)
Under what circumstances does a Sheriff back out a patch due to functional regressions? From what you wrote above, I gather it's "never". :(

Alas, the thread peters out after that; there's no reply to Nelson's question.

The problem with Mozilla isn't that there are regressions. Mistakes happen. The problem is that regressions never get fixed, because the project's culture encourages regressions. The prevailing attitude is that it's okay to check in changes that break other people's features, as long as your new feature is cool enough or the right people want it. If you break something, well, hey, someone will figure out a fix eventually. Or not. Either way, it's not your problem.

Working on new features is fun, and so is getting the credit for being the one to check them in. Fixing bugs, writing API documentation, extensive testing -- these things aren't fun, they're hard work, and there isn't much glory in them either (you don't get much appreciation or credit for it). So why do them if you don't have to? Let someone else worry about it, as long as the project lets you get away with it!

A project with a culture of responsibility would say that the person who broke something should fix it, and that broken stuff should stay out of the tree. If programmers don't do that themselves just because it's the right thing to do, the project could enforce it: just insist that regression-causing changes that can't be fixed right away be backed out. Fix the regressions out of the tree where they aren't causing problems for other people. Get help from people to test it and to integrate it with those other modules you forgot about the first time around.

Yes, even if it's a change that's needed -- even if it's something a lot of people want. If it's a good change, there will always be time to check it in later.

When it's really working.

[ 10:07 May 29, 2007 ]