[llvm-dev] Clarification on expectations of buildbot email notifications

Wed Feb 20 01:32:00 PST 2019

I think we could/should be a little bit more precise here:

> ... any regressions whether they affect buildbots or not, the
> patch author should be responsible for fixing the issue.

especially if we say that the bar for a revert is low. That is, "any regression" needs a bit more clarification. Assuming we are talking about performance regressions (not language conformance or correctness):

1) We sometimes see regressions where code generation is (almost) the same, but the code layout is different. Some micro-architectures are more sensitive to this than others, which can cause significant regressions. We always thought it was unfair to ask for a revert for this kind of regression, and thus have never asked for one.

2) We also sometimes see that patches that cause regressions actually do the right thing, but have all sorts of knock-on effects, e.g. causing different codegen and regressions. Sometimes this is just unlucky (e.g. regalloc making different decisions), but sometimes other passes handle the resulting IR or machine code less efficiently, and something actually needs to be fixed. But we also very rarely ask for a revert in these cases.

3) The obvious and straightforward case is when a patch is not doing the right thing or e.g. forgets certain cases. Usually what we do is leave a comment on the Phab ticket, and when the author responds fast and works on a fix we can live with the regression for a few days (but it looks like we could be a bit more aggressive with reverts if we wanted to).

The straightforward cases are 1) and 3), where the former is not worth a revert (but it would be good to be explicit about this), and 3) is definitely worth a revert.

2) is the tricky one, because it has a lot of grey areas. I guess the reason why we are not very aggressive with reverts is that we don't want to stop others from making progress, and we also thought that in some cases it was just our problem and not the author's. In the example of knock-on effects and some heuristic making a different/wrong decision, I thought it was unfair to the author to ask for a revert. A more aggressive revert policy here could easily lead to people making no progress, or progressing much more slowly.

Cheers,
Sjoerd.

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Reid Kleckner via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 19 February 2019 23:29
To: Zachary Turner
Cc: llvm-dev
Subject: Re: [llvm-dev] Clarification on expectations of buildbot email notifications

I don't think whether a buildbot sends email should have anything to do with whether we revert to green or not. Very often, developers commit patches that cause regressions not caught by our buildbots. If the regression is severe enough, then I think community members have the right, and perhaps responsibility, to revert the change that caused it. Our team maintains bots that build chrome with trunk versions of clang, and we identify many regressions this way and end up doing many reverts as a result. I think it's important to continue this practice so that we don't let multiple regressions pile up.

I think what's important, and what we should, after this discussion concludes, put in the developer policy, is that the person doing the revert has the responsibility to do their best to help the patch author reproduce the problem or at least understand the bug.

This can take many forms. They can link directly to an LLVM buildbot, which should be self-explanatory as far as reproduction goes. It can be an unreduced crash report. If they're nice, they can use CReduce to make it smaller. But a reverter can't just say "Revert rNNN, breaks $RANDOM_PROJECT on x86_64-linux-gnu". If they add "reduction forthcoming" and deliver on that promise, I think we should support that.

In other words, the bar to revert should be low, so we can do it fast and save downstream consumers time and effort. If someone isn't making a good faith effort to follow up after a revert, then authors have a right to push back.

I agree with Paul that we should remove the text about checking nightly builders. That suggestion seems a bit dated.

On Tue, Feb 19, 2019 at 11:22 AM Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi all,

Over the past year or so, all of us have broken the buildbots on many occasions. Usually we get notified on IRC, or via a buildbot email notification sent to everyone on the blamelist.
If I happen to be on IRC I'll see the notification, but if not, the next best thing is an email that was automatically sent to me (along with everyone else on the blamelist) from the buildbot with information about the failure.
And then finally, I'll occasionally get a response to my commit message telling me that it's broken, and the patch may be reverted with information in the commit message explaining which bot was broken and providing a link to it.

However, we have some buildbots on the public waterfall which are specifically configured not to send emails to people.  In some cases it's because the bots are experimental, but there are a handful where the reasoning I've been given is that it "wastes peoples time and contributes to build blindness", but we are still expected to keep them green (usually by people manually reaching out to us when they fail, or patches getting reverted and us getting notified of the revert).

It is this last case that I'm concerned about, as it appears to be in direct conflict with our own developer policy [https://llvm.org/docs/DeveloperPolicy.html#id14], which states this
-----
We prefer for this to be handled before submission but understand that it isn’t possible to test all of this for every submission. Our build bots and nightly testing infrastructure normally finds these problems. A good rule of thumb is to check the nightly testers for regressions the day after your change. Build bots will directly email you if a group of commits that included yours caused a failure. You are expected to check the build bot messages to see if they are your fault and, if so, fix the breakage.

Commits that violate these quality standards (e.g. are very broken) may be reverted. This is necessary when the change blocks other developers from making progress. The developer is welcome to re-commit the change after the problem has been fixed.

-----

I'm sending this email to get a sense of the community's views on this matter.  If I'm correctly reading between the lines in the above passage, buildbots which do not send emails should not be subject to the revert-to-green policy.  To be honest, it's actually not even clear from reading the above passage where the burden of fixing a "broken" patch on a silent buildbot lies at all - with the patch author or with the bot maintainer.

Would anyone care to weigh in with an unbiased opinion here?
