[llvm-dev] [cfe-dev] Buildbot Noise

Fri Oct 16 11:10:27 PDT 2015

> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Friday, October 16, 2015 7:29 AM
> To: Robinson, Paul
> Cc: David Blaikie; LLVM Dev; Galina Kistanova
> Subject: Re: [cfe-dev] Buildbot Noise
> 
> On 16 October 2015 at 15:17, Robinson, Paul
> <Paul_Robinson at playstation.sony.com> wrote:
> > But if
> > there are new fails, the blame mailer can do a set-difference and report
> > only the new ones. That would reduce the noise a bit, hmm?
> 
> Hi Paul,
> 
> The danger there is that it'd be easier to "get used" to having some
> failures as long as you don't have "new" failures. Every place I
> worked that supported that philosophy, ended up with all bots
> "orange". It's never the intention, but it's almost always the
> inevitable consequence. In a small team, or a single company, it may
> be a lot easier to move them back to green, but in an open community,
> it's not that easy, nor that quick.
> 
> The way we work with the same concept, as David mentioned repeatedly,
> is to use XFAILs. It is essentially the same thing, except that "it
> hurts more" to mark an XFAIL than to see a different shade of red, so
> we're more reluctant to ignore them.

Hmmm, which means manually telling the bot not to worry about it any more,
rather than having the bot figure it out automatically.
But from "outside" the bot now looks green, which is less embarrassing
to the perpetrator, if embarrassment is part of the social process of
getting people to fix things.  That part is cultural.
And an XFAIL for "test will never work in this configuration" is not
obviously different from "somebody broke this test and ought to fix it"
from the outside either; on my x86_64 Linux I get 148 XFAILs. Which
ones are "will never work" and which are "ought to get fixed soon"?

For the people who now get impatient with red bots, there is no barrier
to having them be impatient with orange bots. And for the people who
complain about irrelevant noise, there would be less of it, which will
make them less irritating to the rest of us. :-)

> 
> Plus, an orange bot that becomes red (new failures) will itself become
> orange as time passes, or new failures show up. If we end up with that
> many shades of red, understanding the difference will become harder,
> and the value will decrease.

My internal dashboard has four status values that would be meaningful
upstream: pass, infra-failure, fail, no-new-fail. Buildbot seems to have the
first three already; one more does not seem like an excessive conceptual burden.
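
To make the set-difference idea concrete, here is a rough sketch in Python.
It is not how Buildbot or my dashboard is implemented. The names (Status,
new_failures, classify, infra_ok) are made up for illustration, and it
assumes each build can hand you the set of failing test names.

```python
from enum import Enum


class Status(Enum):
    # The four values mentioned above; the enum itself is hypothetical.
    PASS = "pass"
    INFRA_FAILURE = "infra-failure"
    FAIL = "fail"
    NO_NEW_FAIL = "no-new-fail"


def new_failures(current, previous):
    """Return the tests failing now that were not failing in the previous build."""
    return set(current) - set(previous)


def classify(current, previous, infra_ok=True):
    """Map one build onto a status, given current and previous failing-test sets."""
    if not infra_ok:
        # The build never produced trustworthy test results.
        return Status.INFRA_FAILURE
    if not current:
        return Status.PASS
    if new_failures(current, previous):
        # At least one failure is new, so somebody should get blame mail.
        return Status.FAIL
    # Failures exist, but all of them were already failing last time.
    return Status.NO_NEW_FAIL
```

With something like this, a blame mailer would only need to send mail when
classify() returns Status.FAIL, and could list just new_failures(...) in
the message body.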
--paulr

> 
> cheers,
> --renato