[llvm-dev] buildbot failure in LLVM on clang-cmake-thumbv7-a15-full-sh

Renato Golin via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 29 10:56:18 PDT 2015


On 29 September 2015 at 18:41, David Blaikie <dblaikie at gmail.com> wrote:
> Is it? While it's failing, the buildbot doesn't seem to be any use to the
> community at large - it's essentially the buildbot owner's problem at that
> point and probably shouldn't be engaging with the community until it's green
> again, I think?

The bot is still useful: it shows whether new bugs have appeared since
the initial problem, and it can help bisect any further problems when
they come up. If we disable the bot, then by the time we fix the issue
and bring it back there could be a number of new failures that we
didn't monitor, and those will take a few more days or weeks to
remove, especially if they're cumulative. That way, it's likely that
we'll never have that bot online again. This is bad for the community.


> Is the buildbot useful to you during this time? Or are you debugging
> elsewhere/privately?

Both. As I described above, this bot is useful not just to me but to
the community, as people can cross-check whether their commits
introduced bugs on all ARM bots, not just one, and the slow bot will
show that. I'm also investigating elsewhere, because if I turn this
bot off, what I described above will happen. I'm not alone in
investigating this either: Saleem is helping me.


> If the buildbot is useful to you, but not the community at large - perhaps
> we could get in the habit of moving it into a "no email" pool whenever a
> failure occurs, until it can be cleared up. (hopefully this pool is clearly
> distinguished from the rest of the buildbots in the waterfall/grid view -
> because it'd be helpful to be able to look at an easily distinguished subset
> of the waterfall/grid and see the bots that are expected to be green for any
> developer there)

Moving a bot means restarting the buildmaster, which means stopping
all current builds and upsetting all other bots. If we start taking
the stance of moving things up and down the priority list, we'll have
more unstable buildbots, and that's worse for the community. Our
agreement, at least as I understood it, was that we should take
unstable bots offline if they've been broken for a while AND no one
is trying to fix them or can fix them. "A while" is vague because it
depends on the hardware, and I'm definitely trying to fix this one.

Slow hardware doesn't mean the bot has no value to the community,
unless you're arguing that we shouldn't test ARM at all, which is a
whole different story.

A bot that doesn't email about new failures when it was green is
probably useless, so I wouldn't want to have any bots in such a pool.
I already have a separate buildmaster, which doesn't email, where I
test my prototypes, but those are work in progress, while my
production bots are not.

A neater solution would be to not send email for *any* buildbot that
moves from exception to failure if its previous non-exceptional status
was also failure. This way, we won't get the kind of email that upset
you, but we still keep the value that a red bot provides.
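
To make the rule concrete, here is a rough sketch of the decision in
Python. It is not buildmaster code: the status strings, the function
names and the baseline assumption that a failure right after another
failure sends no mail are all mine, for illustration only.

```python
from typing import Optional, Sequence

# Hypothetical status labels, not taken from any buildmaster API.
SUCCESS = "success"
FAILURE = "failure"
EXCEPTION = "exception"


def last_non_exception(history: Sequence[str]) -> Optional[str]:
    """Return the most recent status in history that is not an exception."""
    for status in reversed(history):
        if status != EXCEPTION:
            return status
    return None


def should_email(history: Sequence[str], current: str) -> bool:
    """Decide whether a finished build should trigger a failure email.

    history holds the earlier statuses of the same bot, oldest first.
    """
    if current != FAILURE:
        return False  # only failures send mail in this sketch
    if not history:
        return True  # the first recorded build failed
    previous = history[-1]
    if previous == FAILURE:
        return False  # assumed baseline: the bot was already red
    if previous == EXCEPTION and last_non_exception(history) == FAILURE:
        return False  # proposed rule: failure -> exception -> failure
    return True
```

With this, a bot whose history is failure, exception and which then
fails again sends nothing, while success, exception followed by a
failure still sends mail.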

cheers,
--renato

