[cfe-dev] Can we remove llvmbb from IRC?

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Wed Sep 2 06:27:53 PDT 2020


On Tue, Sep 1, 2020 at 9:13 PM David Blaikie <dblaikie at gmail.com> wrote:

> I assume you're getting emails in addition to the chat spam? Or are you
> not, i.e. are these bots sending chat spam but not email? If that's the case,
> yeah, I'd rather have a consistent notification experience - and disable
> all notifications from a bot if some notifications are disabled (eg: if
> it's not good enough to be sending email, then it shouldn't be spamming the
> IRC channel either)
>

I received a single email for the greendragon bot. The rest was IRC only.
(The greendragon bot didn't send an IRC ping I think.)


>
> On Tue, Sep 1, 2020 at 1:20 PM Nico Weber <thakis at chromium.org> wrote:
>
>> On Tue, Sep 1, 2020 at 3:57 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Sep 1, 2020 at 12:42 PM Nico Weber <thakis at chromium.org> wrote:
>>>
>>>> On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <dblaikie at gmail.com>
>>>> wrote:
>>>>
>>>>> On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <
>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> llvmbb's job is to inform people of build breaks. However, it seems
>>>>>> to trigger for a big list of bots, and at least one of them seems to always
>>>>>> be broken,
>>>>>>
>>>>>
>>>>> If a bot is always broken it shouldn't be sending email/notifications
>>>>> - generally they are configured only to send email on green>red and
>>>>> red>green transitions, so if it's already broken you shouldn't be blamed
>>>>> for it. If you are seeing bot spam or emails from a bot that's already red,
>>>>> please email llvm-dev and the bot maintainer and ask the bot to be
>>>>> reconfigured or disabled.
>>>>>
>>>>> If a bot is regularly flakey (& thus sending email/notifications that
>>>>> are false-positives/that no one can act on) please also send email asking
>>>>> for the bot to be reconfigured or disabled. (or, if you want to be a bit
>>>>> more punchy - send a patch to the zorg repository to have the bot disabled
>>>>> & explain why you're proposing that)
>>>>>
>>>>
>>>> I agree with this in the abstract, but I get pinged completely reliably
>>>> at least twice after every single one of my commits. This isn't something that
>>>> sometimes happens, it's something that always happens.
>>>>
>>>
>>> Could you point to specific buildbots/email when that comes up to help
>>> improve things both on IRC and email/mailing lists, etc?
>>>
>>
>> Just land a change :) Or look at IRC scrollback. Given how easy it is to
>> find these problems, it doesn't seem like there's a lot of appetite for
>> improving this.
>>
>
> I think there's appetite for changing it in some way - no one enjoys the
> current state of things. But often people assume it's not changeable,
> whereas I think it is - and I think it's important that it be changed
> because if we silence all the bots, then quality is likely to go down.
> Silencing the IRC bot may still be good - folks should be getting buildbot
> fail email which is more targeted and not spamming the channel for people
> who aren't to blame (heck, the bots could send private messages instead, I
> guess?).
>
> But improving signal/noise should benefit the email, and the bot spam
> (whichever channel it's in).
>
>
>> Hence me asking about removing llvmbb (...and so far everyone seems to be
>> in favor).
>>
>
>> In this case, from my IRC scrollback (there's more people on the
>> blamelist, spread over several follow-on IRC messages):
>>
>> build #13975 of clang-ppc64le-linux-multistage is complete: Failure
>> [failed ninja check 1]  Build details are at
>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975
>>  blamelist: LLVM GN Syncbot <llvmgnsyncbot at gmail.com>, Nico Weber <
>> thakis at chromium.org>
>>
>
> That doesn't look like the "always be broken" case. It was green on the
> build prior to this one (
> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974
>  )
>
> Looks like the buildbot triggered correctly, only took the 2 revisions you
> committed. The test did pass at the prior revision and did fail at that
> revision - perhaps either the buildbot or the test is flakey?
> (interestingly the test failed in stage 1 at 13975, then failed in stage 2
> at 13976 - then passed again in 13977. Both failures for the same reason
> "/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/target-override.c.script:
> line 5:
> /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/testbin/i386-clang:
> No such file or directory" - perhaps some problem with creating the symlink?)
>
> Started an llvm-dev thread to discuss that separately in more detail.
>
>
>> build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed
>> test-stage1-compiler]  Build details are at
>> http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132
>>  blamelist: Nico Weber <thakis at chromium.org>, Matt Arsenault <
>> Matthew.Arsenault at amd.com>, Eric Astor <epastor at google.com>, Craig
>> Topper <craig.topper at intel.com>, Alina
>>
>
> Also green on the prior build (
> http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24131
>  ).
> Went green again after a revert here:
> http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24140 which
> matches the commit that made the bot go red - so this looks to be a bot
> doing what it's meant to do. (varying levels of quality, and 2 hour cycle
> time isn't ideal by any means, though it found this failure in 5 minutes
> once it started (but that could be 2 hours after a commit))
>
> What do you think we should do with bots like this? Should long cycle
> time/long blame list bots (not always the same thing) produce no
> notifications, and require them to be triaged by the bot owner who then
> manually sends email/follow-up once a rough guess of blame has been made &
> checked that it hasn't already been possibly diagnosed, discussed and fixed
> due to a faster bot or other means?
>

My personal opinion is that bots that take more than an hour to cycle
shouldn't send any notifications.
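
A minimal sketch of the notification policy discussed in this thread, for illustration only: it is not how llvmbb or the buildbot master is actually configured. It combines two ideas from above: notify only on green>red and red>green transitions, and stay silent for bots whose cycle time exceeds an hour. The `Build` type, the `should_notify` function and the `MAX_CYCLE` constant are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Assumption: the one-hour limit is the figure floated in this thread,
# not an existing LLVM buildbot setting.
MAX_CYCLE = timedelta(hours=1)


@dataclass
class Build:
    builder: str           # builder name (hypothetical field)
    passed: bool           # True for a green build, False for a red one
    cycle_time: timedelta  # how long one full build cycle takes


def should_notify(previous: Optional[Build], current: Build) -> bool:
    """Return True only for a status transition on a fast-cycling bot."""
    if current.cycle_time > MAX_CYCLE:
        # Slow bots stay silent; their owner would triage failures by hand.
        return False
    if previous is None:
        # No history yet: report only an initial failure.
        return not current.passed
    # green>red or red>green; a bot that is already red stays quiet.
    return previous.passed != current.passed
```

With this rule, a bot that is already red produces no further pings, and a two-hour bot produces none at all, whatever its result.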


>
>
>> build #2255 of lld-x86_64-win is complete: Failure [failed
>> test-check-all]  Build details are at
>> http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255  blamelist:
>> LLVM GN Syncbot <llvmgnsyncbot at gmail.com>, Eric Astor <epastor at google.com>,
>> Craig Topper <craig.topper at intel.com>, Alina Sbirlea <asbirlea at google.com>,
>> Nico Weber <thakis at chromium.org>, Amara
>>
>
> Also green on the prior build (
> http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2254 ), and went
> back to green on the following build.
> Possibly this was related to the same commit/revert as in the previous bot
> in this list. It's a fairly fast bot, went red on a build including the
> revision that committed the xor issue, and green on the next build that
> included a revert of that patch. I couldn't say for sure, though.
>
>> I also got email with pointers to:
>>
>> http://green.lab.llvm.org/green//job/clang-stage1-RA/14180/consoleFull#-1417328700a1ca8a51-895e-46c6-af87-ce24fa4cd561
>>
>
> Was red for a few builds then green again here:
> http://green.lab.llvm.org/green/job/clang-stage1-RA/14183/
>
> Looks like the build that went red and the build that went green (& the
> fact that the failure was related to libfuzzer) correlate well with this
> commit:
> https://github.com/llvm/llvm-project/commit/2665425908e00618074e42155ec922a37f7c9002 and
> this revert:
> https://github.com/llvm/llvm-project/commit/7139736261e047e9cca030e2ee5912bf2a16f816
>
>
>> Chances are that there's something genuinely broken somewhere (maybe
>> compiler-rt?), but asking for concrete bots distracts from the point that
>> there's something broken on every single commit, which makes the bot just
>> let you know that you committed something in the last few hours.
>>
>
> They also contain information about failures - yeah, they might not be
> yours, but they are often/usually someone's, not just flakey bot failures.
> If you're suggesting all the bots are unactionable - then perhaps we should
> turn off all notifications on all of them? I have certainly considered that
> - and then only enabling bots that are fast/high signal-to-noise/small
> blame list. Though I imagine that's a bigger discussion.
>
>
>>>>>> and the broken bots tend to have cycle times of several hours.
>>>>>>
>>>>>
>>>>> Long cycle times are a real problem - that might be best left to
>>>>> another discussion about buildbot maintenance - I would be for a policy
>>>>> that says bot windows shouldn't be longer than, say, an hour or maybe less.
>>>>> (so, eg: if you have a bot that's just going to take 5 hours to run - then
>>>>> you need 5 machines that each pick up work every hour, so the blame lists
>>>>> are smaller). This doesn't solve the problem of being notified 5 hours later
>>>>> about a breakage that was caused by someone else who committed a few
>>>>> minutes before or after you. Solving that problem will require a much
>>>>> greater investment in infrastructure to chain buildbots, possibly use built
>>>>> artefacts from one buildbot to another, etc.
>>>>>
>>>>>
>>>>>> So if you're on IRC and you commit something, you get pinged by
>>>>>> llvmbb for hours afterwards.
>>>>>>
>>>>>> Does anyone think llvmbb is useful?
>>>>>>
>>>>>
>>>>> I sometimes find it useful, but happy to move to llvm-build to get
>>>>> those notifications. Other folks might not know to do that, though.
>>>>>
>>>>>
>>>>>> The best thing I've heard about llvmbb is that it's easy to just "/ignore
>>>>>> llvmbb", but if that's what everybody does then why not not have it in the
>>>>>> first place?
>>>>>>
>>>>>> Nico
>>>>>>
>>>>>