[llvm-dev] Responsibilities of a buildbot owner

Sat Jan 8 12:06:40 PST 2022

Hey all,

I have a couple of questions about what the responsibilities of a buildbot owner are. I’ve been maintaining a couple of buildbots for lldb and mlir for some time now and I thought I had a pretty good idea of what is required based on the documentation here: How To Add Your Build Configuration To LLVM Buildbot Infrastructure <https://www.llvm.org/docs/HowToAddABuilder.html>

My understanding was that there are some things that are *expected* of the owner. Namely:

  1.  Make sure that the buildbot is connected and has the right infrastructure (e.g. the right version of Python, or tools, etc.). Update as needed.
  2.  Make sure that the build configuration is one that is supported (e.g. supported flavor or cmake variables). Update as needed.

There are also a couple of things that are *optional*, but nice to have:

  1.  If the buildbot stays red for a while (where “a while” is completely subjective), figure out the patch or patches that are causing an issue and either revert them or notify the authors, so they can take action.
  2.  If someone is having trouble investigating a failure that only happens on the buildbot (or the buildbot is a rare configuration), help them out (e.g. collect logs if possible).

Up to now, I’ve not had any issues with this. The community has been very good at fixing build and test issues when I point them out or, more often than not, without me having to do anything beyond the occasional test re-run and software update (for example, D114639, "Raise the minimum Visual Studio version to VS2019" <https://reviews.llvm.org/D114639>). lldb has some tests that are flaky because of the nature of the product, so there is some noise, but mostly things work well and everyone seems happy.

I’ve recently run into a situation that makes me wonder whether there are other expectations of a buildbot owner that are not explicitly listed in the llvm documentation. Someone reached out to me some time ago to say they were unhappy with the flakiness of some of the lldb tests, and demanded that I either fix them or disable them. I let them know that some tests are known to be flaky, that I did not consider it my responsibility to fix all such issues, and that the community would be very happy to have their contribution in the form of a fix or a change to disable the tests. I didn’t get a response from this person, but I did disable a couple of particularly flaky tests since it seemed like the nice thing to do.

The real excitement happened yesterday when I received an email that *the build bot had been turned off*. This same person reached out to the powers that be (without letting me know) and asked them explicitly to silence it *without my active involvement* because of the flakiness.

I have a couple of issues with this approach but perhaps I’ve misunderstood what my responsibilities are as the buildbot owner. I know it is frustrating to see a bot fail because of flaky tests and it is nice to have someone to ask to resolve them all – is that really the expectation of a buildbot owner? Where is the line between maintenance of the bot and fixing build and test issues for the community?

I’d like to understand what the general expectations are and if there are things missing from the documentation, I propose that we add them, so that it is clear for everyone what is required.

Thanks,
-Stella
