[LLVMdev] Build bot fatigue

dblaikie at gmail.com
Sun Dec 29 08:45:04 PST 2013


On Saturday, December 28, 2013 6:05:38 PM, Alp Toker <alp at nuanti.com> wrote:

My inbox has been filled with llvm.buildmaster at lab.llvm.org build
failure notifications lately.

The two problems appear to be:

  1) Getting notifications for breakage that was introduced by an
unrelated commit, often in a module I don't work on. Usually the
original committer is working on or has already landed the necessary fix.

  2) A cascade of dozens of notifications from various build servers
that continue to flood in over the course of 24 hours after the issue
was fixed.

Together these two produce a low signal-to-noise ratio, and in
practice you have to filter the mails out, which means you no longer get a
ping on your phone when you need it.

Presumably a full fix is a non-trivial CI engineering problem, but are
there simple measures to get the situation back under control?

Doesn't have to be perfect as long as it reduces the dozens of mails
every day to something more manageable. Ideas:

  1) Only send direct mail when the recipient is the single name in the
blame list.

  2) Set an In-Reply-To header in order to thread all failure
notifications related to a specific SVN revision. Most email clients
will let you silence the thread once you've confirmed the issue has been
resolved.

  3) Or even simpler, don't send failure mail from any builders outside
the "fast" set? Otherwise the important failures blocking everyone's
work get drowned out in the noise.
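
As a rough illustration, the three ideas above might be sketched as
follows. This is not the buildmaster's actual configuration: the builder
names, the function names and the shape of the thread id are all
assumptions made up for the example, and only Python's standard email
module is used.

```python
from email.message import EmailMessage

# Hypothetical builder names; the real "fast" set is not listed in this thread.
FAST_BUILDERS = {"clang-fast", "llvm-fast"}


def should_mail(recipient, blamelist, builder):
    """Ideas 1 and 3: mail only a sole culprit, and only from fast builders."""
    return builder in FAST_BUILDERS and set(blamelist) == {recipient}


def thread_id(revision, domain="lab.llvm.org"):
    """Idea 2: one stable, Message-ID-style token per SVN revision (assumed format)."""
    return f"<r{revision}@{domain}>"


def failure_mail(recipient, builder, revision):
    """Build a failure notification that threads with others for the same revision."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"Build failure on {builder} at r{revision}"
    # Every failure for a given revision points at the same root id, so a
    # mail client can group them into one thread that can be silenced.
    msg["In-Reply-To"] = thread_id(revision)
    msg["References"] = thread_id(revision)
    msg.set_content(f"{builder} failed at r{revision}.")
    return msg
```

With something along these lines, a committer who shares a blame list with
others gets no direct mail, and whatever mail is sent for one revision
lands in a single thread.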

Sorry to send a feature request without patches but I'm not familiar
with the CI infrastructure and this looks like a fairly recent
development (or is it just me?)



This isn't new. It's just how the bots have always worked.

The biggest thing would be to move the bots over to the phased builder
infrastructure pioneered by Apple (they use it internally and I believe
most of it has been upstreamed by Daniel Dunbar and David Tweed) that sets
up dependencies (e.g. testing debug info depends on the compiler passing the
basic check first) and reuse/caching of build products (e.g. use the output
of the basic checks to test the debug info, rather than rebuilding the
compiler on every builder).

This would reduce noise, improve build slave efficiency, and increase
granularity, producing smaller blame lists.
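
To make the idea concrete, here is a minimal sketch of phased execution
with dependencies and reuse of build products. It is not the Apple or
upstream implementation; the phase names, the run_phases helper and the
"compiler-binary" placeholder are all hypothetical.

```python
def run_phases(phases):
    """Run phases in order, skipping any phase whose dependencies did not pass.

    phases: ordered list of (name, deps, step). step(artifacts) returns
    (ok, artifact), where artifacts maps each passed phase name to the
    build product it produced.
    """
    artifacts = {}  # products of phases that passed, reused downstream
    results = {}    # name -> "passed" | "failed" | "skipped"
    for name, deps, step in phases:
        if any(results.get(dep) != "passed" for dep in deps):
            # A dependency failed or was skipped: stay quiet rather than
            # report a second failure caused by the same breakage.
            results[name] = "skipped"
            continue
        ok, artifact = step(artifacts)
        if ok:
            artifacts[name] = artifact
            results[name] = "passed"
        else:
            results[name] = "failed"
    return results


def basic_check(artifacts):
    # Stand-in for building the compiler and running the basic tests.
    return True, "compiler-binary"


def broken_basic_check(artifacts):
    # Stand-in for a basic check that fails.
    return False, None


def debug_info_tests(artifacts):
    # Reuses the compiler produced by the basic check instead of rebuilding it.
    compiler = artifacts["basic"]
    return compiler == "compiler-binary", None


PHASES = [
    ("basic", [], basic_check),
    ("debug-info", ["basic"], debug_info_tests),
]
```

If the basic phase fails, only that phase is reported as failed and the
debug-info phase is skipped, so one breakage produces one notification.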


Alp.

--
http://www.nuanti.com
the browser experts

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev