[LLVMdev] Build bot fatigue

Alp Toker alp at nuanti.com
Sat Dec 28 18:03:30 PST 2013


My inbox has been filled with llvm.buildmaster at lab.llvm.org build 
failure notifications lately.

The two problems appear to be:

  1) Getting notifications for breakage that was introduced by an 
unrelated commit, often in a module I don't work on. Usually the 
original committer is working on or has already landed the necessary fix.

  2) A cascade of dozens of notifications from various build servers 
that continue to flood in over the course of 24 hours after the issue 
was fixed.

These two combine to produce a low signal-to-noise ratio, and in 
practice you have to filter the notifications out, which means you no 
longer get a ping on your phone when you need it.

Presumably a full fix is a non-trivial CI engineering problem, but are 
there simple measures to get the situation back under control?

Doesn't have to be perfect as long as it reduces the dozens of mails 
every day to something more manageable. Ideas:

  1) Only send direct mail when the recipient is the single name in the 
blame list.

  2) Set an In-Reply-To header in order to thread all failure 
notifications related to a specific SVN revision. Most email clients 
will let you silence the thread once you've confirmed the issue has been 
resolved.

  3) Or even simpler, don't send failure mail from any builders outside 
the "fast" set? Otherwise the important failures blocking everyone's 
work get drowned out in the noise.
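
To make the three ideas concrete, here is a rough Python sketch of what 
such a notification policy might look like. It is not based on the actual 
buildmaster configuration, which I haven't looked at: the builder names, 
the domain, and every function name below are made up for illustration, 
and it uses only the standard library email module. How well threading 
against a synthetic root Message-ID works will depend on the mail client.

```python
from email.message import EmailMessage

# Hypothetical names for the "fast" builders; not the real builder list.
FAST_BUILDERS = {"clang-x86_64-fast", "llvm-x86_64-fast"}


def should_send_direct_mail(recipient, blame_list, builder,
                            fast_builders=FAST_BUILDERS):
    """Ideas 1 and 3: mail only a sole culprit, and only for fast builders."""
    if builder not in fast_builders:
        return False
    # The recipient must be the single name in the blame list.
    return set(blame_list) == {recipient}


def thread_id_for_revision(revision, domain="buildbot.example.org"):
    """Idea 2: one stable ID shared by every failure mail for a revision.

    The domain is a placeholder, not the real build master host.
    """
    return f"<build-failure-r{revision}@{domain}>"


def make_failure_mail(recipient, builder, revision):
    """Build a failure notification that threads by SVN revision."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"Build failure on {builder} at r{revision}"
    root = thread_id_for_revision(revision)
    # Pointing both headers at the same root lets clients group the mails.
    msg["In-Reply-To"] = root
    msg["References"] = root
    msg.set_content(f"Builder {builder} failed at r{revision}.")
    return msg
```

With something like this, a committer who shares the blame list with 
others, or whose commit only broke a slow builder, would get no direct 
mail, and whatever mail does go out for a given revision would land in a 
single thread that can be muted once the fix is in.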

Sorry to send a feature request without patches but I'm not familiar 
with the CI infrastructure and this looks like a fairly recent 
development (or is it just me?)

Alp.


-- 
http://www.nuanti.com
the browser experts



