[cfe-dev] Buildbot General Failure - Production Stop?

Renato Golin via cfe-dev cfe-dev at lists.llvm.org
Mon Sep 5 14:48:32 PDT 2016


Folks,

As Nico and Diana investigated earlier [1], there was a change in Zorg
which made buildbots update one source directory (llvm.src) but build
from another (llvm), so *all* builds ran from the same revision,
no matter the update.

Essentially, the bots were all lying when they said this or that
commit "passed", since they were still testing the same old commit.
All our bots were affected, and it seems many others were too
(Windows, PowerPC, s390, Atom, etc.).

I have worked around the problem for now by making "llvm" a symbolic
link to "llvm.src", so we build what we update, and *many* of the bots
are coming back with a myriad of failures, which are most likely from
different commits in the last 4 days. This will take a while to clean
up... for all of us.
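
For anyone wanting to apply the same workaround on their own bot, here is
a minimal sketch of the idea in Python. It is an illustration only, not
what was actually run on the bots: the helper names and the "workdir"
layout are assumptions, and only the "llvm" and "llvm.src" directory
names come from the description above.

```python
import os


def link_build_dir_to_source(workdir, source_name="llvm.src", build_name="llvm"):
    """Make workdir/build_name a symlink to workdir/source_name.

    Hypothetical helper: assumes both names live directly under workdir.
    """
    source = os.path.join(workdir, source_name)
    build = os.path.join(workdir, build_name)
    if not os.path.isdir(source):
        raise FileNotFoundError(source)
    if os.path.islink(build):
        # Replace an existing (possibly stale) link.
        os.remove(build)
    elif os.path.exists(build):
        # A real directory is in the way; move it aside rather than delete it.
        os.rename(build, build + ".stale")
    # Relative link, so the work directory can be moved as a whole.
    os.symlink(source_name, build, target_is_directory=True)
    return build


def same_tree(workdir, source_name="llvm.src", build_name="llvm"):
    """Return True if the updated and the built directories are the same tree."""
    source = os.path.realpath(os.path.join(workdir, source_name))
    build = os.path.realpath(os.path.join(workdir, build_name))
    return source == build
```

A check like same_tree() could also be run before each build, to catch
the "update one directory, build another" situation early.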

My question is: what do we do now?

The safest option would be to stop production, i.e. block commits,
until the bots are reverted and then green. In a way, with all those
bots not testing anything, whatever we commit is *not* going to be
tested at all in a large part of our infrastructure, so I don't really
think there is a point in assuming we can continue committing at
will...

I don't remember this ever happening in LLVM, that's why I'm
reluctant to propose it more strongly, but I see no better
alternative.

So, what now?

cheers,
--renato

[1] http://lists.llvm.org/pipermail/cfe-dev/2016-September/050651.html


