[llvm-dev] Flakey failure on clang-ppc64le-linux-multistage

Thu Sep 3 05:02:53 PDT 2020

Sure.
I didn't use lit or ninja. I simply copied the script produced by lit
(/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/tools/clang/test/Driver/Output/target-override.c.script)
into a temporary directory (along with a deep copy of the build directory).
I modified the paths in the script to point to the temporary directory.
Then I ran the script in a loop.
To run many copies in parallel, I wrote a wrapper script that invokes
that one:
target-override.c.script $LINENO &
target-override.c.script $LINENO &
target-override.c.script $LINENO &
...
wait
I then ran that wrapper in a loop as well, for thousands of iterations.
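
For anyone who wants to try something similar, here is a rough sketch of
that kind of harness. It is not the exact wrapper I used: the function
name stress_run and its arguments are made up for illustration, it
passes a job index rather than $LINENO, and it assumes the copied script
can be run with sh. It waits on each background PID separately, because
a bare `wait` returns 0 no matter how the jobs exited and would hide a
failing run.

```shell
# Hypothetical stress harness (illustrative names, not the original wrapper).
# Usage: stress_run <script> <parallel_jobs> <iterations>
# Prints the total number of runs that exited with a non-zero status.
# Written for POSIX sh, so the variables below are not function-local.
stress_run() {
    script=$1
    jobs=$2
    iterations=$3
    failures=0
    i=0
    while [ "$i" -lt "$iterations" ]; do
        pids=""
        j=0
        while [ "$j" -lt "$jobs" ]; do
            j=$((j + 1))
            # Start one copy in the background; the job index is its only argument.
            sh "$script" "$j" >/dev/null 2>&1 &
            pids="$pids $!"
        done
        # Collect each exit status individually so no failure is lost.
        for pid in $pids; do
            wait "$pid" || failures=$((failures + 1))
        done
        i=$((i + 1))
    done
    echo "$failures"
}
```

Something like `stress_run ./target-override.c.script 1000 1000` would
then start 1000 copies at a time, 1000 times over, and print how many
of them failed.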

On Wed, Sep 2, 2020 at 3:51 PM David Blaikie <dblaikie at gmail.com> wrote:

> Thanks for looking into it!
>
> Could you describe your test process in more detail? Were you running lit
> from your script? Running the build system (ninja?)?
>
> On Wed, Sep 2, 2020 at 10:47 AM Nemanja Ivanovic <nemanja.i.ibm at gmail.com>
> wrote:
>
>> Well, I am at my wit's end. I have copied over the script and directories
>> for this test case and run it a few million times. First I was running one
>> at a time, then I switched to kicking off 1000 at a time. All the while,
>> the bots continued to run on the same machine. The script never failed even
>> once. I am not sure if this has something to do with Python as part of
>> llvm-lit or what is going on.
>> I am thinking that the best course of action for us is to mark this test
>> case UNSUPPORTED for PPC.
>>
>> On Wed, Sep 2, 2020 at 12:41 PM Nemanja Ivanovic via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Interesting, thanks for bringing this to our attention. I just took a
>>> quick look through the last 100 builds and this test has failed 13 times.
>>> This is certainly something we need to look at. We will investigate and see
>>> if we can make any sense of this.
>>>
>>> Nemanja Ivanovic
>>> LLVM PPC Backend Development
>>> IBM Toronto Lab
>>> Email: nemanjai at ca.ibm.com
>>> Phone: 905-413-3388
>>>
>>>
>>>
>>> ----- Original message -----
>>> From: David Blaikie <dblaikie at gmail.com>
>>> To: llvm-dev <llvm-dev at lists.llvm.org>, Nico Weber <thakis at chromium.org>,
>>> Serge Pavlov <sepavloff at gmail.com>, powerllvm at ca.ibm.com
>>> Cc:
>>> Subject: [EXTERNAL] Flakey failure on clang-ppc64le-linux-multistage
>>> Date: Tue, Sep 1, 2020 6:10 PM
>>>
>>> Seems there were a couple of correlated failures that appear to be
>>> flakes on this buildbot recently:
>>>
>>> green:
>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974
>>> red:
>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975
>>> (target-override.c during stage 1, seems to be missing the
>>> directory/symlink it just created)
>>> red:
>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13976
>>> (same test failure as the last, but during stage 2, not stage 1)
>>> green:
>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13977
>>>
>>> Including Nico & Pavlov as the people who wrote/edited the test, but I'm
>>> guessing this is something interesting going on on the buildbot itself?
>>>
>>> powerllvm at ca.ibm.com, whoever you are on the end of that mailing list -
>>> could you take a look at this? Possibly manually running that test in a
>>> loop a bunch of times to see if it fails sometimes & try to help us
>>> understand why?
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>