[llvm-dev] Flakey failure on clang-ppc64le-linux-multistage

Nico Weber via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 3 15:23:19 PDT 2020


I think that was maybe the discussion on https://reviews.llvm.org/D78245

On Thu, Sep 3, 2020 at 6:22 PM Robinson, Paul <paul.robinson at sony.com>
wrote:

> I have a vague memory that libcxx wanted it for something, and claimed it
> would be hard to work around not having it.
>
> Anyone else remember that?  I can’t dredge up the details, sorry…
>
> In any event, a separate properly-titled thread on llvm-dev would be the
> right way to decide this.
>
> --paulr
>
>
>
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Nico Weber via llvm-dev
> Sent: Thursday, September 3, 2020 4:16 PM
> To: David Blaikie <dblaikie at gmail.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; LLVM on Power <powerllvm at ca.ibm.com>; Nemanja Ivanovic <nemanjai at ca.ibm.com>
> Subject: Re: [llvm-dev] Flakey failure on clang-ppc64le-linux-multistage
>
>
>
> https://llvm.org/docs/CommandGuide/lit.html
> already lists %T as "parent directory of %t (not unique, deprecated, do not
> use)". See also https://reviews.llvm.org/D35396
>
>
>
> On Thu, Sep 3, 2020 at 3:37 PM David Blaikie <dblaikie at gmail.com> wrote:
>
> Yeah, I think I'd be up for considering deprecation of %T due to the risk
> of race conditions/conflicts between tests. %t gives a unique name you can
> do whatever you want with: if you only need one file, use %t as a file; if
> you need a directory full of files, mkdir %t and use that; etc.
>
> But it will depend a bit on what the uses of %T look like; maybe there are
> some good uses of it that we won't think of until we see them.
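To illustrate the difference between the two substitutions, here is a
minimal Python sketch, not lit's actual code. It assumes lit's usual layout,
where temporaries live in an Output directory next to the test (consistent
with the Output/target-override.c.script path quoted later in this thread).
The helper name temp_substitutions, the second test name other-test.c, and
the directory name scratch are hypothetical.

```python
import os


def temp_substitutions(exec_path):
    """Return (%t, %T) for a test path, modeled on lit's layout (assumption)."""
    exec_dir, exec_base = os.path.split(exec_path)
    tmp_dir = os.path.join(exec_dir, "Output")              # %T: shared parent
    tmp_base = os.path.join(tmp_dir, exec_base + ".tmp")    # %t: per-test name
    return tmp_base, tmp_dir


# Two tests in the same directory; the second name is made up for illustration.
a_t, a_T = temp_substitutions(os.path.join("test", "Driver", "target-override.c"))
b_t, b_T = temp_substitutions(os.path.join("test", "Driver", "other-test.c"))

shared_parent = a_T == b_T   # both tests see the same %T
unique_names = a_t != b_t    # each test gets its own %t

print("same %T:", shared_parent)
print("distinct %t:", unique_names)
```

Both tests resolve %T/scratch to the same path, so if each one removes and
recreates it, whichever runs second can delete what the first just created.
The %t paths never coincide, so mkdir %t is safe under parallel runs.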
>
>
>
> On Thu, Sep 3, 2020 at 12:33 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
> Should be fixed by https://reviews.llvm.org/D87103
>
> Shall we consider deprecating (emitting a warning) or removing %T from
> lit? lldb, lld/COFF and clang-tools-extra are the three major users of
> %T. There are a few other uses of %T elsewhere, but not many. We will
> also investigate whether other projects using lit are using %T.
>
> On Thu, Sep 3, 2020 at 11:25 AM David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Oh yeah, good catch! Thanks!
> >
> > On Thu, Sep 3, 2020 at 11:13 AM Fāng-ruì Sòng <maskray at google.com> wrote:
> >>
> >> This is likely due to a race condition (%T is a shared parent
> >> directory). I'll put up a patch to fix it.
> >>
> >> On Thu, Sep 3, 2020 at 10:00 AM David Blaikie via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >> >
> >> > Is the machine running any jobs in parallel? Would it be worth trying
> >> > running lit in the loop, rather than the script? (perhaps lit's doing
> >> > something interesting) or maybe the full test run from ninja, but I
> >> > appreciate that that is expensive.
> >> >
> >> > Are there other PPC bots? Any idea if they are experiencing this
> >> > failure?
> >> >
> >> > There are also other tests that do similar mkdir/symlink things, I
> >> > think - yet they are not failing? Maybe they do it in some slightly
> >> > different manner?
> >> >
> >> > On Thu, Sep 3, 2020 at 5:03 AM Nemanja Ivanovic <nemanja.i.ibm at gmail.com> wrote:
> >> >>
> >> >> Sure.
> >> >> I didn't use lit or ninja. I simply copied the script produced by lit
> >> >> (/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/tools/clang/test/Driver/Output/target-override.c.script)
> >> >> into a temporary directory (along with a deep copy of the build
> >> >> directory). I modified the paths in the script to point to the
> >> >> temporary directory.
> >> >> Then I ran the script in a loop.
> >> >> For running a bunch in parallel, I just produced a wrapper script to
> >> >> invoke that one:
> >> >> target-override.c.script $LINENO &
> >> >> target-override.c.script $LINENO &
> >> >> target-override.c.script $LINENO &
> >> >> ...
> >> >> wait
> >> >> And ran that in a loop. For thousands of iterations...
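For readers who want to try a similar stress loop, here is a rough Python
equivalent of the wrapper described above. It is a sketch, not the script
that was actually used: the function name stress and its jobs/rounds
parameters are hypothetical, and the commands in the usage lines are
stand-ins for the real test script.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def stress(cmd, jobs=8, rounds=5):
    """Run `cmd` `jobs` at a time for `rounds` rounds; return the failure count."""

    def run_once(_):
        # Discard output; only the exit status matters here.
        return subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        ).returncode

    failures = 0
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for _ in range(rounds):
            # pool.map returns only after the whole batch finishes,
            # much like `wait` after a group of backgrounded jobs.
            failures += sum(rc != 0 for rc in pool.map(run_once, range(jobs)))
    return failures


# Stand-in commands: one that always succeeds, one that always fails.
ok_failures = stress([sys.executable, "-c", "pass"], jobs=4, rounds=2)
bad_failures = stress([sys.executable, "-c", "raise SystemExit(1)"], jobs=4, rounds=2)
print(ok_failures, bad_failures)
```

Note that a loop like this only exercises copies of one test's script. It
will not reproduce a conflict with a different test that writes into the
same shared directory.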
> >> >>
> >> >> On Wed, Sep 2, 2020 at 3:51 PM David Blaikie <dblaikie at gmail.com> wrote:
> >> >>>
> >> >>> Thanks for looking into it!
> >> >>>
> >> >>> Could you describe your test process in more detail? Were you
> >> >>> running lit from your script? Running the build system (ninja?)?
> >> >>>
> >> >>> On Wed, Sep 2, 2020 at 10:47 AM Nemanja Ivanovic <nemanja.i.ibm at gmail.com> wrote:
> >> >>>>
> >> >>>> Well, I am at my wit's end. I have copied over the script and
> >> >>>> directories for this test case and run it a few million times.
> >> >>>> First I was running one at a time, then I switched to kicking off
> >> >>>> 1000 at a time. All the while, the bots continued to run on the
> >> >>>> same machine. The script never failed even once. I am not sure if
> >> >>>> this has something to do with Python as part of llvm-lit or what is
> >> >>>> going on.
> >> >>>> I am thinking that the best course of action for us is to mark
> >> >>>> this test case UNSUPPORTED for PPC.
> >> >>>>
> >> >>>> On Wed, Sep 2, 2020 at 12:41 PM Nemanja Ivanovic via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >> >>>>>
> >> >>>>> Interesting, thanks for bringing this to our attention. I just
> >> >>>>> took a quick look through the last 100 builds and this test has
> >> >>>>> failed 13 times. This is certainly something we need to look at.
> >> >>>>> We will investigate and see if we can make any sense of this.
> >> >>>>>
> >> >>>>> Nemanja Ivanovic
> >> >>>>> LLVM PPC Backend Development
> >> >>>>> IBM Toronto Lab
> >> >>>>> Email: nemanjai at ca.ibm.com
> >> >>>>> Phone: 905-413-3388
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> ----- Original message -----
> >> >>>>> From: David Blaikie <dblaikie at gmail.com>
> >> >>>>> To: llvm-dev <llvm-dev at lists.llvm.org>, Nico Weber <thakis at chromium.org>, Serge Pavlov <sepavloff at gmail.com>, powerllvm at ca.ibm.com
> >> >>>>> Cc:
> >> >>>>> Subject: [EXTERNAL] Flakey failure on clang-ppc64le-linux-multistage
> >> >>>>> Date: Tue, Sep 1, 2020 6:10 PM
> >> >>>>>
> >> >>>>> Seems there were a couple of correlated failures that appear to
> >> >>>>> be flakes on this buildbot recently:
> >> >>>>>
> >> >>>>> green:
> >> >>>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974
> >> >>>>> red:
> >> >>>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975
> >> >>>>> (target-override.c during stage 1, seems to be missing the
> >> >>>>> directory/symlink it just created)
> >> >>>>> red:
> >> >>>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13976
> >> >>>>> (same test failure as the last, but during stage 2, not stage 1)
> >> >>>>> green:
> >> >>>>> http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13977
> >> >>>>>
> >> >>>>> Including Nico & Pavlov as the people who wrote/edited the test,
> >> >>>>> but I'm guessing this is something interesting going on on the
> >> >>>>> buildbot itself?
> >> >>>>>
> >> >>>>> powerllvm at ca.ibm.com, whoever you are on the end of that mailing
> >> >>>>> list - could you take a look at this? Possibly manually running
> >> >>>>> that test in a loop a bunch of times to see if it fails sometimes
> >> >>>>> & try to help us understand why?
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> _______________________________________________
> >> >>>>> LLVM Developers mailing list
> >> >>>>> llvm-dev at lists.llvm.org
> >> >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >
> >>
> >>
> >>
> >> --
> >> 宋方睿
>
>
>
> --
> 宋方睿
>
>