[PATCH] D22305: [BasicAA] Strip phi nodes, when all incoming values are the same.

Fri Jul 15 11:27:53 PDT 2016

 Or do we need (2) because in general this kind of phi node is seen in many
programs?

I think I should rephrase this as: Or are we aware of other cases that we
miss optimizations because basicaa cannot see through phis.

From:	Ehsan A Amiri/Toronto/IBM
To:	Daniel Berlin <dberlin at dberlin.org>
Cc:	Hal Finkel <hfinkel at anl.gov>, llvm-commits
            <llvm-commits at lists.llvm.org>, reviews+D22305+public
            +92ca108e50bc4651 at reviews.llvm.org
Date:	07/15/2016 02:18 PM
Subject:	Re: [PATCH] D22305: [BasicAA] Strip phi nodes, when all
            incoming values are the same.

Another question for me: With the fix (3) are we going to still have cases
that GVN misses for which we need (2) ? Or do we need (2) because in
general this kind of phi node is seen in many programs? Or something else
that I do not see....

Thanks
Ehsan

From:	Daniel Berlin <dberlin at dberlin.org>
To:	Ehsan A Amiri/Toronto/IBM at IBMCA
Cc:	Hal Finkel <hfinkel at anl.gov>, llvm-commits
            <llvm-commits at lists.llvm.org>, reviews+D22305+public
            +92ca108e50bc4651 at reviews.llvm.org
Date:	07/14/2016 11:08 PM
Subject:	Re: [PATCH] D22305: [BasicAA] Strip phi nodes, when all
            incoming values are the same.

On Thu, Jul 14, 2016 at 7:49 PM, Ehsan A Amiri <amehsan at ca.ibm.com> wrote:
  I can measure compile time impact of (2) and (3) on power. A couple of
  questions/remarks.

  1- As for (2) I don't think there will be a significant compile time
  cost.

Compile time does not get slower 5% at a time, it gets slower 0.1% or 0.05%
at a time.

It's not "someone adds thing that is super slow", it is "people add lots of
little things to lots of functions called lots of times".

As Hal points out, we never cache that we simplified or didn't, or checked
the phi or didn't, or *anything*, because BasicAA is stateless.

Because BasicAA is stateless, we will do this again and again and again
(both this phi checking and simplifyinstruction and ....)

This adds up.

  If we do not reduce the phi to its unique incoming value, we will end up
  in aliasPHI and we do this check there. It is just too late for some
  cases like the problem here, and we get conservative result. Do I miss
  something here?

The point is that we are going to try to simplify this, and every phi, a
lot of times, both in AA, and elsewhere.

  2- I will use Daniel's patch for (3). Depending on how expensive it is,
  we can look at finer grain cache clean ups, as Hal suggests.

  3- I will modify my patch for (2) and modify stripPointerCasts to include
  "seeing through phi". I wanted to separate that move as a next step, but
  Hal disagreed on the code review.

  If you disagree with this way of doing the experiments, please let me
  know.

  Thanks
  Ehsan

  Inactive hide details for Hal Finkel ---07/14/2016 09:35:07 PM--------
  Original Message ----- > From: "Daniel Berlin" <dberlin at Hal Finkel
  ---07/14/2016 09:35:07 PM-------- Original Message ----- > From: "Daniel
  Berlin" <dberlin at dberlin.org>

  From: Hal Finkel <hfinkel at anl.gov>
  To: Daniel Berlin <dberlin at dberlin.org>
  Cc: llvm-commits <llvm-commits at lists.llvm.org>, <reviews+D22305+public
  +92ca108e50bc4651 at reviews.llvm.org>, Ehsan A Amiri/Toronto/IBM at IBMCA
  Date: 07/14/2016 09:35 PM

  Subject: Re: [PATCH] D22305: [BasicAA] Strip phi nodes, when all incoming
  values are the same.

  From: "Daniel Berlin" <dberlin at dberlin.org>
  To: "Hal Finkel" <hfinkel at anl.gov>
  Cc: "llvm-commits" <llvm-commits at lists.llvm.org>, reviews+D22305+public
  +92ca108e50bc4651 at reviews.llvm.org, "Ehsan A Amiri" <amehsan at ca.ibm.com>
  Sent: Thursday, July 14, 2016 8:30:09 PM
  Subject: Re: [PATCH] D22305: [BasicAA] Strip phi nodes, when all incoming
  values are the same.

        (2) is, in theory, the right thing to do. Even if we were to
        consider uniform PHIs to be anti-canonical, and thus something
        which should be simplified, we can't simplify often enough to
        prevent these from blocking useful analysis work.

  FWIW, i'm fine with this approach if our approach is going to be as you
  say - that we will not simplify often enough.

  Right now, as i said, we simplify *everywhere*, and every one of those
  calls will eliminate this phi node.

  So it's only *this particular path* that misses all those calls.

  For example, if the alias check had gone through a gep of a phi anywhere,
  it would simplify the phi as part of getunderlyingobject, etc.
        Arbitrary uses of RAUW can create these PHIs, and we can't (and
        probably shouldn't) run InstCombine in between every other pass.
        This is a local pattern that stripPointerCasts, and similar
        functions, can look through.

  Fine with this as long as we maybe stop trying to simplify instructions 8
  or 9 times, and instead do it once (max) per instructions, and make this
  part of it.

  (IE This would mean we would have SimplifyAndGetUnderlyingObject and
  GetUnderlyingObject, and we simplify once and call the latter or
  something, or whatever. Not suggesting we decide this second, just
  suggesting that we agree if this is going to be our general approach).
  This is hard because we don't cache the simplifications in any way. It is
  not like we're updating the IR when we discover some simplification;
  we're only using the simplified version in place. I'm not sure how to fix
  this. Maybe we should run InstSimplify a lot more often. It is not as
  expensive was InstCombine by a large margin (IIRC).

        (3) is also, in theory, the right thing to do. The memdep cache, by
        necessity, caches negative results. Each GVN iteration, however,
        performs "information revealing" operations which can make the
        cached results more conservative than a new query's results.

  Yes.
  It's theoretically possible to make it less expensive than blowing away
  the whole cache, but so far, experience has told me that fully
  maintaining the cache becomes slower than redoing the queries :P
  I'm not sure how much this helps, but we could/should only clear out the
  MayAlias results.

  -Hal

        Now we actually need to measure the costs.

        -Hal

        (Note: i've attached a patch for 3 in case anyone wants to see the
        compile time cost)

        On Thu, Jul 14, 2016 at 2:12 PM, Daniel Berlin <dberlin at dberlin.org
        > wrote:
        Note: This already had GVN run once on it, do you have the one
        before that?

        On Thu, Jul 14, 2016 at 1:53 PM, Ehsan A Amiri <amehsan at ca.ibm.com>
        wrote:
              I can see the problem with the following command line and
              attached file

              opt -gvn t.ll

              (See attached file: t.ll)

              My clang is almost a week old.

              Inactive hide details for Daniel Berlin ---07/14/2016
              04:27:43 PM---Actually, can you please attach a .ll file and
              an opt commaDaniel Berlin ---07/14/2016 04:27:43
              PM---Actually, can you please attach a .ll file and an opt
              command line that reproduces the problem?

              From: Daniel Berlin <dberlin at dberlin.org>
              To: Ehsan A Amiri/Toronto/IBM at IBMCA
              Cc: Hal Finkel <hfinkel at anl.gov>, llvm-commits <
              llvm-commits at lists.llvm.org>, reviews+D22305+public
              +92ca108e50bc4651 at reviews.llvm.org
              Date: 07/14/2016 04:27 PM
              Subject: Re: [PATCH] D22305: [BasicAA] Strip phi nodes, when
              all incoming values are the same.

              Actually, can you please attach a .ll file and an opt command
              line that reproduces the problem?
              The clang command line you have is very sensitive to
              versions, etc.

              I cannot get your issue to reproduce with the clang i have
              installed that can target powerpc-linux (and the issue does
              not reproduce with your testcase on x86) :)

              While debugging a bit, note that there is at least one
              obvious bug in GVN that may affect this, by inspection:

              When GVN splits a critical edge, it never adds the new block
              to the iteration order (at all), even though it inserts into
              it.
              So they will not get processed until the next iteration of
              GVN on the function, even though they have code in them.
              While this is okay from a correctness standpoint, it may
              block optimization of certain things (including the cases
              you've discovered). In practice, there is no way to perfectly
              solve that without pre-splitting all critical edges, but you
              should get the same effect if we throw the critical edge
              block and then it's successors (including the current blcok)
              into bbvect after the current block again.

              It is also missing a real phi simplification.
              While simplifyinstruction will check if all arguments are
              trivially the same, that is not the real test that should be
              performed.

              It should be doing VN.lookup on each argument and seeing if
              they come up with the the same value number.

              Once you attach the .ll file, i'll fix these and see if it
              fixes your testcase, and if not, debug further.

              On Thu, Jul 14, 2016 at 10:11 AM, Daniel Berlin <
              dberlin at dberlin.org> wrote:

                          On Thu, Jul 14, 2016 at 7:09 AM, Ehsan A Amiri <
                          amehsan at ca.ibm.com> wrote:
                          This is order of events:
                          This order cannot be correct if the solution I
                          gave (or adding simplification to *some place in
                          GVN*) does not work. Or GVN is broken in other
                          ways.

                          1) GVN starts looking at the function. At this
                          point the phi node has two different incoming
                          values.
                          2) GVN performs an RAUW. The phi is converted to
                          the one that has two identincal incoming values.

                          At this point, it should now process the phi
                          instruction again before it processes the load,
                          because it is doing a reverse postorder
                          traversal.
                          When it did that, the phi should have been
                          simplified
                          So why did that not happen?

                          Given the complexity of fixing the real problem,

                          Look, i understand why you want to just fix this
                          in AA and be done with it.
                          Really, I do.

                          I understand you have spent a lot of time on this
                          bug, and I greatly appreciate that.
                          But I really want to understand what is going on
                          before we try to actually fix it.
                          I have a good understanding of what happens once
                          the bad answer gets into memdep (and thank you
                          for that!), but i still have trouble seeing why
                          it lived long enough to get there.

                          To that end, so you don't have to spend more time
                          running around for me, i'll take over this bug,
                          and either figure out why GVN lets this PHI live
                          to the point it gets an AA query about it (and
                          fix it/decide it can't be fixed), or commit the
                          AA patch for you if we decide it can't be fixed.

                          My ETA is by friday.

                          I assume the testcase in the bug is the one we
                          are still using, right?
                          (If not, if you can attach it, that would be
                          helpful)

        --
        Hal Finkel
        Assistant Computational Scientist
        Leadership Computing Facility
        Argonne National Laboratory

  --
  Hal Finkel
  Assistant Computational Scientist
  Leadership Computing Facility
  Argonne National Laboratory
  [attachment "graycol.gif" deleted by Ehsan A Amiri/Toronto/IBM]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160715/ba04b053/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160715/ba04b053/attachment.gif>