[PATCH] D55851: Implement basic loop fusion pass

Wed Feb 13 09:41:22 PST 2019

dmgreen added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:858
+
+      // TODO: Can we actually use the dependence info analysis here?
+      return false;
----------------
kbarton wrote:
> dmgreen wrote:
> > kbarton wrote:
> > > dmgreen wrote:
> > > > I would imagine, although I'm not sure, that there would at least be a lot of bugs here. We are dealing with different loops, but we can say that they are very similar.
> > > > 
> > > > What does it currently give? Anything useful? The SCEV checks you are doing above is obviously quite similar to what DA would be trying to do, but with the added loop rewrite. It would be a shame to duplicate all the effort but may, I guess, be necessary if DA doesn't do well enough.
> > > At this point DA doesn't give anything useful - at least not for the test cases that I have tried. I have not had a chance to investigate why, or if there is a better/different way to do things where it can be useful (which is why I marked the TODO here). 
> > > 
> > > The one thing that DA is able to do, that SCEV currently cannot, is understand the restrict keyword and accurately identify no dependencies between the two loops in this case. Perhaps it would be better to try and teach SCEV about restrict, and then only rely on SCEV in the long run?
> > > 
> > I remember DA having problems on 64bit systems due to all the sexts that kept happening. Delinearising constant's too, maybe? There are certainly problems in it that would be worth trying to fix (or find a replacement to do it better).
> > 
> > The delinearisation that DA will attempt may also be helpful. It is nowadays all built on SCEV's too, so should in theory (baring the facts that we know here about dealing with different but similar loops) be a more powerful version of the SCEV code here. That power might be confusing it at the moment though.
> > 
> > I would expect that better DA is something llvm will need sooner or later as more loop optimisations are implemented. Perhaps this is something that we can improve in the future, relying on the SCEV code at the moment, but getting more help from DA as it is improved. The no-aliasing checks it can do are useful at least, it would seem.
> I completely agree about improving DA and the need for it with other loop optimizations. I still haven't had a chance to look at it in detail, but would like to start looking at it once this patch is finalized and lands.
> 
> For now are you OK with the current usage of DA here? I'm hoping that as we extend/improve it, we can modify this code to take advantage of it.
Sounds good to me. The !DepResult check can help in some cases at least, and the rest we can add as DA is improved.

================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:917
+          return false;
+        }
+
----------------
kbarton wrote:
> dmgreen wrote:
> > I'm not sure if this is quite strong enough. Consider something like this, where the sum would be used in the second block, but not as phi:
> >   for(i = 0; i < n; i++)
> >     sum += a[i];
> >   for(i = 0; i < n; i++)
> >     b[i] = a[i]/sum;
> > These can be fused, but use the wrong "sum" on each iteration of the loop. Unroll and jam side-steps all this by requiring lcssa nodes (and, you know, not requiring generalised loop fusion :) )
> Yes, you're right. If I generate the ll for this example and massage it to make it eligible for fusion, we will fuse these.
> However, I'm confused by your statement: 
> 
> > These can be fused, but use the wrong "sum" on each iteration of the loop. 
> 
> I don't see how fusion is legal here. Are you saying that the current check will (incorrectly) still allow fusion? Or there is a way to fuse these loops?
> 
> For this example, lcssa will create a non-empty preheader for the second loop and the earlier checks preventing fusion will bail before we get to this test. If I manually sink the PHI from lcssa into the header of the second loop, I can create an empty preheader for the second loop to allow fusion to continue. 
> 
> At any rate, I think this check needs to be strengthened to prevent fusion from happening in this case. Do you agree?
Yeah, sorry. I meant _will_ fuse, but shouldn't. "Can" was a bit misleading.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55851/new/

https://reviews.llvm.org/D55851