[PATCH] D55851: Implement basic loop fusion pass

Kit Barton via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 17 13:58:31 PST 2019


kbarton marked 15 inline comments as done.
kbarton added a comment.

I've addressed most of these comments (except the ones I have some questions about). I will be refreshing the patch momentarily.



================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:719
+
+          LLVM_DEBUG(dbgs()
+                     << "Candidate Set (after inserting TargetCandidate):\n"
----------------
dmgreen wrote:
> This debug message is (essentially) repeated?
Good catch. Fixed. 


================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:858
+
+      // TODO: Can we actually use the dependence info analysis here?
+      return false;
----------------
dmgreen wrote:
> I would imagine, although I'm not sure, that there would at least be a lot of bugs here. We are dealing with different loops, but we can say that they are very similar.
> 
> What does it currently give? Anything useful? The SCEV checks you are doing above is obviously quite similar to what DA would be trying to do, but with the added loop rewrite. It would be a shame to duplicate all the effort but may, I guess, be necessary if DA doesn't do well enough.
At this point DA doesn't give anything useful - at least not for the test cases that I have tried. I have not had a chance to investigate why, or if there is a better/different way to do things where it can be useful (which is why I marked the TODO here). 

The one thing that DA is able to do, that SCEV currently cannot, is understand the restrict keyword and accurately identify no dependencies between the two loops in this case. Perhaps it would be better to try and teach SCEV about restrict, and then only rely on SCEV in the long run?



================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:1003
+                                                          FC1.Header);
+    TreeUpdates.push_back(
+        {DominatorTree::Delete, FC0.ExitingBlocks, FC1.Preheader});
----------------
dmgreen wrote:
> It can sometimes be better to insert edges into the DT before deleting old ones. It keep the tree more intact (especially pdts), with less subtrees becoming unreachable and less recalculations needed. It means it can be simpler and quicker for the updates.
OK, that's good to know, thanks.

Is this specifically for inserting/deleting edges between similar blocks, or is this for all inserts/deletes in the entire tree? In other words, should I rearrange this code, and the code below (lines 1048-1051) that updates the latch blocks to do all the insertions before any deletions?


================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:1015
+    // Moves the phi nodes from the second to the first loops header block.
+    while (PHINode *PHI = dyn_cast<PHINode>(&FC1.Header->front())) {
+      if (SE.isSCEVable(PHI->getType()))
----------------
dmgreen wrote:
> Is there anywhere that we check that these won't have phi operands from inside the first loop? i.e. that the phis can be moved into the first loop / before all of it's instructions.
This is a really good catch!] We completely missed this.
I've added a check in dependencesAllowFusion that walks through the PHIs in the header of FC1 and check if the incoming block is in FC0 then it is the header block. If it is not, then we do not fuse. 


================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:1087
+    assert(!verifyFunction(*FC0.Header->getParent(), &errs()));
+    assert(DT.verify());
+    assert(PDT.verify());
----------------
dmgreen wrote:
> You may want to add DominatorTree::VerificationLevel::Fast otherwise this might be quite slow.
Thanks for pointing this out. 
I've changed to fast.
Most (probably all) of the problems this has found has been done by comparing to a newly constructed tree, so I think Fast should be sufficient. 


================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:1118
+
+    AU.addPreserved<ScalarEvolutionWrapperPass>();
+    AU.addPreserved<DominatorTreeWrapperPass>();
----------------
dmgreen wrote:
> Looks like the code is preserving LI too (I'm not sure if you can/should preserve SE without preserving LI.)
Yup, good call. I've added LoopInfoWrapperPass to the preserved list. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55851/new/

https://reviews.llvm.org/D55851





More information about the llvm-commits mailing list