[PATCH] D13363: [DeadStoreElimination] Add support for non-local DSE

Ivan Baev via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 28 12:17:27 PDT 2015


ivanbaev added a comment.

Hi Mitch, thanks for the thoughtful comments; we are making a good progress.
I will make the two changes: using successor iterator and adding  the comment, in the next revision.

Regarding Bruno's request for LNT testing, I'm currently running LNT tests with ARM/AArch64 compiler without and with this patch, and will report results when finished. Could someone run this patch with an x86-based compiler if there are concerns about compile-time.

Regarding David's comments, I believe all have been addressed, except caching DataLayout in the constructor.
There are many places in DSE where DataLayout is used or passed as argument. Since most transformations in Transform/*, e.g. Vectorize/LoopVectorize.cpp, Scalar/SROA.cpp, Scalar/GVN.cpp, favor accessing DataLayout this way, I suggest this change to be done in a future patch.

Mitch, Erik, David, Bruno,
Are you planning to attend the LLVM developers meeting on Thu/Fri? That would be a good opportunity to discuss this patch in person if you would like to.

Ivan


================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:653
@@ +652,3 @@
+      }
+    } else if (InstDep.isNonLocal()) { // DSE across BB
+      if (++NumNonLocalAttempts < MaxNonLocalAttempts)
----------------
mbodart wrote:
> The self read issue has not been addressed.
It looks like that some comments are not aligned to the corresponding lines of code.
The self read issue has been addressed in line 888 by adding there a call to: 
     isPossibleSelfRead(Inst, Loc, Dependency, *TLI, *AA)
It is similar to the local case.

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:654
@@ +653,3 @@
+    } else if (InstDep.isNonLocal()) { // DSE across BB
+      if (++NumNonLocalAttempts < MaxNonLocalAttempts)
+        MadeChange |= handleNonLocalDependency(Inst);
----------------
mbodart wrote:
> ivanbaev wrote:
> > MemoryDependenceAnalysis has BlockScanLimit = 100 and NumResultsLimit = 100. With these and the new MaxNonLocalAttempts = 100 we have enough handles to control compile-time. 
> I see.  Thanks for pointing that out.
> Those seem reasonable as a starting point.
> But as Bruno suggested, some actual compile time measurments are probably in order, just to confirm.
I'm running LNT tests with ARM/AArch64 compiler without and with this patch, and will report results when finished.

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:808
@@ +807,3 @@
+      continue;
+    if (!DT->isReachableFromEntry(Pred) || DT->dominates(BB, Pred))
+      continue;
----------------
mbodart wrote:
> I still don't understand why we need to check DT->dominates(BB, Pred) here.
> I'm guessing you're trying to guard against an infinite loop, which of course
> we need to do.  But I think that would be automatically handled by instead checking
> for !SafeBlocks.count(Pred).  In addition to allowing the search to proceed further, this
> would eliminate the need for the special check of Pred == BB, since BB must be in SafeBlocks.
We need  DT->dominates(BB, Pred) check to avoid situations similar to the one in cycle.ll test below. This test is a simplified version of a larger function found in a SPEC benchmark.

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:814
@@ +813,3 @@
+      BasicBlock *Succ = *II;
+      if (Succ == BB || Succ == Pred)
+        continue;
----------------
mbodart wrote:
> The Succ == BB check is correct, but redundant as SafeBlocks.count(BB) must be non-zero. 
> I think the code would read better if this and the next check were combined into one:
> 
> if (Succ != Pred && !SafeBlocks.count(Succ))
> 
> but would be OK with your current structure.
> Either way, however, it would be useful to add a clarifying source comment to the effect:
> 
> // Pred is only known to be optimizeable if all of its successors are either safe or are Pred itself.
Agree that Succ == BB check is redundant, but it is a shortcut compared to checking a membership in a set.
That will save some compile time.
Sure, I will add the comment you suggested. 

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:892
@@ +891,3 @@
+      // Don't remove a store within single-block loops;
+      // we need more analysis: e.g. looking for an interferring load
+      // above the store within the loop, etc.
----------------
mbodart wrote:
> ivanbaev wrote:
> > The idea is to allow the search continue through simple loops, but not to remove stores within loops. 
> Sorry, I'm not sure which comment you were responding to.
> Phabricator for some reason seems to be placing some of my comments
> above the lines of interest, and others below.
> 
> Regarding my "loop invariant" comment, I'm saying that the test for whether the current block
> is a single-block loop (getNumSuccessors() == 2, etc) should probably be done outside of this loop.  I'm also not sure if checking explicitly for two successors is sufficient.  A program could conceivably have a switch statement with one case having a goto back to a label just prior to the switch.  I'm not familiar enough with LLVM optimizations to know whether that can lead to an indirect branch successor which is the block containing the indirect branch itself.  But it's theoretically possible.  My recommendation here is to use a successor iterator, to match the behavior present in FindSafePreds.
I will use a successor iterator here, thanks.


http://reviews.llvm.org/D13363





More information about the llvm-commits mailing list