[PATCH] D13363: [DeadStoreElimination] Add support for DSE across blocks

Mitch Bodart via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 13 15:59:42 PDT 2015


mbodart added a comment.

In general I think your approach of using a reverse CFG search for this optimization is appropriate.  But as Erik points out it doesn't handle if-then-else.
That can be addressed relatively easily:

Let StartBB be the block with the original StoreInst.
Let SafeBlocks be the set of blocks for which we know all paths emanating from them will lead to StoreInst
with no intervening reads of the memory location of interest.  Initially, add StartBB to SafeBlocks.

The criteria for adding a new predecessor P to your WorkList are that P is reachable, P is not in SafeBlocks (not visited),
and all of P's successors are P itself (the single-block loop case) or are in SafeBlocks.

Seed your WorkList with the predecessors of StartBB that meet this criteria.
After processing a block B in HandleNonLocalDependency, when DependencyFound is false,
add B to SafeBlocks, and augment the WorkList with B's predecessors that meet the criteria.

The "P is not in SafeBlocks" acts like a "! Visited" check, except that terminal (unsafe) blocks may be
visited multiple times.  To avoid that redundancy we'd need another set (UnsafeBlocks, or perhaps just VisitedBlocks).

As to the compile time threshold, I think it needs to be applied to the number of dependences and blocks seen
in HandleNonLocalDependency.  That way we'll put a cap on the cost of optimizing an individual store,
rather than limiting the number of store instructions that may get optimized.  I don't know what LLVM's
philosophy is on adding new command line options, but perhaps the threshold value should be overridable
that way.  And perhaps a cl::opt<bool> EnableGDSE("enable-gdse"...) to control this global dead store elimination.

As for the while loop in runOnBasicBlock, I would prefer that it keep track of whether any reads were seen,
including the isSelfRead case, and then we handle non-local dependences after that loop terminates.
This is the same concept as your DependencySeen tracking in HandleNonLocalDependency, though I might
name it SafeToProcessNonLocalDeps.


================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:857
@@ +856,3 @@
+
+    MemDepResult Dep = MD->getPointerDependencyFrom(Loc, false, InstPt, PB, SI);
+    while (Dep.isDef() || Dep.isClobber()) {
----------------
Instead of InstPt, shouldn't the search start with PB->end()?
It looks like getPointerDependencyFrom does a "--" operation on this iterator
to get the first instruction to start its analysis from.

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:900
@@ +899,3 @@
+      }
+      DependencyFound = true;
+      break;
----------------
I don't understand why we are stopping the search here.
If the overwrite is partial, there may still be preceding stores
that could be eliminated.

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:906
@@ +905,3 @@
+    // to the set of blocks to search for redundant stores.
+    if (!DependencyFound)
+      FindUncondPredsIncludingSimpleLoops(Blocks, PB, DT);
----------------
Shouldn't this check also include && Dep.isNonLocal()?


http://reviews.llvm.org/D13363





More information about the llvm-commits mailing list