[PATCH] D72700: [DSE] Add first version of MemorySSA-backed DSE (Bottom up walk).

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 3 07:43:38 PST 2020


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1472
+  if (auto CS = CallSite(DI))
+    if (CS.onlyAccessesInaccessibleMemory())
+      return true;
----------------
asbirlea wrote:
> MemorySSA should handle this, if it doesn't we should fix that.
it's handled by MemorySSA, I've dropped that.


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1539
+  // Check if DomDef may be read.
+  for (unsigned I = 0; I < WorkList.size(); I++) {
+    MemoryAccess *UseAccess = WorkList[I];
----------------
asbirlea wrote:
> fhahn wrote:
> > asbirlea wrote:
> > > Walking uses may miss cases due to aliasing not being transitive. This needs to be throughly analyzed. Here's a very rough example.
> > > ```
> > > 1 = Def(LoE) ; <----- DomDef  stores [0,3]
> > > 2 = Def(1) ; (2, 1)=NoAlias,   stores [4,7]
> > > Use(2)      ; MayAlias 2 *and* 1, the Use points to the *first* Def it may alias, loads [0, 7].
> > > 3 = Def(1)  ; <---- Current  (3, 2)=NoAlias, (3,1)=MayAlias,  stores [0,3]
> > > ```
> > > The situation may be simplified due to handling stores, but all Uses may need looking at. Note this will not work to recurse on uses of the uses either.
> > > Rough example why:
> > > ```
> > > 1 = Def(LoE)
> > > 2 = Def(1)     ; <----- DomDef 
> > > 3 = Def(1) ; (3, 2)=NoAlias
> > > Use(3)      ; MayAlias 3 *and* 2, the Use points to the *first* Def it may alias.
> > > 4 = Def(2)  ; <---- Current  (4, 3)=NoAlias, (4,2)=MayAlias
> > > ```
> > > 
> > Thanks for the example. I think I might be missing something for the optimized version though. From the example in MemorySSA.h:
> > 
> > ```
> > /// Given this form, all the stores that could ever effect the load at %8 can be
> > /// gotten by using the MemoryUse associated with it, and walking from use to
> > /// def until you hit the top of the function.
> > ```
> > 
> > From the comment above, shouldn't we visit both  2 and 3 when walking up from Use(3), as both may change the read location? In the example we would only see 3 and 1.
> > 
> > But even assuming we would visit both 2 and 3, I think I can see how we could end up with scenarios we could miss overlapping reads. I think we would have to take a look at all users of DomDef and we specifically *cannot* skip any non-aliasing MemoryDefs for the read-checks. 
> > 
> > That would make things more expensive, but would be something we have to do regardless of going bottom-up/top-down. Does that make sense to you? However I think that would mean that in the worst-case, we would have to do a top-down walk similar to the general top-down approach for the read checks.
> I think I know what I missed. MemoryDefs keep two fields, the defining and the optimized access. So `3 = Def(1) ` actually looks like this: `3 = Def(2) - Optimized 1 `, and is a user of both 1 and 2.
> 
> Yes, I think you're right that the read checks for all accesses are needed regardless of which approach is taken. And yes, it will be expensive. 
> I came across something similar in LICM, and I limited or outright avoided analyzing all uses against a store to avoid the cost of analyzing all of them (see ~LICM.cpp:1300)
I've added a test for the scenario: overlapping_read in multiblock-simple.ll


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1541
+    MemoryAccess *UseAccess = WorkList[I];
+    // Bail out on MemoryPhis for now.
+    if (isa<MemoryPhi>(UseAccess))
----------------
asbirlea wrote:
> You're right this can be partially lifted.
> Here's an example:
> ```
> %a:
>   1 = Def(LoE)  ; <----- DomDef 
>    br %cond1, %b, %c
> %b:
>   2 = Def(1)
>    br %cond2, %d, %e
> %c:
>    br %e
> %d:
>    br %f
> %e:
>    3 = phi(1,2)
>    br %f
> %f:
>    4 = Phi(2,3)
>    5 = Def(4) ; <---- Current
> ```
>   
Restrictions related to MemoryPhis are lifted in D72148


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1463
+    DSEState State(F, AA, MSSA, DT, PDT, TLI);
+    // Collect blocks with throwing instructions not modeled in MemroySSA and
+    // alloc-like objects.
----------------
asbirlea wrote:
> s/modeled/modelled
> s/MemroySSA/MemorySSA
I'm not a native speaker, but I think modeled is the US spelling and modelled is the UK spelling. I thought we prefer the US spelling, but I don't mind either way.


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1465
+    // alloc-like objects.
+    for (Instruction &I : instructions(F)) {
+      if (I.mayThrow() && !MSSA.getMemoryAccess(&I))
----------------
asbirlea wrote:
> Consider adding a cl::opt limit on the things you look to process in F. We've had this issue with generated code where a BB has order of thousands stores.
I've added a rather generous limit for Defs per BB, but we can always adjust it later. We still have to check the whole basic block for throwing instructions.


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1592
+
+    SmallVector<MemoryAccess *, 4> WorkList;
+    auto PushMemUses = [&WorkList](MemoryAccess *Acc) {
----------------
asbirlea wrote:
> SmallSetVector to avoid processing duplicates?
Done. I *think* in the initial version the only nodes we might add multiple times are MemoryPhis and we bail out on the first one we see. I've added such a test to llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-memoryphis.ll


================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1643
+        PushMemUses(UseDef);
+        if (isWriteClobber(DefLoc, UseInst)) {
+          LLVM_DEBUG(dbgs() << "  ... may-write clobber\n");
----------------
asbirlea wrote:
> Can this happen? Wouldn't the getClobbering above have found UseAccess /UseInst instead of DomDef?
I think it can happen in loops where a store in a the header is killed by a store in the exit, but it is also stored to in the loop body. I've added a few additional test cases to llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-loops.ll loop_multiple_def* .

I've also moved the post dominance check in into the function and also updated the check here to skip PushMemUses for MemoryDefs that completely overwrite DefLoc. This helps with avoiding unnecessary checks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72700/new/

https://reviews.llvm.org/D72700





More information about the llvm-commits mailing list