[PATCH] D78932: [DSE,MSSA] Relax post-dom restriction for objs visible after return.

Wed May 6 19:20:14 PDT 2020

george.burgess.iv added a comment.

Thanks for the patch :)

I haven't reasoned through this fully, but the overall idea of "mark all killing blocks, then walk from the exits and see that all of them hit any of the killing blocks (or never pass through the block with the store)" seems functionally correct to me. I have a first round of comments/questions; happy to dig deeper on whatever direction we decide to head in.

> do you think this patch is along the lines you suggested

What I envisioned was more or less keeping a `SmallPtrSet<MemoryAccess *, N>` populated with potentially-terminating Defs/Phis (e.g., for each Def/Phi in there, there exists a path in the function such that that's the last Def/Phi before the function exits). When walking for memory which is readable after the function exits, if we would call `PushMemUses` on something in this set, we give up. This can set can be built once lazily, and then queried for the rest of the life of `run()`. We could build it by {B,D}FS'ing starting at all exit blocks for our current `Function`, and terminating the search each time we discover any MemoryDefs/MemoryPhis in a block (we have special use-lists for these, so this is a constant-time query per block).

If there aren't holes in it, I think that approach would be generally faster/simpler, but it also introduces the problem of cache invalidation. Cache invalidation makes many things harder, including writing tests + making test-cases. :)

Assuming the development practice we're trying to follow is still "build what's needed, get feedback, iterate," that you mentioned on an earlier review, I'm happy to chalk that up as a future optimization if you'd prefer this approach.

> if the post order number of a block is greater than the one
>  of DomAccess, the block cannot be in in a path starting at DomAccess

I'm not certain I'm interpreting "a path starting at," correctly, but the way I'm reading it, I don't agree:

    A
    |
    B<-----
   / \    |
  C   D   |
      |   |
      E --|

(where all edges are downward except E -> B)

C = 1, E = 2, D = 3, B = 4, and A = 5, yet there's a path from E -> B, no? Admittedly, I'm unsure at this point if this is pedantry on my part, or if there are correctness implications on the algorithm as presented.

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1659

-      // Check if we can skip DomDef for DSE. For accesses to objects that are
-      // accessible after the function returns, KillingDef must execute whenever
-      // DomDef executes and use post-dominance to ensure that.
+      // Check if we can skip DomDef for DSE. Accesses to objects that are
       MemoryDef *DomDef = dyn_cast<MemoryDef>(DomAccess);
----------------
> Accesses to objects that are

was this accidentally left in?

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1740
       if (MemoryDef *UseDef = dyn_cast<MemoryDef>(UseAccess)) {
-        if (!isCompleteOverwrite(DefLoc, UseInst))
+        if (isCompleteOverwrite(DefLoc, UseInst)) {
+          int64_t InstWriteOffset, DepWriteOffset;
----------------
nit: looks like we're only using `KillingBlocks` if `DefVisibleToCallerAfterRet`. Can we save work in the `!DefVisibleToCallerAfterRet` case by ignoring this bit of the `if` then?

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1748
+          if (UseAccess != DomAccess && CC &&
+              isOverwrite(DefLoc, *CC, DL, TLI, DepWriteOffset, InstWriteOffset,
+                          UseInst, IOL, AA, &F) == OW_Complete) {
----------------
it's a bit odd to read 

```
if (isCompleteOverwrite(...)) {
  if (isOverwrite(...) == OW_Complete) {
    // ...
  }
}
```

Is there a subtle difference between the two that I'm missing?

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1781
+      // post-dominates DomAccess.
+      if (KillingBlocks.find(CommonPred) != KillingBlocks.end()) {
+        if (PDT.dominates(CommonPred, DomAccess->getBlock()))
----------------
nit: for consistency with elsewhere, can we `.count`?

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1787
+
+      //  If the common post-dominator does not post-dominates DomAccess, there
+      //  is a path from DomAccess to an exit not going through a killing block.
----------------
tiny nits: 

- one space between `//` and the comment please
- also, `s/post-dominates/post-dominate/`

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1802
+        else
+          for (BasicBlock *R : PDT.getRoots()) {
+            if (!DT.isReachableFromEntry(R))
----------------
i'm unsure how we reach this case. AFAICT, `PDT.dominates(nullptr, AnythingElse) == false`, since the PDT should fail to lookup the node for `nullptr`, no?

================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1825
+
+          if (WorkList.size() >= 50)
+            return None;
----------------
nit: 50 is a bit of a magical number to give up at here. is it worth making a `-mllvm` option (or perhaps hoisting to a constant with a bit of prose about how it was chosen)?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78932/new/

https://reviews.llvm.org/D78932