[PATCH] D73763: [DSE] Lift post-dominance restriction.
George Burgess IV via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 1 19:04:52 PDT 2020
george.burgess.iv added a comment.
> I think there are 2 cases to distinguish:
>
> 1. For accesses to non-alloca objects, this matches what the intrinsic achieves I think. We can only eliminate stores to objects that are visible after the function returns, if they are overwritten along all paths to the exit. So if we have determined a set of blocks that overwrite A (and there are no reads in between), we could check if all paths to the exit from A must go through one of the overwriting blocks. I think that matches your suggestion.
Sounds like what I was thinking, yeah.
> For objects that are not visible to the caller (e.g. alloca), the intrinsic achieves something slightly different: Given A and B as above, we can eliminate A, if there are no reads of A between A and any of the exits from A. We can stop walking any paths that contain overwrites of A.
I'm confused about why `alloca`s need special treatment. This new set that we're building is essentially a fast way to answer the question "are there potentially unknowable reads after this `Def`/`Phi`?" If we know memory (e.g., from an `alloca`) is unreadable outside of the function, I'd hope we can ignore the set entirely, like this patch is doing now? Baked into my suggestion was the assumption that we were already:
> check[ing] all potentially aliasing access along all paths to the exit
...since I think the transitive `User` walking we're doing inside of `getDomMemoryDef` should accomplish that?
In any case, if you're confident that this works for #1, I'm content to move discussion about "why don't things work as they are now," to a patch that doesn't block this one.
> The approach I tried to follow to bring up MemoySSA-backed DSE is to add support for additional cases in smallish steps to have a baseline (and compile-time is controlled by cut-offs) and subsequently improve the parts that case issues in practice. For example, lift the post-dominance restriction with potentially excessive walking, followed by patches to implement the special handling for the non-alloca case as suggested and future performance improvements (e.g. there is plenty of potential for caching, like in D75025 <https://reviews.llvm.org/D75025>)? That would also allow to quantify the impact of improvements. What do you think?
I like incremental approaches when they're possible, so I'm happy with this. :)
================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1682
for (Use &U : Acc->uses())
WorkList.insert(cast<MemoryAccess>(U.getUser()));
};
----------------
fhahn wrote:
> george.burgess.iv wrote:
> > tangential to this patch: should we be ignoring `U`s which are `MemoryDefs` whose only Use of `Acc` is as their `getOptimized()` operand? Without that, it's not immediately obvious to me how we aren't overly conservative in situations like:
> >
> > ```
> > {
> > // 1 = MemoryDef(LOE) -- this is DomAccess
> > *a = 1;
> > // 2 = MemoryDef(1)
> > *b = 2;
> > // 3 = MemoryDef(2) -- this is the obvious kill
> > *a = 3;
> > // 4 = MemoryDef(3) (optimized to 2)
> > *c = 4;
> > // MemoryUse(4) -- AA says MayAlias for `(d, a)`, so `isReadClobber()` should be `true` if we visit this?
> > if (*d) {
> > // ...
> > }
> > }
> > ```
> Oh, the optimized access is also added to the uses of `2 = MemoryDef` here? Yes that would be overly conservative. Is there a way to force MemorySSA to be optimized before we run DSE, for testing? I'll put up a separate fix.
> Is there a way to force MemorySSA to be optimized before we run DSE, for testing?
I'm unsure offhand. @asbirlea would likely know if there is.
If there's not and you'd find it useful, a debugging-only pass that boils down to `for (MemoryAccess *MA : Fn) { MSSA->getWalker()->getClobberingMemoryAccess(MA, {}); }` should force that. We preoptimize `Use`s, but not `Def`s.
================
Comment at: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:1492
+ // Set of the last MemoryDefs or MemoryPhis before a function exit, that is
+ // defs or phis without any users that are defs or phis.
+ SmallPtrSet<MemoryAccess *, 8> LastDefOrPhi;
----------------
> that is defs or phis without any users that are defs or phis.
This isn't quite complete; consider:
```
void bar(); // inaccessiblememonly
void foo(int *a) {
// 1 = MemoryDef(LOE)
*a = 1;
if (bar())
return;
// 2 = MemoryDef(1)
*a = 2;
}
```
Despite having `2` as a `User`, we'd still want `1` to be in this set, since depending on the way the `if` goes, it's the last Def/Phi before the function ends
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73763/new/
https://reviews.llvm.org/D73763
More information about the llvm-commits
mailing list