[PATCH] D11143: [RFC] Cross Block DSE

Tue Aug 4 08:42:59 PDT 2015

dberlin added a comment.

LGTM with these comments

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:101
@@ +100,3 @@
+      PDT = &getAnalysis<PostDominatorTree>();
+      if (PDT->getRootNode()) {
+        int Count = PDT->getRootNode()->getDFSNumOut();
----------------
Do you have a testcase for where PDT->getRootNode() is null?

because that seems broken :)

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:108
@@ +107,3 @@
+        // If we have more than 1 block try to populate candidate store.
+        if (Count > 1)
+          populateCandidateStores(F);
----------------
So, if we only have 1 block, why are you doing the rest of the work here, instead of returning false and exiting?
:)

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:515
@@ +514,3 @@
+      if (!PDT->getRootNode())
+        continue;
+      if (StoreInst *SI = dyn_cast<StoreInst>(Inst))
----------------
Much like the one above, i don't understand why this PDT->getRootNode() check is here.

The PDT should always have a root node, fake or not.
If you've got cases it doesn't, it seems like we should fix PDT

================
Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:1001
@@ +1000,3 @@
+        uint64_t MemSize = AA->getTypeStoreSize(MemPtr->getType());
+        AliasResult R = AA->alias(Pointer, PtrSize, MemPtr, MemSize);
+        if (R == MustAlias)
----------------
hfinkel wrote:
> karthikthecool wrote:
> > hfinkel wrote:
> > > If I'm reading this correctly, this call to AA->alias is going to essentially re-do all of the same analysis done in the AA->getModRefInfo call above. Could we check whether I is a StoreInst first and only to one query on store instructions instead of two?
> > > 
> > Hi Hal,
> > The first check ensures that we see all instructions that may modify/ref a pointer (this is the pointer the store to which we are planning to delete). In case we find an instruction that refers to the pointer e.g. a load we then return false indicating that the store cannot be removed.
> > 
> > If the instruction is a store we can safely remove the 1st store if and only if both the store and the candidate MustAlias. To detect this i had to check the alias again.
> > Thanks and Regards
> > Karthik
> I understand that the interface here is less than ideal, but could you not skip the getModRefInfo call entirely if `I` is a StoreInst, and just use the result from AA->alias instead? In other words, hoist the dyn_cast<StoreInst>-based check above the ModRef logic?
> 
As Karthik says, i'm not sure there is a way around this.
You can get better time bounds with PRE-like algorithms, but the underlying work here has to be done.

The getModRefInfo is like finding your data dependences.   For stores they may not be mustalias.   If they are mod/ref, they block removal because they may read or write the memory from the first store before the second store does the same thing, so you can't remove the first store.

IE given
Store of 1
Load of 1
Store of 1

You can't remove the first store.

(You could actually see if it's legal to sink everything  below the second store to enable you to delete the first,  but that is effectively global code motion. not hard, just a different algorithm with it's own issues.  You could also see if it's possible to delete the second store instead of the first, but this is hard in the simple cross-block DSE we are doing here)

Same with
Store of 1
may-store of 1
Load of 1
Store of 1

Now, interestingly, given:

Store of 1
may-store of 1
Store of 1

You can still eliminate the first store.
In fact, i'm struggling to come up with a case where Mod without ref matters if you have a later mustalias store.
It's early, but i think it's safe to eliminate store over store  with only a may-mod store in the way.

Karthik, do you have a counter-example?

If not, i'd expect the check on line 960 above to take this into account (IE be okay with nomodref and mod, but not ref or modref)

But as you can see, even if there is nothing "in the way" between the first and second store, you still have to check whether you *mustalias* the second store to know that you can remove the first one.

SSA based store-PRE algorithms basically do this "not on the fly". They  build memory use chains so they don't have to walk blocks (sound familiar? :P).  They still have to do the mod-ref checks and mustalias  checks, however.  They are just structured in a way that it is not N^2 to figure out placement.

http://reviews.llvm.org/D11143