[llvm] [DeadStoreElimination] Optimize tautological assignments (PR #75744)

Shreyansh Chouhan via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 21 07:37:24 PST 2023


================
@@ -1926,6 +1926,43 @@ struct DSEState {
       if (InitC && InitC == StoredConstant)
         return MSSA.isLiveOnEntryDef(
             MSSA.getSkipSelfWalker()->getClobberingMemoryAccess(Def, BatchAA));
+
+      if (!Store)
+        return false;
+
+      // If there is a dominating condition, that ensures that the value
+      // being stored in a memory location is already present at the
+      // memory location, the store is a noop.
+      BasicBlock *StoreBB = DefI->getParent();
+      auto *StorePtr = Store->getOperand(1);
+
+      DomTreeNode *IDom = DT.getNode(StoreBB)->getIDom();
+      if (!IDom)
+        return false;
+
+      auto *TI = IDom->getBlock()->getTerminator();
+      ICmpInst::Predicate Pred;
+      BasicBlock *TrueBB, *FalseBB;
+
+      if (!match(TI, m_Br(m_ICmp(Pred, m_Load(m_Specific(StorePtr)),
+                                 m_Specific(StoredConstant)),
+                          TrueBB, FalseBB)))
+        return false;
+
+      MemoryAccess *LastMod =
+          MSSA.getSkipSelfWalker()->getClobberingMemoryAccess(Def, BatchAA);
+
+      DomTreeNode *CDom = DT.getNode(LastMod->getBlock());
----------------
BK1603 wrote:

What is even more interesting is that, even if we return the value stored at ptr x, we still optimize it in a similar manner (which I don't feel is the current behavior).

Say we take:
```
@x = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(readwrite, argmem: none, inaccessiblemem: none) uwtable
define dso_local i32 @foo() local_unnamed_addr #0 {
  %1 = load i32, ptr @x, align 4, !tbaa !3
  store i32 7, ptr @x, align 4, !tbaa !3
  %2 = icmp eq i32 %1, 4
  br i1 %2, label %3, label %4

3:                                                ; preds = %0
  store i32 4, ptr @x, align 4, !tbaa !3
  br label %4

4:                                                ; preds = %3, %0
  %5 = load i32, ptr @x, align 4, !tbaa !3
  ret i32 %5
}
```

The code is optimized to:
```
@x = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(readwrite, argmem: none, inaccessiblemem: none) uwtable
define dso_local i32 @foo() local_unnamed_addr #0 {
  %1 = load i32, ptr @x, align 4, !tbaa !3
  %2 = icmp eq i32 %1, 4
  %spec.store.select = select i1 %2, i32 4, i32 7
  store i32 %spec.store.select, ptr @x, align 4
  ret i32 %spec.store.select
}
```

These two functions do not behave the same: in the unoptimized version, the function would always return 7 (which is what x would hold by the end of the function), whereas in the optimized version the return value would depend on the value of x at the time of function entry.
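
One way to check whether the two listings agree is to model them as plain C++ and compare results for a few entry values of x. The sketch below is only a hand-written model, not compiler output: the names `foo_unoptimized`, `foo_optimized`, and `run` are hypothetical, and the static variable `x` stands in for the global `@x`.

```cpp
// Hypothetical stand-in for the global @x from the IR listings.
static int x = 0;

// Models the unoptimized @foo, one statement per IR instruction.
int foo_unoptimized() {
  int old = x;    // %1 = load i32, ptr @x
  x = 7;          // store i32 7, ptr @x
  if (old == 4)   // %2 = icmp eq i32 %1, 4 ; br i1 %2, label %3, label %4
    x = 4;        // block 3: store i32 4, ptr @x
  return x;       // block 4: %5 = load i32, ptr @x ; ret i32 %5
}

// Models the optimized @foo.
int foo_optimized() {
  int old = x;                   // %1 = load i32, ptr @x
  int sel = (old == 4) ? 4 : 7;  // %spec.store.select = select i1 %2, i32 4, i32 7
  x = sel;                       // store i32 %spec.store.select, ptr @x
  return sel;                    // ret i32 %spec.store.select
}

// Hypothetical helper: set x to a chosen entry value, then call f.
int run(int (*f)(), int entry) {
  x = entry;
  return f();
}
```

In this model, the store of 4 in block `3` executes after the store of 7, so `run(foo_unoptimized, 4)` and `run(foo_optimized, 4)` both yield 4, and any other entry value yields 7 from both. That is worth comparing against the claim above before concluding that the two listings differ.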

https://github.com/llvm/llvm-project/pull/75744

